afterWrite blocks calling thread #90

@anuraaga

Description

We've been looking at an issue where CPU usage in our server binary sometimes spikes and requests start failing. We're not 100% sure the root cause is in Caffeine, but we do see something somewhat unusual in the thread dump.

Many threads are waiting here:

"armeria-server-epoll-13-46" #216 prio=5 os_prio=0 tid=0x00007f274c012000 nid=0x144b7 runnable [0x00007f2948c2c000]
   java.lang.Thread.State: RUNNABLE
    at java.lang.Thread.yield(Native Method)
    at com.github.benmanes.caffeine.cache.BoundedLocalCache.afterWrite(BoundedLocalCache.java:914)
    at com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:1966)
    at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1893)
    at com.github.benmanes.caffeine.cache.LocalAsyncLoadingCache.get(LocalAsyncLoadingCache.java:125)
    at com.github.benmanes.caffeine.cache.LocalAsyncLoadingCache.get(LocalAsyncLoadingCache.java:117)
    at com.github.benmanes.caffeine.cache.LocalAsyncLoadingCache.get(LocalAsyncLoadingCache.java:156)

We run our logic on a small number of event loop threads, not on a massively threaded server, so it seems problematic for Caffeine to block our event loop threads for this write buffering. Is it possible to make this asynchronous? Alternatively, it looks like there is a writeBuffers() flag, but I can't find how to modify it; a pointer would be great. A sketch of how we set up and call the cache is below.
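
For context, a rough sketch of how we build and call the cache; the class name, key/value types, and loader are simplified stand-ins for our real code, which loads values over RPC:

```java
import java.util.concurrent.CompletableFuture;

import com.github.benmanes.caffeine.cache.AsyncCacheLoader;
import com.github.benmanes.caffeine.cache.AsyncLoadingCache;
import com.github.benmanes.caffeine.cache.Caffeine;

public final class CacheSetup {

  // Stand-in for our real loader, which fetches the value asynchronously over RPC.
  static CompletableFuture<String> loadFromBackend(String key) {
    return CompletableFuture.completedFuture("value-for-" + key);
  }

  static AsyncLoadingCache<String, String> newCache() {
    AsyncCacheLoader<String, String> loader = (key, executor) -> loadFromBackend(key);
    return Caffeine.newBuilder()
        .maximumSize(100_000)  // bounded cache, so each insert goes through the write buffer
        .buildAsync(loader);
  }

  public static void main(String[] args) {
    AsyncLoadingCache<String, String> cache = newCache();
    // get(key) is invoked directly on the event loop thread; on a miss it runs
    // computeIfAbsent -> afterWrite, which is where the threads in the dump yield.
    cache.get("some-key").thenAccept(System.out::println).join();
  }
}
```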

If it helps, the use case is a cache where 95% of the items are accessed very frequently and should generally stay in the cache, while the remaining items cause LRU thrashing, since the actual data set is about 100x bigger than the cache.
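
To make that access pattern concrete, here is a rough, self-contained sketch with made-up numbers (10k cache entries, a key space 100x larger, ~95% of lookups hitting a small hot set); it only approximates our workload:

```java
import java.util.Random;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public final class WorkloadSketch {
  public static void main(String[] args) {
    int cacheSize = 10_000;
    int keySpace = cacheSize * 100;          // data set ~100x bigger than the cache
    int hotKeys = (int) (cacheSize * 0.95);  // hot set that should stay resident

    Cache<Integer, Integer> cache = Caffeine.newBuilder()
        .maximumSize(cacheSize)
        .recordStats()
        .build();

    Random random = new Random(42);
    for (int i = 0; i < 1_000_000; i++) {
      // ~95% of lookups go to the hot set; the rest scan the full key space,
      // which produces the churn (and write traffic) on every miss.
      int key = (random.nextInt(100) < 95)
          ? random.nextInt(hotKeys)
          : random.nextInt(keySpace);
      cache.get(key, k -> k);  // load on miss
    }
    System.out.println(cache.stats());
  }
}
```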
