In aggressive spill-to-disk scenarios I observed that distributed may spill all the data it holds in memory while still complaining that there is no more data to spill:

Memory use is high but worker has no data to store to disk. Perhaps some other process is leaking memory?
Side note: In our setup every worker runs in an isolated container, so the chance of another process interfering with it is virtually zero.
It is true that these workers do indeed still hold on to a lot of memory, and it is not very transparent where that data lives. Garbage collection does not help, i.e. this is not related to dask/zict#19
Investigating this issue made me realise that we may try to spill data which is actually currently in use.
The most common example is probably when the worker has collected the dependencies and schedules the execution of the task, see here. Another one is when data is requested from a worker and we are still serializing/submitting it.
I verified this assumption by patching the worker code and tracking the in-execution/in-spilling keys with some logging; it turns out that for some jobs the worker holds on to multiple GBs of memory although it was supposedly already spilled.
If we spill data which is still in use, this is not only misleading to the user, since the data is in fact still in memory, but it may also cause heavy data duplication: if the spilled-but-still-in-memory dependency is requested by another worker, the buffer fetches the key from the slow store and materialises it a second time, because it no longer knows about the original object while the original data is still held in memory. If this piece of data is requested by more than one worker, it could even be duplicated multiple times.
In non-aggressive spill-to-disk scenarios the LRU in the buffer should protect us from this, but when memory pressure is high, spilling might actually worsen the situation.
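To make the duplication argument concrete, here is a toy illustration; plain dicts stand in for the fast/slow stores, this is not the actual zict.Buffer API:

```python
import pickle

fast = {}   # in-memory store
slow = {}   # stand-in for the on-disk store

def spill(key):
    """Move a key from fast to slow, as the buffer would do under memory pressure."""
    slow[key] = pickle.dumps(fast.pop(key))

# A task picks up its dependency and starts executing with it.
fast["x"] = bytearray(10_000_000)
in_use = fast["x"]             # reference held by the running task

# Memory pressure triggers a spill; the buffer doesn't know `x` is in use.
spill("x")

# `in_use` still pins the original 10 MB in memory, so nothing was freed.
# When another worker now requests `x`, the buffer reads it back from the
# slow store and materialises a second, independent copy.
copy_for_peer = pickle.loads(slow["x"])
assert copy_for_peer is not in_use   # two copies of the same data in memory
```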
My practical approach to this would be to introduce something like a (un)lock_key_in_store method on the buffer which protects a key from being spilled, and to set/unset it manually in the distributed code, see the sketch below. If there is a smarter approach, I'd be glad to hear about it.
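A minimal sketch of what I have in mind, assuming a zict.Buffer-like spill buffer; the method names lock_key_in_store/unlock_key_in_store and the surrounding structure are hypothetical, not existing distributed or zict API:

```python
from contextlib import contextmanager

class LockableBuffer:
    """Spill buffer that can temporarily exempt in-use keys from eviction."""

    def __init__(self, fast, slow, spill_one):
        self.fast = fast             # in-memory mapping
        self.slow = slow             # on-disk mapping
        self._spill_one = spill_one  # callable that moves one key fast -> slow
                                     # and returns the number of bytes freed
        self._locked = set()         # keys that must not be spilled right now

    def lock_key_in_store(self, key):
        self._locked.add(key)

    def unlock_key_in_store(self, key):
        self._locked.discard(key)

    @contextmanager
    def locked(self, *keys):
        """Convenience wrapper for the distributed code, e.g.
        `with buffer.locked(*dependency_keys): execute_task()`."""
        for key in keys:
            self.lock_key_in_store(key)
        try:
            yield
        finally:
            for key in keys:
                self.unlock_key_in_store(key)

    def evict_until(self, target_nbytes, current_nbytes):
        """Spill keys (oldest first in a real LRU) until below target,
        skipping keys that are currently in use."""
        for key in list(self.fast):
            if current_nbytes <= target_nbytes:
                break
            if key in self._locked:
                continue  # spilling would not free memory; the object is referenced
            current_nbytes -= self._spill_one(key)
        return current_nbytes
```

The worker would then lock a task's dependencies just before execution (and keys being serialized for a peer) and unlock them afterwards, so the eviction loop never spills data whose memory cannot actually be released.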
Also, if my reasoning is flawed somewhere, I'd appreciate feedback: so far I could only prove that we try to spill data which is currently in use; the duplication is still just theory.
Related issues:
dask/dask#2456