Commit efbac3c

Implement background promotion and eviction
and add additional parameters to control allocation and eviction of items.
1 parent 3c34254 commit efbac3c

39 files changed

+1894
-40
lines changed

MultiTierDataMovement.md

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
1+
# Background Data Movement
2+
3+
In order to reduce the number of online evictions and support asynchronous
4+
promotion - we have added two periodic workers to handle eviction and promotion.
5+
6+
The diagram below shows a simplified version of how the background evictor
7+
thread (green) is integrated into the CacheLib architecture.
8+
9+
<p align="center">
10+
<img width="640" height="360" alt="BackgroundEvictor" src="cachelib-background-evictor.png">
11+
</p>
12+
13+
## Synchronous Eviction and Promotion
14+
15+
- `disableEvictionToMemory`: Disables eviction to memory (item is always evicted to NVMe or removed
16+
on eviction)
17+
18+
## Background Evictors
19+
20+
The background evictors scan each class to see if there are objects to move to the next (higher)
21+
tier using a given strategy. Here we document the parameters for the different
22+
strategies and general parameters.
23+
24+
- `backgroundEvictorIntervalMilSec`: The interval that this thread runs for - by default
25+
the background evictor threads will wake up every 10 ms to scan the AllocationClasses. Also,
26+
the background evictor thread will be woken up every time there is a failed allocation (from
27+
a request handling thread) and the current percentage of free memory for the
28+
AllocationClass is lower than `lowEvictionAcWatermark`. This may render the interval parameter
29+
less important when there are many allocations occurring from request handling threads.
30+
31+
- `evictorThreads`: The number of background evictors to run - each thread is assigned
32+
a set of AllocationClasses to scan and evict objects from. Currently, each thread gets
33+
an equal number of classes to scan - but as object size distribution may be unequal - future
34+
versions will attempt to balance the classes among threads. The range is 1 to number of AllocationClasses.
35+
The default is 1.
36+
37+
- `evictionHotnessThreshold`: The number of objects to remove in a given eviction call. The
38+
default is 40. Lower range is 10 and the upper range is 1000. Too low and we might not
39+
remove objects at a reasonable rate, too high and we hold the locks for copying data
40+
between tiers for too long.
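
The equal split of AllocationClasses among evictor threads described above can be pictured with the sketch below. It is only an illustration: `assignClasses` and the `MemoryDescriptor` alias are hypothetical names, and the round-robin order is an assumption, not necessarily how CacheLib distributes classes.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a (TierId, PoolId, ClassId) triple.
using MemoryDescriptor = std::tuple<int, int, int>;

// Distribute the classes over `numThreads` workers round-robin, so every
// worker receives an equal share (shares differ by at most one class).
std::vector<std::vector<MemoryDescriptor>> assignClasses(
    const std::vector<MemoryDescriptor>& classes, std::size_t numThreads) {
  assert(numThreads > 0);
  std::vector<std::vector<MemoryDescriptor>> perThread(numThreads);
  for (std::size_t i = 0; i < classes.size(); ++i) {
    perThread[i % numThreads].push_back(classes[i]);
  }
  return perThread;
}
```

Each inner vector is the kind of list a worker could receive through `setAssignedMemory`. Because the split ignores object size distribution, some workers may end up with far more eviction work than others, which is the imbalance the text says future versions will address.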
41+
42+
43+
### FreeThresholdStrategy (default)
44+
45+
- `lowEvictionAcWatermark`: Triggers background eviction thread to run
46+
when this percentage of the AllocationClass is free.
47+
The default is `2.0`. To avoid wasting capacity, we don't set this above `10.0`.
48+
49+
- `highEvictionAcWatermark`: Stop the evictions from an AllocationClass when this
50+
percentage of the AllocationClass is free. The default is `5.0`. To avoid wasting capacity, we
51+
don't set this above `10`.
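
One way to read the two watermarks is as a hysteresis band: eviction starts once free space drops to the low watermark and aims to restore the high watermark. The sketch below works under that assumption. `FreeThresholdParams`, the free-standing `calculateBatchSize`, and the `maxBatch` cap (playing the role the text gives to `evictionHotnessThreshold`) are hypothetical and do not reproduce the actual strategy code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical parameter bundle; the defaults mirror the values documented above.
struct FreeThresholdParams {
  double lowEvictionAcWatermark = 2.0;   // start evicting at or below this % free
  double highEvictionAcWatermark = 5.0;  // stop evicting once this % is free
  std::size_t maxBatch = 40;             // assumed cap on items evicted per call
};

// Returns how many items of size `allocSize` to evict from a class that owns
// `acMemorySize` bytes, of which `acFreeBytes` are currently free.
std::size_t calculateBatchSize(const FreeThresholdParams& p,
                               std::size_t allocSize,
                               std::size_t acMemorySize,
                               std::size_t acFreeBytes) {
  assert(allocSize > 0);
  assert(p.lowEvictionAcWatermark <= p.highEvictionAcWatermark);
  if (acMemorySize == 0) {
    return 0;
  }
  const double freePct = 100.0 * acFreeBytes / acMemorySize;
  if (freePct > p.lowEvictionAcWatermark) {
    return 0;  // enough free memory, nothing to do
  }
  const auto targetFree = static_cast<std::size_t>(
      acMemorySize * p.highEvictionAcWatermark / 100.0);
  if (targetFree <= acFreeBytes) {
    return 0;
  }
  // Round up so the class reaches the high watermark after the batch.
  const std::size_t items = (targetFree - acFreeBytes + allocSize - 1) / allocSize;
  return std::min(items, p.maxBatch);
}
```

With the defaults, a 1,000,000-byte class holding 1,000-byte items and 20,000 free bytes (2% free) would be asked to evict 30 items, enough to reach 5% free.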
52+
53+
54+
## Background Promoters
55+
56+
The background promoters scan each class to see if there are objects to move to a lower
57+
tier using a given strategy. Here we document the parameters for the different
58+
strategies and general parameters.
59+
60+
- `backgroundPromoterIntervalMilSec`: The interval that this thread runs for - by default
61+
the background promoter threads will wake up every 10 ms to scan the AllocationClasses for
62+
objects to promote.
63+
64+
- `promoterThreads`: The number of background promoters to run - each thread is assigned
65+
a set of AllocationClasses to scan and promote objects from. Currently, each thread gets
66+
an equal number of classes to scan - but as object size distribution may be unequal - future
67+
versions will attempt to balance the classes among threads. The range is `1` to number of AllocationClasses. The default is `1`.
68+
69+
- `evictionHotnessThreshold`: The number of objects to remove in a given eviction call. The
70+
default is 50. Lower range is 10 and the upper range is 1000. Too low and we might not
71+
remove objects at a reasonable rate, too high and we hold the locks for copying data
72+
between tiers for too long.
73+
74+
- `numDuplicateElements`: This allows us to promote items that have existing handles (read-only) since
75+
we won't need to modify the data when a user is done with the data. Therefore, for a short time
76+
the data could reside in both tiers until it is evicted from its current tier. The default is to
77+
not allow this (0). Setting the value to 100 will enable duplicate elements in tiers.
78+
79+
### Background Promotion Strategy (only one currently)
80+
81+
- `promotionAcWatermark`: Promote items if there is at least this
82+
percent of free AllocationClasses. Promotion thread will attempt to move `evictionHotnessThreshold` number of objects
83+
to that tier. The objects are chosen from the head of the LRU. The default is `4.0`.
84+
This value should correlate with `lowEvictionAcWatermark`, `highEvictionAcWatermark`, `minAcAllocationWatermark`, `maxAcAllocationWatermark`.
85+
- `promotionHotnessThreshold`: The number of objects to promote in batch during BG promotion. Analogous to
86+
`evictionHotnessThreshold`. Its value should be lower to decrease contention on hot items.
87+
88+
## Allocation policies
89+
90+
- `maxAcAllocationWatermark`: Item is always allocated in topmost tier if at least this
91+
percentage of the AllocationClass is free.
92+
- `minAcAllocationWatermark`: Item is always allocated in bottom tier if only this percent
93+
of the AllocationClass is free. If percentage of free AllocationClasses is between `maxAcAllocationWatermark`
94+
and `minAcAllocationWatermark`, then extra checks (described below) are performed to decide where to put the element.
95+
96+
By default, allocation will always be performed from the upper tier.
97+
98+
### Extra policies (used only when percentage of free AllocationClasses is between `maxAcAllocationWatermark`
99+
and `minAcAllocationWatermark`)
100+
- `sizeThresholdPolicy`: If item is smaller than this value, always allocate it in upper tier.
101+
- `defaultTierChancePercentage`: Chance (0-100%) of allocating item in top tier
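
The allocation rules above can be summarized as a small decision function. This is a sketch under stated assumptions: `AllocationPolicy`, `Tier`, and `chooseTier` are hypothetical names, and the order in which the two extra policies are applied (size check first, then the random roll) is an assumption that the text does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Hypothetical bundle of the parameters documented above.
struct AllocationPolicy {
  double maxAcAllocationWatermark;       // % free at or above which the top tier is used
  double minAcAllocationWatermark;       // % free at or below which the bottom tier is used
  std::size_t sizeThresholdPolicy;       // smaller items always go to the top tier
  unsigned defaultTierChancePercentage;  // 0-100 chance of picking the top tier
};

enum class Tier { Top, Bottom };

// Picks a tier for an item of `itemSize` bytes, given the percentage of the
// AllocationClass that is currently free.
template <typename Rng>
Tier chooseTier(const AllocationPolicy& p, double freePct,
                std::size_t itemSize, Rng& rng) {
  assert(p.minAcAllocationWatermark <= p.maxAcAllocationWatermark);
  if (freePct >= p.maxAcAllocationWatermark) {
    return Tier::Top;
  }
  if (freePct <= p.minAcAllocationWatermark) {
    return Tier::Bottom;
  }
  // Between the watermarks: apply the extra policies.
  if (itemSize < p.sizeThresholdPolicy) {
    return Tier::Top;
  }
  std::uniform_int_distribution<unsigned> roll(0, 99);
  return roll(rng) < p.defaultTierChancePercentage ? Tier::Top : Tier::Bottom;
}
```

Passing the random engine in as a parameter keeps the function deterministic under test, for example with a seeded `std::mt19937`.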
102+
103+
## MMContainer options
104+
105+
- `lruInsertionPointSpec`: Can be set per tier when LRU2Q is used. Determines where new items are
106+
inserted. 0 = insert to hot queue, 1 = insert to warm queue, 2 = insert to cold queue
107+
- `markUsefulChance`: Per-tier, determines chance of moving item to the head of LRU on access

cachelib-background-evictor.png

57.5 KB
Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
1+
/*
2+
* Copyright (c) Intel and its affiliates.
3+
*
4+
* Licensed under the Apache License, Version 2.0 (the "License");
5+
* you may not use this file except in compliance with the License.
6+
* You may obtain a copy of the License at
7+
*
8+
* http://www.apache.org/licenses/LICENSE-2.0
9+
*
10+
* Unless required by applicable law or agreed to in writing, software
11+
* distributed under the License is distributed on an "AS IS" BASIS,
12+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
* See the License for the specific language governing permissions and
14+
* limitations under the License.
15+
*/
16+
17+
18+
19+
namespace facebook {
20+
namespace cachelib {
21+
22+
23+
template <typename CacheT>
24+
BackgroundEvictor<CacheT>::BackgroundEvictor(Cache& cache,
25+
std::shared_ptr<BackgroundEvictorStrategy> strategy)
26+
: cache_(cache),
27+
strategy_(strategy)
28+
{
29+
}
30+
31+
template <typename CacheT>
32+
BackgroundEvictor<CacheT>::~BackgroundEvictor() { stop(std::chrono::seconds(0)); }
33+
34+
template <typename CacheT>
35+
void BackgroundEvictor<CacheT>::work() {
36+
try {
37+
checkAndRun();
38+
} catch (const std::exception& ex) {
39+
XLOGF(ERR, "BackgroundEvictor interrupted due to exception: {}", ex.what());
40+
}
41+
}
42+
43+
template <typename CacheT>
44+
void BackgroundEvictor<CacheT>::setAssignedMemory(std::vector<std::tuple<TierId, PoolId, ClassId>> &&assignedMemory)
45+
{
46+
XLOG(INFO, "Class assigned to background worker:");
47+
for (auto [tid, pid, cid] : assignedMemory) {
48+
XLOGF(INFO, "Tid: {}, Pid: {}, Cid: {}", tid, pid, cid);
49+
}
50+
51+
mutex.lock_combine([this, &assignedMemory]{
52+
this->assignedMemory_ = std::move(assignedMemory);
53+
});
54+
}
55+
56+
// Look for classes that exceed the target memory capacity
57+
// and return those for eviction
58+
template <typename CacheT>
59+
void BackgroundEvictor<CacheT>::checkAndRun() {
60+
auto assignedMemory = mutex.lock_combine([this]{
61+
return assignedMemory_;
62+
});
63+
64+
unsigned int evictions = 0;
65+
std::set<ClassId> classes{};
66+
67+
for (const auto [tid, pid, cid] : assignedMemory) {
68+
classes.insert(cid);
69+
const auto& mpStats = cache_.getPoolByTid(pid,tid).getStats();
70+
71+
auto batch = strategy_->calculateBatchSize(cache_,tid,pid,cid, cache_.acAllocSize(tid, pid, cid), cache_.acMemorySize(tid, pid, cid));
72+
if (!batch) {
73+
continue;
74+
}
75+
76+
stats.evictionSize.add(batch * mpStats.acStats.at(cid).allocSize);
77+
78+
// try evicting `batch` items from the class in order to reach the free target
79+
auto evicted =
80+
BackgroundEvictorAPIWrapper<CacheT>::traverseAndEvictItems(cache_,
81+
tid,pid,cid,batch);
82+
evictions += evicted;
83+
84+
const size_t cid_id = (size_t)mpStats.acStats.at(cid).allocSize;
85+
auto it = evictions_per_class_.find(cid_id);
86+
if (it != evictions_per_class_.end()) {
87+
it->second += evicted;
88+
} else {
89+
evictions_per_class_[cid_id] = evicted; // first batch for this class: record its count instead of dropping it
90+
}
91+
}
92+
93+
stats.numTraversals.inc();
94+
stats.numEvictedItems.add(evictions);
95+
stats.totalClasses.add(classes.size());
96+
}
97+
98+
template <typename CacheT>
99+
BackgroundEvictionStats BackgroundEvictor<CacheT>::getStats() const noexcept {
100+
BackgroundEvictionStats evicStats;
101+
evicStats.numEvictedItems = stats.numEvictedItems.get();
102+
evicStats.runCount = stats.numTraversals.get();
103+
evicStats.evictionSize = stats.evictionSize.get();
104+
evicStats.totalClasses = stats.totalClasses.get();
105+
106+
return evicStats;
107+
}
108+
109+
template <typename CacheT>
110+
std::map<uint32_t,uint64_t> BackgroundEvictor<CacheT>::getClassStats() const noexcept {
111+
return evictions_per_class_;
112+
}
113+
114+
} // namespace cachelib
115+
} // namespace facebook
Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
1+
/*
2+
* Copyright (c) Intel and its affiliates.
3+
*
4+
* Licensed under the Apache License, Version 2.0 (the "License");
5+
* you may not use this file except in compliance with the License.
6+
* You may obtain a copy of the License at
7+
*
8+
* http://www.apache.org/licenses/LICENSE-2.0
9+
*
10+
* Unless required by applicable law or agreed to in writing, software
11+
* distributed under the License is distributed on an "AS IS" BASIS,
12+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
* See the License for the specific language governing permissions and
14+
* limitations under the License.
15+
*/
16+
17+
#pragma once
18+
19+
#include <gtest/gtest_prod.h>
20+
#include <folly/concurrency/UnboundedQueue.h>
21+
22+
#include "cachelib/allocator/CacheStats.h"
23+
#include "cachelib/common/PeriodicWorker.h"
24+
#include "cachelib/allocator/BackgroundEvictorStrategy.h"
25+
#include "cachelib/common/AtomicCounter.h"
26+
27+
28+
namespace facebook {
29+
namespace cachelib {
30+
31+
// wrapper that exposes the private APIs of CacheType that are specifically
32+
// needed for the eviction.
33+
template <typename C>
34+
struct BackgroundEvictorAPIWrapper {
35+
36+
static size_t traverseAndEvictItems(C& cache,
37+
unsigned int tid, unsigned int pid, unsigned int cid, size_t batch) {
38+
return cache.traverseAndEvictItems(tid,pid,cid,batch);
39+
}
40+
};
41+
42+
struct BackgroundEvictorStats {
43+
// items evicted
44+
AtomicCounter numEvictedItems{0};
45+
46+
// traversals
47+
AtomicCounter numTraversals{0};
48+
49+
// total number of classes scanned (summed over all traversals)
50+
AtomicCounter totalClasses{0};
51+
52+
// item eviction size
53+
AtomicCounter evictionSize{0};
54+
};
55+
56+
// Periodic worker that evicts items from tiers in batches
57+
// The primary aim is to reduce insertion times for new items in the
58+
// cache
59+
template <typename CacheT>
60+
class BackgroundEvictor : public PeriodicWorker {
61+
public:
62+
using Cache = CacheT;
63+
// @param cache the cache interface
64+
// @param strategy the strategy that computes, for each assigned
65+
// (tier, pool, class), how many items to evict
66+
// in one batch
67+
BackgroundEvictor(Cache& cache,
68+
std::shared_ptr<BackgroundEvictorStrategy> strategy);
69+
70+
~BackgroundEvictor() override;
71+
72+
BackgroundEvictionStats getStats() const noexcept;
73+
std::map<uint32_t,uint64_t> getClassStats() const noexcept;
74+
75+
void setAssignedMemory(std::vector<std::tuple<TierId, PoolId, ClassId>> &&assignedMemory);
76+
77+
private:
78+
std::map<uint32_t,uint64_t> evictions_per_class_;
79+
80+
// cache allocator's interface for evicting
81+
82+
using Item = typename Cache::Item;
83+
84+
Cache& cache_;
85+
std::shared_ptr<BackgroundEvictorStrategy> strategy_;
86+
87+
// implements the actual logic of running the background evictor
88+
void work() override final;
89+
void checkAndRun();
90+
91+
BackgroundEvictorStats stats;
92+
93+
std::vector<std::tuple<TierId, PoolId, ClassId>> assignedMemory_;
94+
folly::DistributedMutex mutex;
95+
};
96+
} // namespace cachelib
97+
} // namespace facebook
98+
99+
#include "cachelib/allocator/BackgroundEvictor-inl.h"
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
1+
/*
2+
* Copyright (c) Facebook, Inc. and its affiliates.
3+
*
4+
* Licensed under the Apache License, Version 2.0 (the "License");
5+
* you may not use this file except in compliance with the License.
6+
* You may obtain a copy of the License at
7+
*
8+
* http://www.apache.org/licenses/LICENSE-2.0
9+
*
10+
* Unless required by applicable law or agreed to in writing, software
11+
* distributed under the License is distributed on an "AS IS" BASIS,
12+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
* See the License for the specific language governing permissions and
14+
* limitations under the License.
15+
*/
16+
17+
#pragma once
18+
19+
#include "cachelib/allocator/Cache.h"
20+
21+
namespace facebook {
22+
namespace cachelib {
23+
24+
25+
// Base class for background eviction strategy.
26+
class BackgroundEvictorStrategy {
27+
28+
public:
29+
virtual size_t calculateBatchSize(const CacheBase& cache,
30+
unsigned int tid,
31+
PoolId pid,
32+
ClassId cid,
33+
size_t allocSize,
34+
size_t acMemorySize) = 0;
35+
};
36+
37+
} // namespace cachelib
38+
} // namespace facebook
