Commit 56a3cc0
feat(deferred-allocate): Part 3: Defer allocate
This implements deferred allocation, which delays allocating space for nodes until commit time. The nodestore refactor removes the 'new' hashmap, which mapped newly allocated nodes to their contents in a proposal; newly allocated nodes are now stored directly in the `Child` struct. The implementation consisted of:

- A new type, NodeAllocator, to consolidate all the allocation code
- Removing the 'new' hashmap, as freshly allocated nodes are found by traversing the trie and inspecting their state
- Using the most recently committed revision's freelist to allocate new nodes. This will change in a future revision when the freelist is tied to storage rather than the revision.
- Rewriting flush_nodes to serialize, then allocate. Previously, each node was serialized twice: once to determine its size during proposal creation (allocation time) and once to store it
- Changes to the commit flow, separating out the node persistence logic (see comments in commit.rs)

Some miscellaneous changes included in this diff (could be split out):

- Branch children iterators can no longer return references to hashes, since the hashes are sometimes behind an ArcSwap
- Propagate the logger feature in fwdctl (helped with debugging)
- Remove use of stored_len in tests
- size_from_area_index was a method on nodestore but didn't use nodestore; it is now a free function
- uncached_node_and_size was dead code
- Fix a new clippy lint: hash_helper does not need &self in non-ethhash builds

Task completion status:

- [x] Roots may not be persisted (Part 1: #1041)
- [x] BranchNode children can now be a Node, a LinearAddress, or a MaybePersistedNode (Part 2: #1045 and #1047)
- [x] When converting a `MutableProposal` to an `ImmutableProposal`, don't actually allocate space; just create `SharedNode`s from the `Node`s. (Part 3: #1055 and this PR)
- [ ] Remove `NodeStoreHeader` from `NodeStore`. The header should move into the `RevisionManager`.
- [ ] Allocate new nodes first from the recently-freed nodes of an expired revision, then from the freelist. This can be done by maintaining a local freelist of the nodes deleted in the expiring revision and falling back to the actual freelist. Deleted nodes that are not reused must still be pushed onto the freelist.
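The "serialize, then allocate" rewrite of flush_nodes can be sketched as follows. This is a minimal illustration with hypothetical stand-in types (`Node`, `NodeAllocator`, and the encoding are invented here, not the firewood API): serializing first means the byte length is known at allocation time, so each node is encoded exactly once instead of twice.

```rust
// Stand-in node type; the real encoding is firewood's, this one is invented.
#[derive(Clone)]
struct Node {
    partial_path: Vec<u8>,
    value: Vec<u8>,
}

// Toy serializer: length-prefixed path followed by the value bytes.
fn serialize(node: &Node) -> Vec<u8> {
    let mut out = Vec::new();
    out.push(node.partial_path.len() as u8);
    out.extend_from_slice(&node.partial_path);
    out.extend_from_slice(&node.value);
    out
}

/// Stand-in allocator: hands out addresses from a bump pointer.
/// The real NodeAllocator consults the previous revision's freelist.
struct NodeAllocator {
    next_free: u64,
}

impl NodeAllocator {
    fn allocate(&mut self, size: u64) -> u64 {
        let addr = self.next_free;
        self.next_free += size;
        addr
    }
}

/// Serialize each node once; the resulting length drives allocation.
fn flush_nodes(nodes: &[Node], alloc: &mut NodeAllocator) -> Vec<(u64, Vec<u8>)> {
    nodes
        .iter()
        .map(|node| {
            let bytes = serialize(node); // single serialization pass
            let addr = alloc.allocate(bytes.len() as u64); // size is already known
            (addr, bytes)
        })
        .collect()
}

fn main() {
    let nodes = vec![
        Node { partial_path: vec![1, 2], value: b"a".to_vec() },
        Node { partial_path: vec![3], value: b"bc".to_vec() },
    ];
    let mut alloc = NodeAllocator { next_free: 0 };
    let writes = flush_nodes(&nodes, &mut alloc);
    // Each encoded node is 4 bytes, so addresses are 0 and 4.
    assert_eq!(writes[0].0, 0);
    assert_eq!(writes[1].0, 4);
}
```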
1 parent 65a9ded commit 56a3cc0

File tree

16 files changed: +332 additions, −417 deletions

firewood/src/manager.rs

Lines changed: 11 additions & 27 deletions

@@ -145,21 +145,10 @@ impl RevisionManager {
     /// It only contains the address of the nodes that are deleted, which should be very small.
     /// 3. Revision reaping. If more than the maximum number of revisions are kept in memory, the
     ///    oldest revision is reaped.
-    /// 4. Set last committed revision.
+    /// 4. Persist to disk. This includes flushing everything to disk.
+    /// 5. Set last committed revision.
     ///    Set last committed revision in memory.
-    ///    Another commit can start after this but before the node flush is completed.
-    /// 5. Free list flush.
-    ///    Persist/write the free list header.
-    ///    The free list is flushed first to prevent future allocations from using the space allocated to this proposal.
-    ///    This should be done in a single write since the free list headers are small, and must be persisted to disk before starting the next step.
-    /// 6. Node flush.
-    ///    Persist/write all the nodes to disk.
-    ///    Note that since these are all freshly allocated nodes, they will never be referred to by any prior commit.
-    ///    After flushing all nodes, the file should be flushed to disk (fsync) before performing the next step.
-    /// 7. Root move.
-    ///    The root address on disk must be updated.
-    ///    This write can be delayed, but would mean that recovery will not roll forward to this revision.
-    /// 8. Proposal Cleanup.
+    /// 6. Proposal Cleanup.
     ///    Any other proposals that have this proposal as a parent should be reparented to the committed version.
     #[fastrace::trace(short_name = true)]
     #[crate::metrics("firewood.proposal.commit", "proposal commit to storage")]
@@ -170,7 +159,7 @@ impl RevisionManager {
             return Err(RevisionManagerError::NotLatest);
         }

-        let mut committed = proposal.as_committed();
+        let mut committed = proposal.as_committed(current_revision);

        // 2. Persist delete list for this committed revision to disk for recovery

@@ -212,7 +201,12 @@ impl RevisionManager {
            gauge!("firewood.max_revisions").set(self.max_revisions as f64);
        }

-        // 4. Set last committed revision
+        // 4. Persist to disk.
+        // TODO: We can probably do this in another thread, but it requires that
+        // we move the header out of NodeStore, which is in a future PR.
+        committed.persist()?;
+
+        // 5. Set last committed revision
        let committed: CommittedRevision = committed.into();
        self.historical
            .write()
@@ -224,18 +218,8 @@ impl RevisionManager {
                .expect("poisoned lock")
                .insert(hash, committed.clone());
        }
-        // TODO: We could allow other commits to start here using the pending list
-
-        // 5. Free list flush, which will prevent allocating on top of the nodes we are about to write
-        proposal.flush_freelist()?;
-
-        // 6. Node flush
-        proposal.flush_nodes()?;
-
-        // 7. Root move
-        proposal.flush_header()?;

-        // 8. Proposal Cleanup
+        // 6. Proposal Cleanup
        // Free proposal that is being committed as well as any proposals no longer
        // referenced by anyone else.
        self.proposals
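The reshaped commit tail above can be sketched in miniature: the three separate flushes (freelist, nodes, header) collapse into one `persist()` call on the committed revision, and `as_committed` now takes the current revision so the proposal can borrow its freelist. All types below are simplified stand-ins, not the firewood API.

```rust
// Stand-in for a committed revision that knows how to persist itself.
struct Committed {
    persisted: bool,
}

impl Committed {
    // In the real code this flushes the freelist, nodes, and header in order.
    fn persist(&mut self) -> Result<(), String> {
        self.persisted = true;
        Ok(())
    }
}

struct Proposal;

impl Proposal {
    // `current_revision` is threaded in so allocation can reuse the most
    // recently committed revision's freelist (a stand-in here).
    fn as_committed(&self, _current_revision: &Committed) -> Committed {
        Committed { persisted: false }
    }
}

// Steps 4 and 5 of the new commit flow, with earlier steps elided.
fn commit(proposal: &Proposal, current: &Committed) -> Result<Committed, String> {
    let mut committed = proposal.as_committed(current);
    // 4. Persist to disk (replaces flush_freelist/flush_nodes/flush_header)
    committed.persist()?;
    // 5. Set last committed revision (elided)
    Ok(committed)
}

fn main() {
    let current = Committed { persisted: true };
    let committed = commit(&Proposal, &current).unwrap();
    assert!(committed.persisted);
}
```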

firewood/src/proof.rs

Lines changed: 2 additions & 2 deletions

@@ -106,11 +106,11 @@ impl Hashable for ProofNode {
         })
     }

-    fn children(&self) -> impl Iterator<Item = (usize, &HashType)> + Clone {
+    fn children(&self) -> impl Iterator<Item = (usize, HashType)> + Clone {
         self.child_hashes
             .iter()
             .enumerate()
-            .filter_map(|(i, hash)| hash.as_ref().map(|h| (i, h)))
+            .filter_map(|(i, hash)| hash.as_ref().map(|h| (i, h.clone())))
     }
 }

fwdctl/Cargo.toml

Lines changed: 1 addition & 0 deletions

@@ -38,6 +38,7 @@ nonzero_ext = "0.3.0"

 [features]
 ethhash = ["firewood/ethhash"]
+logger = ["firewood/logger"]

 [dev-dependencies]
 # Workspace dependencies

storage/src/checker/mod.rs

Lines changed: 32 additions & 16 deletions

@@ -5,10 +5,10 @@ mod range_set;
 use range_set::LinearAddressRangeSet;

 use crate::logger::warn;
-use crate::nodestore::alloc::{AREA_SIZES, AreaIndex, FreeAreaWithMetadata};
+use crate::nodestore::alloc::{AREA_SIZES, AreaIndex, FreeAreaWithMetadata, size_from_area_index};
 use crate::{
     CheckerError, Committed, HashType, HashedNodeReader, LinearAddress, Node, NodeReader,
-    NodeStore, Path, StoredAreaParent, TrieNodeParent, WritableStorage, hash_node,
+    NodeStore, Path, RootReader, StoredAreaParent, TrieNodeParent, WritableStorage, hash_node,
 };

 use std::cmp::Ordering;
@@ -42,19 +42,35 @@ impl<S: WritableStorage> NodeStore<Committed, S> {
         let mut visited = LinearAddressRangeSet::new(db_size)?;

         // 2. traverse the trie and check the nodes
-        if let (Some(root_address), Some(root_hash)) = (self.root_address(), self.root_hash()) {
+        if let Some(root) = self.root_as_maybe_persisted_node() {
             // the database is not empty, traverse the trie
-            self.check_area_aligned(
-                root_address,
-                StoredAreaParent::TrieNode(TrieNodeParent::Root),
-            )?;
-            self.visit_trie(
-                root_address,
-                HashType::from(root_hash),
-                Path::new(),
-                &mut visited,
-                opt.hash_check,
-            )?;
+            if let Some(root_address) = root.as_linear_address() {
+                self.check_area_aligned(
+                    root_address,
+                    StoredAreaParent::TrieNode(TrieNodeParent::Root),
+                )?;
+                // For hash checking, we need the root hash if it exists
+                #[cfg_attr(not(feature = "ethhash"), expect(clippy::useless_conversion))]
+                let root_hash = self.root_hash().map(HashType::from);
+                if let Some(root_hash) = root_hash {
+                    self.visit_trie(
+                        root_address,
+                        root_hash,
+                        Path::new(),
+                        &mut visited,
+                        opt.hash_check,
+                    )?;
+                } else {
+                    // No root hash available, visit without hash checking
+                    self.visit_trie(
+                        root_address,
+                        HashType::default(), // dummy hash
+                        Path::new(),
+                        &mut visited,
+                        false, // can't check hashes without root hash
+                    )?;
+                }
+            }
         }

         // 3. check the free list - this can happen in parallel with the trie traversal
@@ -71,7 +87,7 @@ impl<S: WritableStorage> NodeStore<Committed, S> {
         Ok(())
     }

-    /// Recursively traverse the trie from the given root address.
+    /// Recursively traverse the trie from the given root node.
     fn visit_trie(
         &self,
         subtree_root_address: LinearAddress,
@@ -134,7 +150,7 @@ impl<S: WritableStorage> NodeStore<Committed, S> {
             parent,
         } = free_area?;
         self.check_area_aligned(addr, StoredAreaParent::FreeList(parent))?;
-        let area_size = Self::size_from_area_index(area_index);
+        let area_size = size_from_area_index(area_index);
         if free_list_id != area_index {
             return Err(CheckerError::FreelistAreaSizeMismatch {
                 address: addr,
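The checker's new root handling can be summarized as a small decision tree: the root is now a "maybe persisted" node, so on-disk layout is only checked when the root actually has a linear address, and hashes are only verified when a root hash is available. The types below are simplified stand-ins invented for illustration.

```rust
// Simplified stand-ins for the checker's root-handling types.
type LinearAddress = u64;
type HashType = [u8; 32];

enum MaybePersistedNode {
    InMemory,                 // freshly created, no disk address yet
    Persisted(LinearAddress), // already written at this address
}

impl MaybePersistedNode {
    fn as_linear_address(&self) -> Option<LinearAddress> {
        match self {
            MaybePersistedNode::Persisted(addr) => Some(*addr),
            MaybePersistedNode::InMemory => None,
        }
    }
}

// Mirrors the checker's branching: no address means nothing on disk to
// check; an address without a hash means traversal with hash_check = false.
fn check_root(root: &MaybePersistedNode, root_hash: Option<HashType>) -> &'static str {
    match root.as_linear_address() {
        None => "nothing on disk to check",
        Some(_addr) => {
            if root_hash.is_some() {
                "visit trie with hash checking"
            } else {
                "visit trie without hash checking" // dummy hash, hash_check = false
            }
        }
    }
}

fn main() {
    assert_eq!(check_root(&MaybePersistedNode::InMemory, None), "nothing on disk to check");
    assert_eq!(
        check_root(&MaybePersistedNode::Persisted(4096), Some([0u8; 32])),
        "visit trie with hash checking"
    );
    assert_eq!(
        check_root(&MaybePersistedNode::Persisted(4096), None),
        "visit trie without hash checking"
    );
}
```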

storage/src/hashednode.rs

Lines changed: 5 additions & 5 deletions

@@ -124,7 +124,7 @@ pub trait Hashable {
     fn value_digest(&self) -> Option<ValueDigest<&[u8]>>;
     /// Each element is a child's index and hash.
     /// Yields 0 elements if the node is a leaf.
-    fn children(&self) -> impl Iterator<Item = (usize, &HashType)> + Clone;
+    fn children(&self) -> impl Iterator<Item = (usize, HashType)> + Clone;
 }

 /// A preimage of a hash.
@@ -138,7 +138,7 @@ pub trait Preimage {
 trait HashableNode {
     fn partial_path(&self) -> impl Iterator<Item = u8> + Clone;
     fn value(&self) -> Option<&[u8]>;
-    fn children_iter(&self) -> impl Iterator<Item = (usize, &HashType)> + Clone;
+    fn children_iter(&self) -> impl Iterator<Item = (usize, HashType)> + Clone;
 }

 impl HashableNode for BranchNode {
@@ -150,7 +150,7 @@ impl HashableNode for BranchNode {
         self.value.as_deref()
     }

-    fn children_iter(&self) -> impl Iterator<Item = (usize, &HashType)> + Clone {
+    fn children_iter(&self) -> impl Iterator<Item = (usize, HashType)> + Clone {
         self.children_hashes()
     }
 }
@@ -164,7 +164,7 @@ impl HashableNode for LeafNode {
         Some(&self.value)
     }

-    fn children_iter(&self) -> impl Iterator<Item = (usize, &HashType)> + Clone {
+    fn children_iter(&self) -> impl Iterator<Item = (usize, HashType)> + Clone {
         iter::empty()
     }
 }
@@ -198,7 +198,7 @@ impl<'a, N: HashableNode> Hashable for NodeAndPrefix<'a, N> {
         self.node.value().map(ValueDigest::Value)
     }

-    fn children(&self) -> impl Iterator<Item = (usize, &HashType)> + Clone {
+    fn children(&self) -> impl Iterator<Item = (usize, HashType)> + Clone {
         self.node.children_iter()
     }
 }
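The switch from `(usize, &HashType)` to `(usize, HashType)` across these iterators follows from the commit message's note that child hashes are sometimes behind an ArcSwap: a returned reference would be tied to a short-lived guard. A minimal stand-in using std (an `RwLock` in place of arc-swap, with invented types) shows the shape of the problem:

```rust
use std::sync::{Arc, RwLock};

type HashType = [u8; 32];

// Stand-in child whose hash lives behind a swap/lock cell, as firewood's
// sometimes does behind an ArcSwap.
struct Child {
    hash: Arc<RwLock<Option<HashType>>>,
}

impl Child {
    // Returning `Option<&HashType>` here would borrow from the read guard,
    // which is dropped at the end of this function; the hash must be
    // cloned out instead.
    fn hash(&self) -> Option<HashType> {
        *self.hash.read().expect("poisoned lock")
    }
}

// Owned items, so the iterator is not tied to any guard lifetime.
fn children_hashes(
    children: &[Option<Child>],
) -> impl Iterator<Item = (usize, HashType)> + Clone + '_ {
    children
        .iter()
        .enumerate()
        .filter_map(|(i, child)| child.as_ref().and_then(|c| c.hash()).map(|h| (i, h)))
}

fn main() {
    let children = vec![
        None,
        Some(Child { hash: Arc::new(RwLock::new(Some([7u8; 32]))) }),
        Some(Child { hash: Arc::new(RwLock::new(None)) }), // hash not set yet
    ];
    let out: Vec<(usize, HashType)> = children_hashes(&children).collect();
    assert_eq!(out, vec![(1, [7u8; 32])]); // only the child with a hash set
}
```

The same filter-map shape appears in the rewritten `BranchNode::children_hashes` in storage/src/node/branch.rs below.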

storage/src/hashers/ethhash.rs

Lines changed: 2 additions & 2 deletions

@@ -173,7 +173,7 @@ impl<T: Hashable> Preimage for T {
         let mut rlp = RlpStream::new_list(17);
         let mut child_iter = self.children().peekable();
         for index in 0..=15 {
-            if let Some(&(child_index, digest)) = child_iter.peek() {
+            if let Some(&(child_index, ref digest)) = child_iter.peek() {
                 if child_index == index {
                     match digest {
                         HashType::Hash(hash) => rlp.append(&hash.as_slice()),
@@ -225,7 +225,7 @@ impl<T: Hashable> Preimage for T {
             self.partial_path().collect::<Box<_>>(),
             true,
         ));
-        rlp.append_raw(rlp_bytes, 1);
+        rlp.append_raw(&rlp_bytes, 1);
         let bytes = rlp.out();
         TrieHash::from(Keccak256::digest(bytes))
     }

storage/src/lib.rs

Lines changed: 1 addition & 1 deletion

@@ -47,7 +47,7 @@ pub use node::{
 };
 pub use nodestore::{
     Committed, HashedNodeReader, ImmutableProposal, LinearAddress, MutableProposal, NodeReader,
-    NodeStore, Parentable, ReadInMemoryNode, RootReader, TrieReader,
+    NodeStore, Parentable, RootReader, TrieReader,
 };

 pub use linear::filebacked::FileBacked;

storage/src/linear/filebacked.rs

Lines changed: 4 additions & 4 deletions

@@ -25,7 +25,7 @@
 use std::fs::{File, OpenOptions};
 use std::io::Read;
-use std::num::NonZero;
+use std::num::{NonZero, NonZeroU64};
 #[cfg(unix)]
 use std::os::unix::fs::FileExt;
 #[cfg(windows)]
@@ -212,13 +212,13 @@ impl WritableStorage for FileBacked {
         }
     }

-    fn write_cached_nodes<'a>(
+    fn write_cached_nodes(
         &self,
-        nodes: impl Iterator<Item = (LinearAddress, &'a SharedNode)>,
+        nodes: impl Iterator<Item = (NonZeroU64, SharedNode)>,
     ) -> Result<(), FileIoError> {
         let mut guard = self.cache.lock().expect("poisoned lock");
         for (addr, node) in nodes {
-            guard.put(addr, node.clone());
+            guard.put(addr.into(), node);
         }
         Ok(())
     }
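The `write_cached_nodes` change takes the iterator by owned `(NonZeroU64, SharedNode)` pairs, so each node is moved into the cache rather than cloned from a borrowed reference. A self-contained sketch with stand-in types (a `HashMap` in place of the real LRU cache, an `Arc<str>` modeling `SharedNode`):

```rust
use std::collections::HashMap;
use std::num::NonZeroU64;
use std::sync::Mutex;

// Stand-in for firewood's SharedNode (a reference-counted node).
type SharedNode = std::sync::Arc<str>;

struct FileBacked {
    // Stand-in cache; the real one is an LRU behind the same Mutex pattern.
    cache: Mutex<HashMap<u64, SharedNode>>,
}

impl FileBacked {
    fn write_cached_nodes(
        &self,
        nodes: impl Iterator<Item = (NonZeroU64, SharedNode)>,
    ) -> Result<(), String> {
        let mut guard = self.cache.lock().expect("poisoned lock");
        for (addr, node) in nodes {
            // Owned pair: the node is moved into the cache, no clone needed.
            guard.insert(addr.into(), node);
        }
        Ok(())
    }
}

fn main() {
    let store = FileBacked { cache: Mutex::new(HashMap::new()) };
    let addr = NonZeroU64::new(2048).unwrap();
    store
        .write_cached_nodes(std::iter::once((addr, SharedNode::from("leaf"))))
        .unwrap();
    assert!(store.cache.lock().unwrap().contains_key(&2048));
}
```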

storage/src/linear/mod.rs

Lines changed: 3 additions & 3 deletions

@@ -19,7 +19,7 @@
     reason = "Found 4 occurrences after enabling the lint."
 )]

-use std::fmt::Debug;
+use std::{fmt::Debug, num::NonZero};
 use std::io::Read;
 use std::ops::Deref;
 use std::path::PathBuf;
@@ -181,9 +181,9 @@ pub trait WritableStorage: ReadableStorage {
     fn write(&self, offset: u64, object: &[u8]) -> Result<usize, FileIoError>;

     /// Write all nodes to the cache (if any)
-    fn write_cached_nodes<'a>(
+    fn write_cached_nodes(
         &self,
-        _nodes: impl Iterator<Item = (LinearAddress, &'a SharedNode)>,
+        _nodes: impl Iterator<Item = (NonZero<u64>, SharedNode)>,
     ) -> Result<(), FileIoError> {
         Ok(())
     }

storage/src/node/branch.rs

Lines changed: 7 additions & 2 deletions

@@ -412,8 +412,13 @@ impl BranchNode {
     }

     /// Returns (index, hash) for each child that has a hash set.
-    pub fn children_hashes(&self) -> impl Iterator<Item = (usize, &HashType)> + Clone {
-        self.children_iter().map(|(idx, (_, hash))| (idx, hash))
+    pub fn children_hashes(&self) -> impl Iterator<Item = (usize, HashType)> + Clone {
+        self.children.iter().enumerate().filter_map(|(i, child)| {
+            child
+                .as_ref()
+                .and_then(|child| child.hash().cloned())
+                .map(|hash| (i, hash))
+        })
     }

     /// Returns (index, address) for each child that has a hash set.
