core/rawdb, triedb/pathdb: implement history indexer #31156
min     uint64 // The minimum state ID retained within the block
max     uint64 // The maximum state ID retained within the block
entries uint32 // The number of state mutation records retained within the block
id      uint32 // The id of the index block
Can we get rid of the id and min fields?
I think you can easily get rid of min, and it's also possible to not store id. Since id is part of the key, we could just add it to the entry when it's loaded and not store it in the db.
I would keep the ID for a while, for these reasons:
It's technically possible to resolve the IDs from the database keys by iterating the database, but:
- it's IO expensive
- it's not robust, e.g., the stale index blocks will also be scanned
- it's fairly easy to purge the existing indexes and regenerate them later
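To make the trade-off in this thread concrete, here is a minimal sketch of a descriptor that stores all four fields, so a loaded entry is self-describing and nothing has to be recovered from database keys. The fixed-width big-endian layout and the names `descSize`, `encode`, and `decodeIndexBlockDesc` are assumptions for illustration, not the format used in this PR.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// indexBlockDesc mirrors the fields under review. The byte layout used
// below is a hypothetical fixed-width encoding, not the PR's actual format.
type indexBlockDesc struct {
	min     uint64 // minimum state ID retained within the block
	max     uint64 // maximum state ID retained within the block
	entries uint32 // number of state mutation records within the block
	id      uint32 // id of the index block
}

// descSize is the encoded size: two uint64 fields plus two uint32 fields.
const descSize = 8 + 8 + 4 + 4

// encode packs the descriptor into a fixed-size byte slice.
func (d *indexBlockDesc) encode() []byte {
	buf := make([]byte, descSize)
	binary.BigEndian.PutUint64(buf[0:8], d.min)
	binary.BigEndian.PutUint64(buf[8:16], d.max)
	binary.BigEndian.PutUint32(buf[16:20], d.entries)
	binary.BigEndian.PutUint32(buf[20:24], d.id)
	return buf
}

// decodeIndexBlockDesc restores a descriptor without consulting the
// database key, because the id travels with the value.
func decodeIndexBlockDesc(blob []byte) (*indexBlockDesc, error) {
	if len(blob) != descSize {
		return nil, errors.New("corrupted index block descriptor")
	}
	return &indexBlockDesc{
		min:     binary.BigEndian.Uint64(blob[0:8]),
		max:     binary.BigEndian.Uint64(blob[8:16]),
		entries: binary.BigEndian.Uint32(blob[16:20]),
		id:      binary.BigEndian.Uint32(blob[20:24]),
	}, nil
}
```

Dropping id from this layout would save four bytes per descriptor, at the cost of having to thread the key through every load path.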
triedb/pathdb/history_indexer.go
func (b *batchIndexer) process(h *history, historyID uint64) error {
	for _, address := range h.accountList {
		b.counter += 1
		b.accounts[address] = append(b.accounts[address], historyID)
Just thinking out loud: all of the following code depends on the historyIDs being sorted. If we ever called process with an out-of-order historyID, this would no longer work. Should we make sure the lists are sorted before writing them out?
The histories are resolved from the freezer in batches, so I think the assumption holds that histories are processed in order.
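If the ordering assumption were to be enforced rather than relied upon, one cheap option is to check it at append time. The sketch below is an illustration only; `appendOrdered` is a hypothetical helper, not part of the PR.

```go
package main

import "fmt"

// appendOrdered appends id to list only if it is strictly greater than the
// last element, so the list stays sorted without a separate sort pass.
// Hypothetical helper for illustration; the PR does not define it.
func appendOrdered(list []uint64, id uint64) ([]uint64, error) {
	if n := len(list); n > 0 && id <= list[n-1] {
		return list, fmt.Errorf("out-of-order history id: last %d, got %d", list[n-1], id)
	}
	return append(list, id), nil
}
```

The check costs one comparison per append, which is negligible next to reading the histories from the freezer.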
Are we losing something if we use the account hash instead of the account address?
if err == nil {
	return
}
if errors.Is(err, ethdb.ErrTooManyKeys) {
I don't understand this logic. If the range delete errors with ErrTooManyKeys, we will keep retrying the delete. Wouldn't it always fail with ErrTooManyKeys then?
Nope, each call will make some progress by deleting items. Ultimately it will remove all the items in the range.
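The retry behaviour described in this reply can be sketched with a toy store. Everything here is assumed for illustration: `memStore`, `deleteRange`, `deleteAll`, and `errTooManyKeys` (a stand-in for `ethdb.ErrTooManyKeys`) are hypothetical and are not the real ethdb API.

```go
package main

import (
	"errors"
	"sort"
)

// errTooManyKeys is a stand-in for ethdb.ErrTooManyKeys in this sketch.
var errTooManyKeys = errors.New("too many keys in deleted range")

// memStore is a toy key store. keys must be sorted and limit must be
// positive, otherwise the retry loop below cannot make progress.
type memStore struct {
	keys  []string
	limit int // maximum number of keys removed per deleteRange call
}

// deleteRange removes keys in [start, end). If the range holds more than
// limit keys, it removes the first limit keys and reports errTooManyKeys,
// so every call makes progress even when it fails.
func (s *memStore) deleteRange(start, end string) error {
	lo := sort.SearchStrings(s.keys, start)
	hi := sort.SearchStrings(s.keys, end)
	if hi-lo > s.limit {
		s.keys = append(s.keys[:lo], s.keys[lo+s.limit:]...)
		return errTooManyKeys
	}
	s.keys = append(s.keys[:lo], s.keys[hi:]...)
	return nil
}

// deleteAll retries deleteRange until the range is empty and returns the
// number of calls made. Any error other than errTooManyKeys aborts.
func deleteAll(s *memStore, start, end string) (int, error) {
	for calls := 1; ; calls++ {
		err := s.deleteRange(start, end)
		if err == nil {
			return calls, nil
		}
		if !errors.Is(err, errTooManyKeys) {
			return calls, err
		}
	}
}
```

With five keys in range and a limit of two, the loop finishes on the third call: two keys are removed on each of the first two calls, and the last call removes the remaining key and succeeds.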
Total storage size of a fully synced archive node is around 1.9TB.
This pull request is part 1 of shipping the core of the archive node in PBSS (path-based state scheme) mode.
The following has been implemented: