@rjl493456442 rjl493456442 commented Feb 11, 2025

This pull request is part 1 of shipping the core of the archive node in path mode.
The following have been implemented:

  • state history index definition
  • state history indexer
  • state history reader

@rjl493456442 rjl493456442 force-pushed the pbss-archive-p1 branch 4 times, most recently from b661a71 to 53691ee Compare February 13, 2025 05:56
@rjl493456442 rjl493456442 force-pushed the pbss-archive-p1 branch 2 times, most recently from e4ac9f4 to a12c5a2 Compare May 28, 2025 12:01
min uint64 // The minimum state ID retained within the block
max uint64 // The maximum state ID retained within the block
entries uint32 // The number of state mutation records retained within the block
id uint32 // The id of the index block

Member Author

Can we get rid of the fields of id and min?

Member

I think you can easily get rid of min, and it's also possible to not store id. Since id is part of the key, we could just add it to the entry when it's loaded and not store it in the db.

Member Author

I would keep the ID for now, for these reasons:

It's technically possible to resolve the IDs from the database keys by iterating the database, but:

  • it's IO expensive
  • it's not robust, e.g., the stale index blocks will also be scanned
  • it's fairly easy to purge the existing indexes and regenerate them later
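
To make the trade-off in this thread concrete, here is a minimal sketch of the reviewer's suggestion: the id is left out of the stored descriptor and filled in from the database key at load time. The key layout (a prefix followed by a 4-byte big-endian id), the 20-byte encoding, and all function names are assumptions made for illustration; they are not the layout or API used in this PR.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// indexBlockDesc mirrors the four fields shown in the diff hunk.
type indexBlockDesc struct {
	min     uint64
	max     uint64
	entries uint32
	id      uint32
}

// blockKey builds a hypothetical database key: prefix + big-endian id.
func blockKey(prefix []byte, id uint32) []byte {
	key := make([]byte, len(prefix)+4)
	copy(key, prefix)
	binary.BigEndian.PutUint32(key[len(prefix):], id)
	return key
}

// idFromKey recovers the block id from a key built by blockKey.
func idFromKey(prefix, key []byte) (uint32, error) {
	if len(key) != len(prefix)+4 || !bytes.HasPrefix(key, prefix) {
		return 0, errors.New("malformed index block key")
	}
	return binary.BigEndian.Uint32(key[len(prefix):]), nil
}

// encodeWithoutID serializes min, max and entries, leaving id out.
func (d indexBlockDesc) encodeWithoutID() []byte {
	buf := make([]byte, 20)
	binary.BigEndian.PutUint64(buf[0:8], d.min)
	binary.BigEndian.PutUint64(buf[8:16], d.max)
	binary.BigEndian.PutUint32(buf[16:20], d.entries)
	return buf
}

// decodeWithKey rebuilds the descriptor, taking id from the key.
func decodeWithKey(prefix, key, blob []byte) (indexBlockDesc, error) {
	if len(blob) != 20 {
		return indexBlockDesc{}, errors.New("malformed index block descriptor")
	}
	id, err := idFromKey(prefix, key)
	if err != nil {
		return indexBlockDesc{}, err
	}
	return indexBlockDesc{
		min:     binary.BigEndian.Uint64(blob[0:8]),
		max:     binary.BigEndian.Uint64(blob[8:16]),
		entries: binary.BigEndian.Uint32(blob[16:20]),
		id:      id,
	}, nil
}

func main() {
	prefix := []byte("idx-") // hypothetical key prefix
	desc := indexBlockDesc{min: 10, max: 42, entries: 7, id: 3}
	got, err := decodeWithKey(prefix, blockKey(prefix, desc.id), desc.encodeWithoutID())
	fmt.Println(got == desc, err)
}
```

This only covers the case where the key is already known. Enumerating which ids exist would still require iterating the database, which is the cost and robustness concern raised above.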

func (b *batchIndexer) process(h *history, historyID uint64) error {
for _, address := range h.accountList {
b.counter += 1
b.accounts[address] = append(b.accounts[address], historyID)

Member

Just thinking out loud: all of the following code depends on the historyIDs being sorted. If we ever called process with an out-of-order historyID, this would no longer work. Should we make sure the lists are sorted before writing them out?

Member Author

The histories are resolved from the freezer in batches, so I think the assumption holds that histories are processed in order.
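
If a defensive check were wanted anyway, it could be as small as the sketch below. It assumes the per-account list must be strictly increasing (each history ID appended at most once, in ascending order); the function name and error text are hypothetical and not part of this PR.

```go
package main

import "fmt"

// checkStrictlyIncreasing returns an error describing the first position
// at which ids stops being strictly increasing, or nil if the whole list
// is in order. Empty and single-element lists are trivially in order.
func checkStrictlyIncreasing(ids []uint64) error {
	for i := 1; i < len(ids); i++ {
		if ids[i] <= ids[i-1] {
			return fmt.Errorf("history id out of order at index %d: %d after %d", i, ids[i], ids[i-1])
		}
	}
	return nil
}

func main() {
	fmt.Println(checkStrictlyIncreasing([]uint64{1, 2, 5}))
	fmt.Println(checkStrictlyIncreasing([]uint64{1, 3, 2}))
}
```

The check is a single linear pass, so running it once per list before flushing would be cheap compared with the database write itself.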

@MariusVanDerWijden
Member

Are we losing something if we use the account hash instead of the account address?

if err == nil {
return
}
if errors.Is(err, ethdb.ErrTooManyKeys) {

Member

I don't understand this logic. If the range delete errors with TooManyKeys, we will keep retrying the delete. Wouldn't it always error with TooManyKeys then?

Member Author

Nope, each call will make some progress by deleting items. Ultimately it will remove all the items in the range.
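
The behaviour described in this reply can be illustrated with a toy model. The in-memory store below, its per-call limit, and the sentinel error are all hypothetical stand-ins (errTooManyKeys plays the role of ethdb.ErrTooManyKeys); the sketch only shows why a retry loop terminates when every failing call still deletes some keys.

```go
package main

import (
	"errors"
	"fmt"
)

// errTooManyKeys is a stand-in for ethdb.ErrTooManyKeys.
var errTooManyKeys = errors.New("too many keys in deleted range")

// boundedStore is a toy key set that deletes at most limit keys per call.
type boundedStore struct {
	keys  []string
	limit int
}

// deleteRange removes keys in [start, end), up to limit per call. If keys
// in the range are left over, it reports errTooManyKeys, but the keys it
// already removed stay removed.
func (s *boundedStore) deleteRange(start, end string) error {
	deleted, truncated := 0, false
	kept := make([]string, 0, len(s.keys))
	for _, k := range s.keys {
		inRange := k >= start && k < end
		if inRange && deleted < s.limit {
			deleted++
			continue
		}
		if inRange {
			truncated = true
		}
		kept = append(kept, k)
	}
	s.keys = kept
	if truncated {
		return errTooManyKeys
	}
	return nil
}

// deleteRangeFully retries until the range is empty or a different error
// occurs, and returns the number of calls it made.
func deleteRangeFully(s *boundedStore, start, end string) (int, error) {
	for calls := 1; ; calls++ {
		err := s.deleteRange(start, end)
		if err == nil {
			return calls, nil
		}
		if !errors.Is(err, errTooManyKeys) {
			return calls, err
		}
	}
}

func main() {
	s := &boundedStore{keys: []string{"a1", "a2", "a3", "a4", "a5", "b1"}, limit: 2}
	calls, err := deleteRangeFully(s, "a", "b")
	fmt.Println(calls, err, s.keys)
}
```

The loop would spin forever only if a call could fail without removing anything, which is exactly the case the reply rules out.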

@rjl493456442
Member Author

Total storage size of a fully synced archive node is around 1.9TB

gary@dev:~/hdd2$ du -sh geth-ancient-mainnet-archive/
1.5T    geth-ancient-mainnet-archive/
gary@dev:~/hdd2$ du -sh geth-ancient-mainnet-archive/ancient/chain/
921G    geth-ancient-mainnet-archive/ancient/chain/
gary@dev:~/hdd2$ du -sh geth-ancient-mainnet-archive/ancient/state
539G    geth-ancient-mainnet-archive/ancient/state

gary@dev:~$ du -sh mount/geth/geth
413G    mount/geth/geth
+-----------------------+-----------------------------+------------+------------+
|       DATABASE        |          CATEGORY           |    SIZE    |   ITEMS    |
+-----------------------+-----------------------------+------------+------------+
| Key-Value store       | Headers                     | 2.37 MiB   |       3640 |
| Key-Value store       | Bodies                      | 333.02 MiB |       3640 |
| Key-Value store       | Receipt lists               | 316.00 MiB |       3639 |
| Key-Value store       | Difficulties (deprecated)   | 0.00 B     |          0 |
| Key-Value store       | Block number->hash          | 149.26 KiB |       3639 |
| Key-Value store       | Block hash->number          | 888.96 MiB |   22735156 |
| Key-Value store       | Transaction index           | 13.84 GiB  |  401645299 |
| Key-Value store       | Log index filter-map rows   | 0.00 B     |          0 |
| Key-Value store       | Log index last-block-of-map | 0.00 B     |          0 |
| Key-Value store       | Log index block-lv          | 0.00 B     |          0 |
| Key-Value store       | Log bloombits (deprecated)  | 0.00 B     |          0 |
| Key-Value store       | Contract codes              | 10.19 GiB  |    1699192 |
| Key-Value store       | Hash trie nodes             | 0.00 B     |          0 |
| Key-Value store       | Path trie state lookups     | 888.86 MiB |   22732693 |
| Key-Value store       | Path trie account nodes     | 47.26 GiB  |  409852077 |
| Key-Value store       | Path trie storage nodes     | 179.98 GiB | 1791011965 |
| Key-Value store       | Path state history indexes  | 290.62 GiB | 4112383456 |
| Key-Value store       | Verkle trie nodes           | 0.00 B     |          0 |
| Key-Value store       | Verkle trie state lookups   | 0.00 B     |          0 |
| Key-Value store       | Trie preimages              | 2.07 MiB   |      31025 |
| Key-Value store       | Account snapshot            | 13.75 GiB  |  299285136 |
| Key-Value store       | Storage snapshot            | 95.42 GiB  | 1322797351 |
| Key-Value store       | Beacon sync headers         | 18.25 MiB  |      29397 |
| Key-Value store       | Clique snapshots            | 0.00 B     |          0 |
| Key-Value store       | Singleton metadata          | 202.94 MiB |         15 |
| Ancient store (Chain) | Headers                     | 10.92 GiB  |   22731517 |
| Ancient store (Chain) | Hashes                      | 823.78 MiB |   22731517 |
| Ancient store (Chain) | Bodies                      | 655.40 GiB |   22731517 |
| Ancient store (Chain) | Receipts                    | 253.37 GiB |   22731517 |
| Ancient store (State) | Storage.Index               | 203.80 GiB |   22732692 |
| Ancient store (State) | Account.Data                | 141.52 GiB |   22732692 |
| Ancient store (State) | Storage.Data                | 50.38 GiB  |   22732692 |
| Ancient store (State) | History.Meta                | 1.67 GiB   |   22732692 |
| Ancient store (State) | Account.Index               | 141.45 GiB |   22732692 |
+-----------------------+-----------------------------+------------+------------+
|                                    TOTAL            |  2.06 TIB  |            |
+-----------------------+-----------------------------+------------+------------+

@fjl fjl added this to the 1.15.12 milestone Jun 24, 2025
@fjl fjl merged commit 9c5c0e3 into ethereum:master Jun 24, 2025
3 of 4 checks passed
rjl493456442 added a commit to rjl493456442/go-ethereum that referenced this pull request Jul 19, 2025
This pull request is part-1 for shipping the core part of archive node
in PBSS mode.
howjmay pushed a commit to iotaledger/go-ethereum that referenced this pull request Aug 27, 2025
This pull request is part-1 for shipping the core part of archive node
in PBSS mode.
3 participants