
draft: branch for partial messages #8066

Draft

agnxsh wants to merge 15 commits into unstable from agnxsh/pmes

Conversation

agnxsh (Contributor) commented Mar 11, 2026

tasks:

  • make a partial column quarantine to cache headers as they come in via gossip
  • that quarantine should also be able to handle PartialColumnDataSidecar until the adequate number of cells is reached, then assemble them into DataColumnSidecar
  • write tests for both of these cases
  • write the new datatypes and gossip validation conditions
  • make these assembled sidecars ingest into ColumnQuarantine
  • write a backward-compatible broadcast mechanism to handle both partial and full data column sidecars; could add a flag for this
  • investigate whether anything is left in the Nimbus Eth2 <-> Nim Libp2p layer in terms of orchestration/compatibility
  • manage eth2processor to deal with partial columns and then, once the requisite number of them arrives, populate the data column quarantine
  • handle the partial-to-full column translation in the getBlobs service as well
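The accumulate-then-assemble flow described in the task list can be modeled roughly as follows. This is an illustrative Python sketch, not the Nimbus API: `PartialColumnQuarantine`, `get_or_create_entry`, and the field names are hypothetical stand-ins for the Nim types being added in this PR.

```python
# Hypothetical model of a partial-column quarantine keyed by
# (block_root, column_index), accumulating cells until the column
# can be assembled into a full DataColumnSidecar equivalent.
class PartialColumnEntry:
    def __init__(self, num_blobs):
        # One slot per blob commitment; None means "cell not yet received".
        self.cells = [None] * num_blobs
        self.proofs = [None] * num_blobs

    def add_cell(self, blob_idx, cell, proof):
        self.cells[blob_idx] = cell
        self.proofs[blob_idx] = proof

    def is_complete(self):
        # Complete once every cell slot has been filled.
        return all(c is not None for c in self.cells)

class PartialColumnQuarantine:
    def __init__(self):
        self.entries = {}

    def get_or_create_entry(self, block_root, col_idx, num_blobs):
        return self.entries.setdefault(
            (block_root, col_idx), PartialColumnEntry(num_blobs))

q = PartialColumnQuarantine()
entry = q.get_or_create_entry("block_root", 0, 2)
entry.add_cell(0, "cell0", "proof0")
assert not entry.is_complete()
entry.add_cell(1, "cell1", "proof1")
assert entry.is_complete()  # ready to assemble and hand to ColumnQuarantine
```

Under this model, assembly into a full sidecar is simply the transition at the moment `is_complete()` first returns true.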

github-actions bot commented Mar 11, 2026

Unit Test Results

12 files ±0   2 492 suites +12   49m 4s ⏱️ +57s
13 029 tests +77   12 482 ✔️ +77   547 💤 ±0   0 ±0
65 816 runs +308   65 106 ✔️ +308   710 💤 ±0   0 ±0

Results for commit 7c3794c. ± Comparison against base commit 0e0d1fc.

♻️ This comment has been updated with latest results.

DataColumn, DataColumnSidecar,
MAX_BLOB_COMMITMENTS_PER_BLOCK

from ../spec/datatypes/deneb import KzgCommitments, KzgProofs
Contributor

If one imports spec/forks, there's no reason to import either ../spec/datatypes/deneb or ../spec/datatypes/fulu explicitly. But it's not obvious to me that it needs/benefits from ../spec/forks to begin with -- what does it use it for?

partialColumns* {.
  defaultValue: false,
  desc: "Backward compatible partial data column sidecar support",
  name: "partial-columns" .}: bool
tersec (Contributor) commented Mar 22, 2026

Better to leave this hidden and debug-foo for now.

If we need to make it visible and part of the official set of options, can do that easily, can't go back the other way as easily.

In particular, once eventually deployed, it should not need this option at all. It should just be on, by default, because that's how Ethereum will work.


var cellIdx = 0
for blobIdx in 0 ..< int(MAX_BLOB_COMMITMENTS_PER_BLOCK):
  if sidecar.cells_present_bitmap[Natural(blobIdx)]:
Contributor

Is Natural necessary here?
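For context on the loop being reviewed above: the sidecar's cells are stored compactly while the presence bitmap is indexed by blob position, so walking the bitmap recovers which blob each compact cell belongs to. A minimal Python model of that mapping (function name and values are illustrative, not from the PR):

```python
def present_blob_indices(bitmap):
    """Blob indices whose cells are present, in compact-cell order.

    The k-th stored cell in the sidecar corresponds to the k-th entry
    of this list, which is what the cellIdx counter tracks in the loop.
    """
    return [i for i, present in enumerate(bitmap) if present]

bitmap = [False] * 8
bitmap[1] = bitmap[4] = True
assert present_blob_indices(bitmap) == [1, 4]
```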

quarantine: ref Quarantine,
blobQuarantine: ref BlobQuarantine,
dataColumnQuarantine: ref ColumnQuarantine,
partialColumnQuarantine: ref PartialColumnQuarantine,
Contributor

Ultimately it doesn't really make sense to have two different quarantines here, or?

If a full data column is present in the data column quarantine, then it should be representable in the more general partialColumnQuarantine. And right now, this leads to two possible quarantines to check.
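The unification the reviewer suggests could be modeled like this: a full column is just a partial entry with every cell present, so a single quarantine can serve both lookups. This is a hypothetical Python sketch of the design idea only; `ColumnEntry` and `store_full_column` are illustrative names, not the Nimbus API.

```python
# Sketch: representing a full DataColumnSidecar as the fully-populated
# case of a partial-column entry, so only one quarantine needs checking.
class ColumnEntry:
    def __init__(self, num_blobs):
        self.cells = [None] * num_blobs

    def is_full(self):
        return all(c is not None for c in self.cells)

def store_full_column(entry, cells):
    # Ingesting a complete column fills every slot at once.
    for i, cell in enumerate(cells):
        entry.cells[i] = cell

e = ColumnEntry(3)
assert not e.is_full()
store_full_column(e, ["c0", "c1", "c2"])
assert e.is_full()  # one quarantine, one place to check
```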


ok()


Contributor

extra empty line between functions?

@@ -0,0 +1,973 @@
# beacon_chain
# Copyright (c) 2025-2026 Status Research & Development GmbH
Contributor

This module didn't exist in 2025.

colIdx = ColumnIndex(0)
numBlobs = 3

discard quarantine.getOrCreateEntry(root, colIdx, numBlobs)
Contributor

discard in a test


# Pick column index 0 for the round-trip test
let colIdx = ColumnIndex(0)
discard pcq.getOrCreateEntry(blockRoot, colIdx, numBlobs)
Contributor

A discard here too

wallTime: BeaconTime, subnet_id: uint64):
Result[void, ValidationError] =

# === For all partial messages ===
Contributor

are there other partial messages? Also, this still seems fairly partial-column-specific, not really for "all partial messages"

Contributor Author

technically there's a cell-only situation and a cells + header situation

# TODO Is this an error?
beacon_block_production_errors.inc()
return head # Errors logged in router
when consensusFork != ConsensusFork.Fulu:
Contributor

Presumably the partial cell messaging would continue in some form into Gloas? I'm not sure if/how that's specified though.

Contributor Author

Yes, but the experimental spec branch does not suggest that; I've tried to keep it that way.

# Batch verify that the cells match the corresponding commitments and proofs
var commitments = newSeqOfCap[KzgCommitment](blobIndices.len)
for i in blobIndices:
  commitments.add(all_commitments[i])
Contributor

This loop can be more clearly expressed as a mapIt:

  let commitments = blobIndices.mapIt(all_commitments[it])

return dag.checkedReject(
"PartialDataColumnSidecar: cells and proofs count mismatch")

let column_index = ColumnIndex(subnet_id)
Contributor

Subnet indices can't be uniquely mapped back to data column indices, e.g., see

# https://github.com/ethereum/consensus-specs/blob/v1.6.0-alpha.3/specs/fulu/p2p-interface.md#compute_subnet_for_data_column_sidecar
func compute_subnet_for_data_column_sidecar*(column_index: ColumnIndex): uint64 =
  # Parts of Nimbus use the subnet number and column ID semi-interchangeably
  static: doAssert DATA_COLUMN_SIDECAR_SUBNET_COUNT == NUMBER_OF_COLUMNS
  column_index mod DATA_COLUMN_SIDECAR_SUBNET_COUNT

or compute_subnet_for_data_column_sidecar:

def compute_subnet_for_data_column_sidecar(column_index: ColumnIndex) -> SubnetID:
    return SubnetID(column_index % DATA_COLUMN_SIDECAR_SUBNET_COUNT)

Contributor

To the extent this is true, as it currently is, this needs another one of those static: doAssert DATA_COLUMN_SIDECAR_SUBNET_COUNT == NUMBER_OF_COLUMNS assertions to verify that it's currently safe (but that might change in the future, and it should not compile the moment it isn't).
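The reviewer's point can be checked directly: `column_index mod DATA_COLUMN_SIDECAR_SUBNET_COUNT` is only invertible while the subnet count equals the column count. A quick Python model, using the constants from the quoted spec excerpt (the value 128 reflects the current Fulu parameters; treat it as illustrative):

```python
NUMBER_OF_COLUMNS = 128
DATA_COLUMN_SIDECAR_SUBNET_COUNT = 128  # currently equal to NUMBER_OF_COLUMNS

def compute_subnet_for_data_column_sidecar(column_index):
    return column_index % DATA_COLUMN_SIDECAR_SUBNET_COUNT

# While the counts are equal the mapping is the identity, so
# ColumnIndex(subnet_id) is a safe inverse:
assert all(compute_subnet_for_data_column_sidecar(c) == c
           for c in range(NUMBER_OF_COLUMNS))

# If the subnet count ever shrinks, distinct columns collide on one
# subnet and the inverse mapping becomes ambiguous -- which is exactly
# what the static doAssert is meant to catch at compile time:
fewer_subnets = 64
assert 0 % fewer_subnets == 64 % fewer_subnets
```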


proc routeSignedBeaconBlock*(
router: ref MessageRouter,
blck: electra.SignedBeaconBlock | fulu.SignedBeaconBlock |
Contributor

Is there ever going to be a reason to route a fulu.SignedBeaconBlock (or a gloas.SignedBeaconBlock, when the partial cell dissemination is defined for that fork) without the partialSidecars of the overload added in this PR? It seems like it should just not compile, insofar as it allows silently neglecting to send the partial sidecars.

Or is there a use case for this?

node.forkDigestAtEpoch(contextEpoch), subnet_id)
node.broadcast(topic, data_column)

proc broadcastPartialDataColumnHeader*(
Contributor

Nothing ever seems to call this function?

Contributor Author

Yeah, I thought earlier there'd be a function to publish the header separately, since we have atomic header management in the gossip validation function.

var decompressed = snappy.decode(message.data, maxSize)
let res = if decompressed.len > 0:
block:
var result = ValidationResult.Reject
Contributor

result has special meaning in Nim. Better to use res or any other non-literally-result name for this kind of variable.

Contributor

Here there's already a res, so this could be a different variation of res.

PartialColumnEntry* = object
  ## Tracks accumulated cells for a single (block_root, column_index) pair.
  headerValidated*: bool
  cellsReceived*: BitSeq
Contributor

Is this redundant with the isSome/isNone-ness of each cell/proof entry? Can they diverge?
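The divergence being asked about is easy to model: if a received-bitmap like cellsReceived is tracked separately from the per-slot cell storage, any code path that updates one but not the other leaves the two views inconsistent. A small Python sketch of that failure mode (all names illustrative, not the PR's code):

```python
# Two representations of "which cells have arrived":
cells = [None, None]             # per-slot storage (isSome/isNone analogue)
cells_received = [False, False]  # separate bitmap (cellsReceived analogue)

def add_cell_forgetting_bitmap(idx, cell):
    cells[idx] = cell  # bug: cells_received[idx] is never flipped

add_cell_forgetting_bitmap(0, "cell0")
# The slot says the cell is present; the bitmap says it isn't.
assert cells[0] is not None and not cells_received[0]
```

Deriving the bitmap from the slots (or dropping it entirely) removes this class of bug.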


let
block_root = hash_tree_root(block_header)
column_index = ColumnIndex(subnet_id)
Contributor

This is only true if

  static: doAssert DATA_COLUMN_SIDECAR_SUBNET_COUNT == NUMBER_OF_COLUMNS

so if relying on this assertion implicitly, should add it explicitly, so if that ever changes, the compilation will immediately fail and the fix will be more obvious.

column_index: ColumnIndex):
Result[void, ValidationError] =
let res = p_data_column.verify_partial_data_column_sidecar_kzg_proofs(
all_commitments, column_index)
Contributor

Reduce indentation by 2 spaces.

router: ref MessageRouter,
blck: fulu.SignedBeaconBlock,
someSidecarsOpt: seq[fulu.DataColumnSidecar],
partialSidecars: seq[fulu.PartialDataColumnSidecar],
Contributor

Generally speaking, it seems like the someSidecarsOpt and partialSidecars should be constructible from each other, or?

Contributor Author

Yes, we are passing both so we can broadcast one and add the other to the block processor, without wasting time and resources converting between them.

tersec (Contributor) commented Mar 30, 2026

Merge conflicts in AllTests-mainnet.md and tests/test_block_processor.nim.
