Rearrange bytes from split format to normal format for Unsigned Column types #357

Merged

Conversation

Krmjn09
Collaborator

@Krmjn09 Krmjn09 commented Jul 31, 2025

With this commit, unsigned split columns are deserialised successfully.
The image below shows the test I ran on the compressed ntpl001_staff.root file. The flag field, which is an unsigned integer split column, is deserialised successfully, while fields that are not unsigned (e.g. Nation) produce an error, as expected.
Screenshot 2025-07-31 121619

@Krmjn09 Krmjn09 changed the base branch from master to dev July 31, 2025 06:59
@Krmjn09 Krmjn09 requested review from silverweed and linev July 31, 2025 06:59
const byteSize = getTypeByteSize(coltype),
splitView = new DataView(blob.buffer, blob.byteOffset, blob.byteLength),
count = blob.byteLength / byteSize,
fullBuffer = new ArrayBuffer(blob.byteLength),
Contributor

Rather than fullBuffer/fullBytes I'd call them outBuffer/outBytes

for (let b = 0; b < byteSize; ++b) {
const splitIndex = b * count + i,
byte = splitView.getUint8(splitIndex),
writeIndex = i * byteSize + (LITTLE_ENDIAN ? b : byteSize - 1 - b);
Contributor

You can assume LITTLE_ENDIAN is true (if that were not the case, several other things would break at the moment and there is no big endian platform I'm aware of where you would use a browser anyway)
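With the little-endian assumption the write index reduces to `i * byteSize + b`. A minimal sketch of the resulting loop follows; `unsplitBytes` is a hypothetical helper name rather than the function in this PR, and it takes the element width directly instead of calling `getTypeByteSize`.

```javascript
// Hypothetical helper (name not taken from the PR): rebuild contiguous
// little-endian values from a byte-split buffer.
// `blob` is a DataView, `byteSize` is the width of one element in bytes.
function unsplitBytes(blob, byteSize) {
   if (blob.byteLength % byteSize !== 0)
      throw new Error(`Blob length ${blob.byteLength} is not a multiple of ${byteSize}`);

   const count = blob.byteLength / byteSize,
         splitBytes = new Uint8Array(blob.buffer, blob.byteOffset, blob.byteLength),
         outBuffer = new ArrayBuffer(blob.byteLength),
         outBytes = new Uint8Array(outBuffer);

   for (let i = 0; i < count; ++i) {
      // Byte b of element i is stored at position b * count + i in the split layout
      for (let b = 0; b < byteSize; ++b)
         outBytes[i * byteSize + b] = splitBytes[b * count + i];
   }

   return new DataView(outBuffer);
}

// Two 32-bit values, 0x04030201 and 0x08070605, in split layout:
// all first bytes, then all second bytes, and so on.
const splitExample = new DataView(new Uint8Array([0x01, 0x05, 0x02, 0x06, 0x03, 0x07, 0x04, 0x08]).buffer);
const unsplitExample = unsplitBytes(splitExample, 4);
console.log(unsplitExample.getUint32(0, true).toString(16)); // 4030201
console.log(unsplitExample.getUint32(4, true).toString(16)); // 8070605
```

Reading the result with `getUint32(offset, true)` keeps the little-endian assumption explicit at the call site.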

coltype === ENTupleColumnType.kSplitReal64 ? ENTupleColumnType.kReal64 :
coltype === ENTupleColumnType.kSplitIndex64 ? ENTupleColumnType.kIndex64 :
coltype === ENTupleColumnType.kSplitUInt32 ? ENTupleColumnType.kUInt32 :
coltype;
Contributor

You should assert that coltype is one of those listed above; maybe use a switch instead of a chain of ternary operators and have a throw in the default case
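One way the suggested switch could look is sketched below. `ColumnType` is a stand-in for `ENTupleColumnType` with placeholder numeric values, and `unsplitColumnType` is a hypothetical function name. Only the three mappings visible in the snippet above are included; the real code may handle more split types.

```javascript
// Stand-in for ENTupleColumnType; the numeric values are placeholders,
// not the real enum values.
const ColumnType = Object.freeze({
   kReal64: 1,
   kIndex64: 2,
   kUInt32: 3,
   kSplitReal64: 11,
   kSplitIndex64: 12,
   kSplitUInt32: 13
});

// Hypothetical helper: map a split column type to its non-split counterpart
// and reject anything that is not a known split type.
function unsplitColumnType(coltype) {
   switch (coltype) {
      case ColumnType.kSplitReal64: return ColumnType.kReal64;
      case ColumnType.kSplitIndex64: return ColumnType.kIndex64;
      case ColumnType.kSplitUInt32: return ColumnType.kUInt32;
      default:
         throw new Error(`Unsupported split column type: ${coltype}`);
   }
}

console.log(unsplitColumnType(ColumnType.kSplitUInt32) === ColumnType.kUInt32); // true
```

Unlike the ternary chain, which falls through to `coltype` unchanged, the `default` branch makes an unexpected type fail immediately.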

Collaborator Author

@Krmjn09 Krmjn09 Jul 31, 2025

@silverweed, please check the second commit; I applied all of your suggested changes.

numValues = byteSize ? blob.byteLength / byteSize : undefined;
reader = new RBufferReader(processedBlob),
values = [],
numValues = byteSize ? processedBlob.byteLength / byteSize : undefined;
Contributor

I don't think it makes sense to continue running if byteSize is not defined - you should throw rather than running the loop on processedBlob.byteLength
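A sketch of the fail-fast variant follows, under the assumption that the byte size lookup returns a falsy value for unknown column types. `getNumValues` is a hypothetical helper name; the additional length check is an extra guard, not something requested in the review.

```javascript
// Hypothetical helper: compute the number of values in a page, throwing
// instead of falling back to `undefined` when the element size is unknown.
function getNumValues(byteSize, byteLength) {
   if (!byteSize)
      throw new Error('Cannot determine byte size for this column type');
   // Extra guard (an assumption, not part of the review): reject partial elements
   if (byteLength % byteSize !== 0)
      throw new Error(`Byte length ${byteLength} is not a multiple of ${byteSize}`);
   return byteLength / byteSize;
}

console.log(getNumValues(4, 32)); // 8
```

In the reviewed code this would replace the `byteSize ? ... : undefined` expression, so the later loop always runs with a well-defined `numValues`.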

@Krmjn09 Krmjn09 requested a review from silverweed July 31, 2025 07:39
@silverweed silverweed merged commit f7da60f into root-project:dev Aug 4, 2025
24 checks passed
@Krmjn09 Krmjn09 deleted the Split-Encoding-for-Unsigned-Values branch August 6, 2025 03:36