[RFC v4.0] Mixed columnar arrays, objectArrayLayout, ignoreNullOrEmpty, excludeEmptyArrays #47

Draft
vhafdal wants to merge 1 commit into toon-format:main from vhafdal:v4.0-draft

vhafdal commented Apr 23, 2026

Summary

This draft PR proposes four new features for TOON v4.0, contributed from a community implementation. The changes are submitted as an RFC for community review before any merge decision.

Proposed Changes

§9.3.2 — Mixed Columnar Arrays (BREAKING)

A new encoding form for object arrays that contain both primitive and complex fields. Primitive fields are extracted into a columnar header; complex fields follow each row as spill lines at depth +2.

Example (indent = 2):
```
[2]{id,name,score}:
  1,Alice,9.5
    tags[2]: a,b
    address:
      city: NY
  2,Bob,7.0
    tags[1]: c
    address:
      city: LA
```
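To make the split between header columns and spill lines concrete, here is a minimal Python sketch of an encoder for this form. It is not taken from the PR or from DevOp.Toon: the function names (`encode_columnar`, `encode_field`, `format_primitive`) are hypothetical, the column set is assumed to come from the first row, rows are assumed to sit at depth +1 with spill lines at depth +2, and quoting/escaping is skipped entirely.

```python
def is_primitive(value):
    """Primitives in this sketch: None, bool, int, float, str."""
    return value is None or isinstance(value, (bool, int, float, str))


def format_primitive(value):
    # Simplified on purpose: no quoting or escaping, which a real encoder needs.
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def encode_field(key, value, depth, indent):
    """Encode one complex field as lines starting at the given depth."""
    pad = " " * (indent * depth)
    if isinstance(value, list):
        # Only arrays of primitives are handled in this sketch.
        items = ",".join(format_primitive(v) for v in value)
        suffix = f" {items}" if items else ""
        return [f"{pad}{key}[{len(value)}]:{suffix}"]
    lines = [f"{pad}{key}:"]
    for k, v in value.items():
        if is_primitive(v):
            lines.append(f"{pad}{' ' * indent}{k}: {format_primitive(v)}")
        else:
            lines.extend(encode_field(k, v, depth + 1, indent))
    return lines


def encode_columnar(rows, indent=2):
    """Hypothetical encoder: primitive fields go in the header, the rest spill."""
    first = rows[0]
    # Assumption: the first row decides which fields are columns.
    columns = [k for k, v in first.items() if is_primitive(v)]
    spill = [k for k in first if k not in columns]
    lines = [f"[{len(rows)}]{{{','.join(columns)}}}:"]
    for row in rows:
        cells = ",".join(format_primitive(row[c]) for c in columns)
        lines.append(" " * indent + cells)          # row at depth +1
        for key in spill:
            lines.extend(encode_field(key, row[key], 2, indent))  # spill at depth +2
    return "\n".join(lines)
```

Feeding it the two records from the example (Alice and Bob, each with `tags` and `address`) yields the same ten lines shown above, with rows indented by 2 spaces and spill lines by 4.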

Breaking impact: v3.x strict-mode decoders will error on this output. v4.0 decoders MUST detect and decode both §9.3 tabular and §9.3.2 columnar forms using the spill-line detection algorithm defined in the new section.
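The spill-line detection algorithm itself lives in the proposed §9.3.2 text and is not reproduced here. As a rough illustration of what a decoder has to do, the following Python sketch groups body lines by indentation. It assumes (this is an assumption, not the spec's wording) that rows sit at depth +1 and that any deeper line belongs to the most recent row; `group_rows` and `is_columnar` are hypothetical names.

```python
def group_rows(text, indent=2):
    """Split a block into its header and (row_line, spill_lines) pairs.

    Assumption: a body line at depth 1 is a row; any deeper line is a
    spill line attached to the most recent row.
    """
    lines = text.splitlines()
    header, body = lines[0], lines[1:]
    groups = []
    for line in body:
        if not line.strip():
            continue  # ignore blank lines in this sketch
        depth = (len(line) - len(line.lstrip(" "))) // indent
        if depth == 1:
            groups.append((line.strip(), []))
        elif depth >= 2 and groups:
            # Keep indentation relative to depth 2 so nested structure survives.
            groups[-1][1].append(line[indent * 2:])
        else:
            raise ValueError(f"unexpected line: {line!r}")
    return header, groups


def is_columnar(groups):
    """A block with no spill lines is indistinguishable from plain tabular."""
    return any(spill for _, spill in groups)
```

Under these assumptions a plain §9.3 tabular block passes through `group_rows` with empty spill lists, which is one way a single decoder path could serve both forms.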

§13.5 — objectArrayLayout encoder option

  • "auto" (default): preserves existing v3.x tabular detection
  • "columnar": activates mixed columnar encoding (§9.3.2)
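One way to picture how the option could steer an encoder is the Python sketch below. It assumes that "existing v3.x tabular detection" means uniform objects with identical key sets and only primitive values, and that in "columnar" mode all-primitive arrays still take the plain tabular form. Both points are assumptions for illustration, and `choose_layout` is a hypothetical helper rather than an API from the PR.

```python
def is_primitive(value):
    return value is None or isinstance(value, (bool, int, float, str))


def is_tabular(rows):
    """Assumed v3.x-style check: same keys everywhere, primitive values only."""
    if not rows or not all(isinstance(r, dict) for r in rows):
        return False
    keys = set(rows[0])
    return bool(keys) and all(
        set(r) == keys and all(is_primitive(v) for v in r.values())
        for r in rows
    )


def choose_layout(rows, object_array_layout="auto"):
    """Return 'tabular', 'columnar', or 'list' for an array of values."""
    if object_array_layout not in ("auto", "columnar"):
        raise ValueError(f"unknown objectArrayLayout: {object_array_layout!r}")
    if is_tabular(rows):
        return "tabular"
    all_objects = bool(rows) and all(isinstance(r, dict) for r in rows)
    if object_array_layout == "columnar" and all_objects:
        return "columnar"
    return "list"  # fall back to the non-tabular list form
```

With "auto", an array whose objects carry a nested `address` falls back to the list form exactly as in v3.x; only an explicit "columnar" opts into the new encoding.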

§13.6 — ignoreNullOrEmpty encoder option (boolean, default true)

Omit object fields whose value is null or "". In columnar arrays, suppress entire columns where all rows are null/empty. Lossy — implementations must document.
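A small Python sketch of the two behaviours described above, field omission and whole-column suppression, may help reviewers judge the lossiness. The helper names are hypothetical, and the sketch assumes only `None` and the empty string count as "null or empty" (so `0` and `False` are kept).

```python
def is_null_or_empty(value):
    # Assumption: only None and "" qualify; 0 and False are real values.
    return value is None or (isinstance(value, str) and value == "")


def drop_null_or_empty(obj):
    """Omit fields of a single object whose value is null or ''."""
    return {k: v for k, v in obj.items() if not is_null_or_empty(v)}


def suppress_empty_columns(rows, columns):
    """Keep only columns with at least one non-null, non-empty cell."""
    return [
        c for c in columns
        if any(not is_null_or_empty(r.get(c)) for r in rows)
    ]
```

Note the round-trip consequence: `{"note": None}` encodes as an empty object, so a decoder cannot tell a missing field from one that was explicitly null.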

§13.7 — excludeEmptyArrays encoder option (boolean, default true)

Omit array-valued fields when the array length is 0. Lossy — implementations must document.
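The same idea for arrays, as a minimal Python sketch. `drop_empty_arrays` is a hypothetical name, and the sketch only looks at the top-level fields of one object; whether the option should recurse into nested objects is not stated in this description.

```python
def drop_empty_arrays(obj):
    """Omit fields whose value is a zero-length list (top level only)."""
    return {
        k: v for k, v in obj.items()
        if not (isinstance(v, list) and len(v) == 0)
    }
```

As with ignoreNullOrEmpty, the information is gone after encoding: `{"tags": []}` and `{}` produce the same output.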

Appendix G.6 — Binary/byte array guidance (non-normative)

Recommendation for typed implementations encoding binary data: Base64 string (default) or numeric array. Both are valid TOON.
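For illustration, the two representations can be produced from a byte string as in the Python sketch below. `encode_bytes` and its `as_numeric_array` flag are hypothetical; only the standard-library `base64` module is used, and how the resulting value is then written out as TOON is left to the encoder.

```python
import base64


def encode_bytes(data: bytes, as_numeric_array: bool = False):
    """Map bytes to a TOON-friendly value: Base64 string or list of ints."""
    if as_numeric_array:
        return list(data)  # each byte becomes an integer in 0..255
    return base64.b64encode(data).decode("ascii")
```

The Base64 form is more compact for anything beyond a few bytes (4 characters per 3 bytes), while the numeric array stays readable without a decoding step.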

Test Fixtures

24 new test fixtures across three files:

  • tests/fixtures/encode/object-array-layout.json — 9 columnar encoding tests
  • tests/fixtures/encode/null-empty.json — 8 ignoreNullOrEmpty tests
  • tests/fixtures/encode/empty-arrays.json — 7 excludeEmptyArrays tests

Versioning

Per VERSIONING.md: the columnar layout introduces new required decoder behavior → MAJOR bump (v4.0). The three encoder options produce valid v3.x output and would qualify as MINOR in isolation, but are bundled here since they interact with the columnar feature.

Open Questions for Community Review

  1. Should objectArrayLayout="auto" also decode columnar documents, or should decoding require an explicit option?
  2. Should ignoreNullOrEmpty and excludeEmptyArrays also be available as standalone v3.1 additions for non-columnar use cases?
  3. Should the default values for ignoreNullOrEmpty and excludeEmptyArrays be false to preserve round-trip fidelity by default?

🤖 Generated with Claude Code

…options

Adds a v4.0 draft proposing four extensions contributed by the .NET community
implementation (toon-format/DevOp.Toon):

- §9.3.2: Mixed Columnar Arrays — new encoding form for object arrays with
  both primitive and complex fields; primitive fields form a columnar header,
  complex fields follow as per-row spill lines at depth +2.
- §13.5: objectArrayLayout encoder option ("auto" | "columnar")
- §13.6: ignoreNullOrEmpty encoder option (bool, default true) — suppress
  all-null/empty fields and columnar columns
- §13.7: excludeEmptyArrays encoder option (bool, default true) — suppress
  zero-length array fields
- Appendix G.6: non-normative guidance for typed binary/byte array encoding

Includes 24 new test fixtures (encode/object-array-layout.json,
encode/null-empty.json, encode/empty-arrays.json) and strict-mode error
entries for columnar row count/width mismatches.

BREAKING: v3.x strict-mode decoders will error on columnar-encoded output.
v4.0 decoders MUST detect and decode both §9.3 tabular and §9.3.2 columnar
forms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

vhafdal commented Apr 24, 2026

Regarding these improvements to the protocol:
I have implemented them for .NET here:
https://github.com/vhafdal/DevOp.Toon

And enabled TOON support in the API:
https://github.com/vhafdal/DevOp.Toon.API

I updated a production system to move from JSON to TOON.
I do not have any concrete data yet, but I see that the data transferred over TCP has decreased by about 30%, and it also seems to have a positive effect on memory and CPU.
I will try to pull some production stats.

So I think the suggested customizations will help TOON evolve and possibly become a contender to JSON ;)
