You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: CHANGELOG.md
+34-38Lines changed: 34 additions & 38 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -5,87 +5,83 @@ All notable changes to the TOON specification will be documented in this file.
5
5
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
8
+
## [2.1] - 2025-11-23
9
+
10
+
### Changed
11
+
12
+
- Canonical encoding for objects as list items (§10):
13
+
- Encoders SHOULD emit `- key[N]{fields}:` only when the list-item object has exactly one field and that field is a tabular array.
14
+
- In all other cases, encoders SHOULD emit a bare `-` line and place all fields at depth +1; tabular array headers then appear at depth +1 and their rows at depth +2.
15
+
8
16
## [2.0] - 2025-11-10
9
17
10
18
### Breaking Changes
11
19
12
-
-**Removed:** Length marker (`#`) prefix in array headers has been completely removed from the specification
13
-
- The `[#N]` format is no longer valid syntax. All array headers MUST use `[N]` format only
14
-
- Encoders MUST NOT emit `[#N]` format
15
-
- Decoders MUST NOT accept `[#N]` format (breaking change from v1.5)
20
+
- Removed `[#N]` length-marker syntax in array headers; `[N]` is now the only valid format.
21
+
- Encoders MUST NOT emit `[#N]`; decoders MUST reject it.
16
22
17
23
### Removed
18
24
19
-
- All references to length marker from terminology (§1.4), header syntax (§6), ABNF grammar, conformance requirements (§13.2), and parsing helpers (Appendix B)
20
-
-`lengthMarker` encoder option removed from all implementations
21
-
- Length marker test fixtures removed
25
+
- The `lengthMarker` encoder option and any CLI flags exposing it.
22
26
23
27
### Migration from v1.5
24
28
25
-
- Update decoder implementations to reject `[#N]` syntax
26
-
- Convert any existing `.toon` files using `[#N]` format to `[N]` format
27
-
- Remove `lengthMarker` option from encoder configurations
28
-
- Remove `--length-marker` CLI flags if present
29
+
- Update decoders to reject `[#N]` syntax.
30
+
- Convert existing `.toon` files using `[#N]` to `[N]`.
31
+
- Remove `lengthMarker` configuration and CLI options.
29
32
30
33
## [1.5] - 2025-11-08
31
34
32
35
### Added
33
36
34
-
- Optional key folding for encoders: `keyFolding="safe"` mode with `flattenDepth` control to collapse single-key object chains into dotted-path notation (§13.4)
35
-
- Optional path expansion for decoders: `expandPaths="safe"` mode to split dotted keys into nested objects, with conflict resolution tied to `strict` option (§13.4, §14.5)
36
-
- IdentifierSegment terminology and path separator definition (fixed to `"."` in v1.5) (§1.9)
37
-
- Deep-merge semantics for path expansion: recursive merge for objects, error on conflict when `strict=true`, last-write-wins (LWW) when `strict=false` (§13.4)
37
+
- Optional key folding for encoders: `keyFolding="safe"` with `flattenDepth` to collapse single-key object chains into dotted paths (§13.4).
38
+
- Optional path expansion for decoders: `expandPaths="safe"` to split dotted keys into nested objects with deep-merge semantics and conflict handling tied to `strict` (§13.4, §14.5).
39
+
- IdentifierSegment terminology and fixed `"."` path separator for safe folding/expansion (§1.9).
38
40
39
41
### Changed
40
42
41
-
-Both new features default to OFF and are fully backward-compatible
42
-
-Safe-mode folding requires IdentifierSegment validation, collision avoidance, and no quoting
43
+
-Safe-mode folding requires IdentifierSegment-only segments, no path separator in segments, no quoting, and collision avoidance.
44
+
-Both features default to `off`and are backward-compatible.
43
45
44
46
## [1.4] - 2025-11-05
45
47
46
48
### Changed
47
49
48
-
- Removed JavaScript-specific normalization details from specification; replaced with language-agnostic requirements (Section 3)
49
-
- Defined canonical number format for encoders: no exponent notation, no trailing zeros, no leading zeros except "0" (Section 2)
50
-
- Clarified decoder handling of exponent notation and out-of-range numbers (Section 2)
51
-
- Expanded `\w` regex notation to explicit character class `[A-Za-z0-9_]` for cross-language clarity (Section 7.3)
52
-
- Clarified non-strict mode tab handling as implementation-defined (Section 12)
50
+
- Generalized normalization rules and defined canonical number format for encoders (no exponent notation, no trailing zeros, no leading zeros except `"0"`), plus decoder handling of exponent forms and out-of-range numbers (§2-§3).
51
+
- Replaced `\w` with explicit `[A-Za-z0-9_]` in key regexes for cross-language clarity (§7.3).
52
+
- Clarified non-strict mode tab handling as implementation-defined (§12).
53
53
54
54
### Added
55
55
56
-
- Appendix G: Host Type Normalization Examples with guidance for Go, JavaScript, Python, and Rust implementations
56
+
- Appendix G with host-type normalization examples for Go, JavaScript, Python, and Rust.
57
57
58
58
## [1.3] - 2025-10-31
59
59
60
60
### Added
61
61
62
-
- Numeric precision requirements: JavaScript implementations SHOULD use `Number.toString()` precision (15-17 digits), all implementations MUST preserve round-trip fidelity (Section 2)
- Numeric precision requirements: JavaScript implementations SHOULD use `Number.toString()` precision (15–17 digits); all implementations MUST preserve round-trip fidelity (§2).
This repository contains the official specification for **Token-Oriented Object Notation (TOON)**, a compact, human-readable encoding of the JSON data model for LLM prompts. It provides a lossless serialization of the same objects, arrays, and primitives as JSON, but in a syntax that minimizes tokens and makes structure easy for models to follow.
@@ -10,7 +10,7 @@ This repository contains the official specification for **Token-Oriented Object
10
10
11
11
[→ Read the full specification (SPEC.md)](./SPEC.md)
Copy file name to clipboardExpand all lines: SPEC.md
+32-34Lines changed: 32 additions & 34 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -2,9 +2,9 @@
2
2
3
3
## Token-Oriented Object Notation
4
4
5
-
**Version:** 2.0
5
+
**Version:** 2.1
6
6
7
-
**Date:** 2025-11-10
7
+
**Date:** 2025-11-23
8
8
9
9
**Status:** Working Draft
10
10
@@ -20,7 +20,7 @@ Token-Oriented Object Notation (TOON) is a line-oriented, indentation-based text
20
20
21
21
## Status of This Document
22
22
23
-
This document is a Working Draft v2.0 and may be updated, replaced, or obsoleted. Implementers should monitor the canonical repository at https://github.com/toon-format/spec for changes.
23
+
This document is a Working Draft v2.1 and may be updated, replaced, or obsoleted. Implementers should monitor the canonical repository at https://github.com/toon-format/spec for changes.
24
24
25
25
This specification is stable for implementation but not yet finalized. Breaking changes may occur in future major versions.
26
26
@@ -499,7 +499,15 @@ Decoding:
499
499
For an object appearing as a list item:
500
500
501
501
- Empty object list item: a single "-" at the list-item indentation level.
502
-
- First field on the hyphen line:
502
+
- Encoding selection (normative):
503
+
- When an object has **exactly one field** and that field encodes to a tabular array, encoders SHOULD use the compact form with the tabular header on the hyphen line:
504
+
- Tabular array: - key[N<delim?>]{fields}:
505
+
- Followed by tabular rows at depth +1 (relative to the hyphen line).
506
+
- For all other cases (multiple fields, or single non-tabular field), encoders SHOULD emit a bare hyphen on its own line:
507
+
- Bare hyphen: -
508
+
- All fields appear at depth +1 under the hyphen line in encounter order, using normal object field rules (Section 8).
509
+
- When a field is a tabular array, its header appears at depth +1 and its rows at depth +2 (relative to the hyphen line).
510
+
- First field on the hyphen line (legacy encoding, still valid for decoding):
503
511
- Primitive: - key: value
504
512
- Primitive array: - key[M<delim?>]: v1<delim>…
505
513
- Tabular array: - key[N<delim?>]{fields}:
@@ -508,7 +516,7 @@ For an object appearing as a list item:
508
516
- Followed by list items at depth +1.
509
517
- Object: - key:
510
518
- Nested object fields appear at depth +2 (i.e., one deeper than subsequent sibling fields of the same list item).
511
-
- Remaining fields of the same object appear at depth +1 under the hyphen line in encounter order, using normal object field rules.
519
+
- Remaining fields of the same object appear at depth +1 under the hyphen line in encounter order, using normal object field rules.
512
520
513
521
Decoding:
514
522
- The first field is parsed from the hyphen line. If it is a nested object (- key:), nested fields are at +2 relative to the hyphen line; subsequent fields of the same list item are at +1.
@@ -992,12 +1000,15 @@ items[2]:
992
1000
Nested tabular inside a list item:
993
1001
```
994
1002
items[1]:
995
-
- users[2]{id,name}:
996
-
1,Ada
997
-
2,Bob
1003
+
-
1004
+
users[2]{id,name}:
1005
+
1,Ada
1006
+
2,Bob
998
1007
status: active
999
1008
```
1000
1009
1010
+
Note: Encoders use this format (bare hyphen with all fields indented) for objects with multiple fields. Older encodings may place the first field on the hyphen line; both are valid for decoders.
This appendix summarizes major changes between spec versions. For the complete changelog, see [`CHANGELOG.md`](./CHANGELOG.md) in the specification repository.
1237
+
1238
+
### v2.1 (2025-11-23)
1239
+
1240
+
- Tightened canonical encoding for objects as list items (§10): bare `-` for multi-field objects, compact `- key[N]{fields}:` only for single-field tabular arrays, to improve visual consistency and LLM readability.
1241
+
1225
1242
### v2.0 (2025-11-10)
1226
1243
1227
-
- Breaking change: Length marker (`#`) prefix in array headers has been completely removed from the specification.
1228
-
- The `[#N]` format is no longer valid syntax. All array headers MUST use `[N]` format only.
1229
-
- Encoders MUST NOT emit `[#N]` format.
1230
-
- Decoders MUST NOT accept `[#N]` format (breaking change from v1.5).
1231
-
- Removed all references to length marker from terminology, grammar, conformance requirements, and parsing helpers.
1244
+
- Removed `[#N]` length-marker syntax from array headers; `[N]` is now the only valid form.
1232
1245
1233
1246
### v1.5 (2025-11-08)
1234
1247
1235
-
- Added optional key folding for encoders: `keyFolding='safe'`mode with `flattenDepth` control (§13.4).
1236
-
- Added optional path expansion for decoders: `expandPaths='safe'`mode with conflict resolution tied to existing `strict` option (§13.4).
1237
-
- Defined safe-mode requirements for folding: IdentifierSegment validation, no path separator in segments, collision avoidance, no quoting required (§7.3, §13.4).
1238
-
- Specified deep-merge semantics for expansion: recursive merge for objects; conflict policy (error in strict mode, LWW when strict=false) for non-objects (§13.4).
1239
-
- Added strict-mode error category for path expansion conflicts (§14.5).
1240
-
- Both features default to OFF; fully backward-compatible.
1248
+
- Added optional key folding (`keyFolding="safe"`) and path expansion (`expandPaths="safe"`) with deep-merge semantics and strict-mode conflict handling (§13.4, §14.5).
0 commit comments