Interpreting the TOON vs JSON Benchmark Results #195
Replies: 2 comments
Hey @jmfloreszazo, I appreciate the thorough analysis! Is there anything we can include in the TOON docs based on your analysis?
Thanks a lot for the reply, @johannschopplich! Yes, the main thing I think would be valuable to add to the docs is guidance on setting realistic expectations for token savings. Based on the benchmarks I ran (full invoice datasets, compact/minified JSON, and reasoning models like o1), TOON consistently delivers around a ~25% input-token reduction versus already-minified JSON. Documenting that "~25% vs minified JSON" figure would help align expectations, especially for teams integrating TOON into MCP pipelines or multi-agent architectures, where that 25% compounds across thousands of calls. Happy to contribute more details if helpful, and thanks again for your work!
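To make the comparison concrete, here is a minimal sketch of where the savings come from. The invoice rows, field names, and values below are made-up illustrative data (not from the benchmark repo), and the TOON text is hand-written following the tabular layout described in the format docs rather than produced by a TOON library; character count is used only as a rough proxy for tokens.

```python
import json

# Hypothetical invoice rows (illustrative data, not from the benchmark repo).
rows = [
    {"id": 1, "item": "Widget", "qty": 2, "price": 9.99},
    {"id": 2, "item": "Gadget", "qty": 1, "price": 19.99},
    {"id": 3, "item": "Sprocket", "qty": 5, "price": 4.5},
]

# Minified JSON baseline: no whitespace between tokens.
minified = json.dumps(rows, separators=(",", ":"))

# Hand-written TOON-style tabular encoding: one header line declares the
# fields once, then each record becomes a single comma-separated row, so
# keys and braces are not repeated per record.
toon = "rows[3]{id,item,qty,price}:\n" + "\n".join(
    f"  {r['id']},{r['item']},{r['qty']},{r['price']}" for r in rows
)

# Character count is only a coarse stand-in for tokenizer output, but it
# shows the structural reason minified JSON stays larger.
print(len(minified), len(toon))
```

The exact percentage depends on the tokenizer and on how repetitive the field names are, which is consistent with the ~25% figure holding for uniform tabular data like invoices rather than for deeply nested payloads.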
TOON Format: Benchmark & Architecture — what I found when comparing TOON vs JSON (and why the 25% saving actually matters)
Full article: https://medium.com/@jmfloreszazo
Benchmark repo (.NET / C#): https://github.com/jmfloreszazo/dotnet_llm_toon_format_demo
After running real benchmarks with full invoice datasets and reasoning models like o1, I found that TOON doesn’t deliver the 30–60% savings often claimed — but it does deliver a consistent ~25% reduction in input tokens compared to compact JSON, without losing accuracy or increasing latency.
It’s not magic, but it’s real architecture: in MCP pipelines and multi-agent systems, that 25% compounds across thousands of tool-calls and becomes meaningful FinOps impact.
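The compounding claim above is simple arithmetic, but it is worth spelling out. All the numbers below (call volume, tokens per call, price per 1K input tokens) are hypothetical placeholders chosen for illustration; only the ~25% savings rate comes from the benchmarks:

```python
# Back-of-the-envelope FinOps sketch: how a per-call ~25% input-token
# reduction scales across a multi-agent pipeline making many tool-calls.
# All constants except SAVINGS_RATE are hypothetical illustration values.
CALLS_PER_DAY = 10_000          # hypothetical pipeline volume
TOKENS_PER_CALL_JSON = 2_000    # hypothetical input tokens with minified JSON
SAVINGS_RATE = 0.25             # ~25% reduction observed in the benchmarks
PRICE_PER_1K_INPUT = 0.0025     # hypothetical $ per 1K input tokens

tokens_saved = CALLS_PER_DAY * TOKENS_PER_CALL_JSON * SAVINGS_RATE
dollars_saved = tokens_saved / 1_000 * PRICE_PER_1K_INPUT
print(f"{tokens_saved:,.0f} input tokens/day, ${dollars_saved:,.2f}/day")
# → 5,000,000 input tokens/day, $12.50/day
```

Per call the saving is trivial; at pipeline scale it is a line item, which is why the figure matters more for MCP and multi-agent architectures than for one-off prompts.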
What do you think?
Thanks for your work!