TOON Format Explained: The Best JSON Alternative for LLM Cost Optimization
TOON Format Explained: The Best JSON Alternative for LLM Cost Optimization
If you're building AI applications, you've likely asked yourself: "How can I reduce my LLM API costs without compromising functionality?" The answer lies in TOON (Token-Oriented Object Notation) - a data format specifically designed to minimize token usage while maintaining full expressiveness.
In this comprehensive guide, we'll break down exactly how TOON works, its syntax rules, and why it's becoming the preferred format for AI developers.
What Makes TOON Different?
TOON was created with a single goal: minimize tokens while maximizing readability. Traditional formats like JSON were designed for general-purpose data interchange, not for the unique constraints of language models where every token costs money.
The Core Innovation
The breakthrough in TOON is its approach to arrays of objects. Instead of repeating field names for each item (like JSON does), TOON declares the schema once and presents data in a CSV-like tabular format.
JSON approach (repetitive):
{ "orders": [ {"id": "ORD001", "total": 299.99, "status": "shipped"}, {"id": "ORD002", "total": 149.50, "status": "pending"}, {"id": "ORD003", "total": 899.00, "status": "delivered"} ] }
Tokens: ~112
TOON approach (declarative):
orders[3]{id,total,status}: ORD001,299.99,shipped ORD002,149.50,pending ORD003,899.00,delivered
Tokens: ~58 (48% savings!)
TOON Syntax Guide
Let's break down every element of TOON syntax with clear examples.
1. Basic Key-Value Pairs
The simplest TOON structure is a key-value pair using colon notation:
name: John Smith age: 32 email: john@example.com active: true balance: 1250.50
Rules:
- No quotes needed for string values
- Boolean values:
trueorfalse - Numbers: written as-is (integers or decimals)
- Null values: use
null
2. Nested Objects
For hierarchical data, use indentation (like YAML):
user: profile: name: Sarah Johnson age: 28 location: New York preferences: theme: dark notifications: true language: en subscription: plan: premium expires: 2025-12-31
Indentation rules:
- Use 2 spaces per level (consistent spacing is critical)
- Child elements must be indented relative to parent
- Empty lines are optional but improve readability
3. Simple Arrays
For primitive arrays (strings, numbers, booleans), use bracket notation:
tags[4]: javascript,react,typescript,nodejs scores[5]: 95,87,92,88,90 flags[3]: true,false,true
Format: arrayName[count]: value1,value2,value3
Key points:
[count]specifies number of elements- Values are comma-separated
- No spaces around commas (spaces are tokens!)
4. Object Arrays (The Power Feature)
This is where TOON shines. For arrays of objects with uniform structure:
employees[4]{id,name,department,salary}: E001,Alice Johnson,Engineering,120000 E002,Bob Smith,Marketing,95000 E003,Carol Williams,Sales,105000 E004,David Brown,Engineering,115000
Anatomy of the declaration:
employees- array name[4]- number of elements{id,name,department,salary}- field names (schema):- declaration terminator- Following lines: data rows (one per element)
Critical rules:
- Each row must have exactly the number of fields declared
- Maintain field order consistently
- Each row is indented with 2 spaces
- Values are comma-separated
5. Complex Nested Structures
TOON handles arbitrarily complex nesting:
company: name: TechCorp Inc founded: 2020 headquarters: city: San Francisco country: USA address: 123 Market St departments[3]{name,headcount,budget}: Engineering,45,2500000 Sales,20,1200000 Marketing,15,800000 investors[2]{name,stake,amount}: Venture Capital A,15.5,5000000 Angel Investor B,8.2,2000000
This demonstrates:
- Nested objects (
headquarters) - Multiple object arrays at the same level
- Mixed data types (strings, numbers, decimals)
6. Handling Special Characters
Quotes for values with commas or special chars:
products[3]{id,name,description}: P001,Laptop,"High-performance laptop, 16GB RAM" P002,Mouse,Wireless optical mouse P003,Monitor,"27-inch 4K display, IPS panel"
Escaping quotes inside quoted strings:
reviews[2]{product,comment}: P001,"Customer said: \"Best laptop ever!\"" P002,"Very smooth, works great"
7. Empty Arrays and Null Values
active_users[0]{id,name,email}: pending_count: 0 last_login: null tags[0]:
Complete Example: E-commerce Order
Let's see a full real-world example:
JSON version (428 tokens):
{ "order": { "id": "ORD-2025-001", "date": "2025-01-18", "customer": { "id": "CUST-10234", "name": "Emily Rodriguez", "email": "emily@example.com", "tier": "gold" }, "items": [ { "sku": "LAPTOP-001", "product": "MacBook Pro 16", "quantity": 1, "price": 2499.00, "discount": 0.10 }, { "sku": "MOUSE-042", "product": "Magic Mouse", "quantity": 2, "price": 79.00, "discount": 0.00 }, { "sku": "KB-088", "product": "Mechanical Keyboard", "quantity": 1, "price": 149.00, "discount": 0.15 } ], "shipping": { "method": "express", "cost": 25.00, "address": "456 Oak Avenue, Seattle, WA 98101" }, "payment": { "method": "credit_card", "status": "completed", "transaction_id": "TXN-9876543210" }, "totals": { "subtotal": 2806.00, "discount": 271.35, "shipping": 25.00, "tax": 228.85, "total": 2788.50 } } }
TOON version (231 tokens - 46% savings!):
order: id: ORD-2025-001 date: 2025-01-18 customer: id: CUST-10234 name: Emily Rodriguez email: emily@example.com tier: gold items[3]{sku,product,quantity,price,discount}: LAPTOP-001,MacBook Pro 16,1,2499.00,0.10 MOUSE-042,Magic Mouse,2,79.00,0.00 KB-088,Mechanical Keyboard,1,149.00,0.15 shipping: method: express cost: 25.00 address: 456 Oak Avenue, Seattle, WA 98101 payment: method: credit_card status: completed transaction_id: TXN-9876543210 totals: subtotal: 2806.00 discount: 271.35 shipping: 25.00 tax: 228.85 total: 2788.50
Token Efficiency Analysis
Let's analyze why TOON saves so many tokens:
JSON Token Breakdown (for array of 100 items)
| Element | Occurrences | Tokens/Each | Total |
|---|---|---|---|
| Field names | 500 (5 fields × 100) | 1-2 | ~750 |
Braces {} | 100 pairs | 1 each | 200 |
Colons : | 500 | 1 | 500 |
Commas , | 499 | 1 | 499 |
Quotes "" | 1000+ | 1 each | 1000+ |
| Values | 500 | ~2 avg | 1000 |
| Total | ~3,949 |
TOON Token Breakdown (same 100 items)
| Element | Occurrences | Tokens/Each | Total |
|---|---|---|---|
| Array declaration | 1 | ~15 | 15 |
| Field names | 5 (once) | 1-2 | ~8 |
| Row structure | 100 | ~2 | 200 |
| Values | 500 | ~2 avg | 1000 |
| Commas | 495 | 1 | 495 |
| Total | ~1,718 |
Savings: 56.5% for this dataset!
Best Practices
DO:
✅ Use TOON for uniform arrays
transactions[100]{id,amount,date}: ...
✅ Keep field names short but meaningful
users[50]{id,name,email,status}:
✅ Indent consistently (2 spaces)
✅ Group related data
metrics: daily[7]{date,users,revenue}: ... monthly[12]{month,users,revenue}: ...
DON'T:
❌ Don't use TOON for highly nested, irregular structures
# This is awkward in TOON, better suited for JSON deeply: nested: irregular: structure: ...
❌ Don't mix indentation styles
# BAD: inconsistent spacing user: name: Alice # 3 spaces email: alice@example.com # 2 spaces
❌ Don't forget to count array elements correctly
# BAD: says [2] but has 3 items items[2]{id,name}: 1,First 2,Second 3,Third # Error!
Converting Between JSON and TOON
Automated Conversion
Use our converter at jsontotoonformatter.com for instant conversion:
import { jsonToToon, toonToJson } from '@/lib/toon-converter'; // JSON → TOON const json = JSON.stringify(data); const { output, tokenSavings } = jsonToToon(json); console.log(`Saved ${tokenSavings}% tokens`); // TOON → JSON const { output: jsonOutput } = toonToJson(toonData); const data = JSON.parse(jsonOutput);
Manual Conversion Tips
- Identify array patterns - Look for repeated object structures
- Extract schema - List unique field names
- Tabularize data - Convert to CSV-like rows
- Simplify nesting - Remove unnecessary braces/brackets
LLM Compatibility
Modern LLMs understand TOON natively:
GPT-4 Example
User: Analyze this sales data in TOON format:
sales[5]{region,q1,q2,q3,q4}:
North,250000,280000,320000,350000
South,180000,195000,210000,225000
East,320000,340000,360000,380000
West,210000,225000,240000,255000
Central,150000,160000,175000,190000
Which region had the highest growth?
GPT-4: The North region had the highest absolute growth,
increasing from $250K in Q1 to $350K in Q4 ($100K total growth, 40%).
No special instructions needed - the LLM parses it correctly!
Common Use Cases
1. Product Catalogs
products[500]{id,name,category,price,stock}: ...
Savings: ~52% vs JSON
2. User Analytics
daily_metrics[30]{date,users,sessions,revenue}: ...
Savings: ~48% vs JSON
3. Transaction Logs
transactions[1000]{id,timestamp,user,amount,status}: ...
Savings: ~55% vs JSON
4. API Responses for LLMs
search_results[20]{id,title,description,score}: ...
Savings: ~45% vs JSON
Performance Impact
Real-World Metrics
Application: Customer support chatbot
- Dataset: 200 knowledge base articles
- JSON size: 45,000 tokens
- TOON size: 24,000 tokens
- Reduction: 47%
Cost Impact (GPT-4 @ $0.03/1K tokens):
- Monthly calls: 100,000
- JSON cost: $135/month
- TOON cost: $72/month
- Annual savings: $756
Scale this to enterprise levels, and savings become substantial.
Conclusion
TOON format represents a paradigm shift in how we structure data for AI applications. By understanding its syntax and applying it correctly, you can:
- Reduce token usage by 30-60%
- Lower API costs significantly
- Fit more data in context windows
- Maintain full data fidelity
The format is human-readable, LLM-friendly, and production-ready. As AI applications become more prevalent, token efficiency will only grow in importance.
Ready to start using TOON? Try our converter and see the difference with your own data.
Questions about TOON syntax? Want to report an edge case? Reach out to our team or visit our documentation.
Ready to Optimize Your LLM Costs?
Try our free JSON to TOON converter and see your potential savings.
Convert to TOON NowTOON Team
Author at JSON to TOON Converter