tylerjensen/toonmaker

ToonMaker

NuGet License: MIT .NET Standard 2.1

A high-performance .NET library for encoding JSON, YAML, and Markdown to TOON (Token-Oriented Object Notation) format, optimized for Large Language Model (LLM) consumption.

Overview

ToonMaker converts structured data formats into a compact, token-efficient representation specifically designed for AI/LLM applications. By reducing token count while preserving semantic meaning, ToonMaker helps reduce API costs and improve context window utilization.

Key Features:

  • 🚀 High Compression - Up to 42% reduction for complex JSON
  • 🎯 LLM-Optimized - Aggressive semantic compression preserves meaning
  • 📦 Multi-Format - Supports JSON, YAML, and Markdown
  • Minimal Dependencies - Only System.Text.Json required
  • 🔧 Configurable - Multiple encoding modes for different use cases

Installation

Package Manager Console

Install-Package ToonMaker -Version 1.0.0

.NET CLI

dotnet add package ToonMaker

PackageReference

<PackageReference Include="ToonMaker" Version="1.0.0" />

Quick Start

using System.IO;   // File.ReadAllText
using ToonMaker;

// Create encoder
var encoder = new ToonEncoder();

// Encode JSON
string json = """{"users":[{"id":1,"name":"Alice"},{"id":2,"name":"Bob"}]}""";
string toon = encoder.Encode(json);

// Encode YAML
string yaml = File.ReadAllText("config.yaml");
string toonYaml = encoder.EncodeYaml(yaml);

// Encode Markdown
string markdown = File.ReadAllText("README.md");
string toonMarkdown = encoder.EncodeMarkdown(markdown);

Encoding Modes

ToonMaker provides multiple encoding modes to balance compression ratio vs. readability:

Standard Mode

Preserves structure and readability while achieving moderate compression.

var encoder = new ToonEncoder();

// Standard encoding (default)
string toonJson = encoder.Encode(json);               // JSON
string toonYaml = encoder.EncodeYaml(yaml);           // YAML - preserves comments
string toonMarkdown = encoder.EncodeMarkdown(md);     // Markdown

Aggressive Mode

Maximum compression with semantic shorthand optimized for LLM consumption.

var encoder = new ToonEncoder();

// Aggressive encoding for maximum token reduction
string toonYaml = encoder.EncodeYamlAggressive(yaml);         // Removes comments, compresses values
string toonMarkdown = encoder.EncodeMarkdownAggressive(md);   // Semantic compression

Aggressive mode transformations:

  • Removes articles: "the", "a", "an"
  • Replaces operators: "and" → "&", "or" → "|"
  • Abbreviates terms: "Configuration" → "Config", "Database" → "DB"
  • Strips markdown formatting: bold, italic, links
  • Removes YAML comments
  • Converts lists to compact TOON arrays
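
The text-level rules above can be pictured with a small regex-based sketch. This is an illustration of the listed transformations only, not ToonMaker's actual implementation: the class name `AggressiveSketch`, the method `Compress`, and the two-entry abbreviation table are assumptions made for the example, and it skips the YAML comment removal and list conversion.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Illustrative only: a hypothetical helper, not part of the ToonMaker API.
public static class AggressiveSketch
{
    // Assumed abbreviation table; both entries come from the list above.
    private static readonly Dictionary<string, string> Abbreviations =
        new Dictionary<string, string>
        {
            ["Configuration"] = "Config",
            ["Database"] = "DB",
        };

    public static string Compress(string text)
    {
        // Remove the articles "the", "a", "an" as whole words, plus trailing spaces.
        string result = Regex.Replace(text, @"\b(the|an|a)\b\s*", "", RegexOptions.IgnoreCase);

        // Replace the word operators with symbols.
        result = Regex.Replace(result, @"\band\b", "&", RegexOptions.IgnoreCase);
        result = Regex.Replace(result, @"\bor\b", "|", RegexOptions.IgnoreCase);

        // Abbreviate known terms (case-sensitive, whole words).
        foreach (KeyValuePair<string, string> pair in Abbreviations)
        {
            result = Regex.Replace(result, $@"\b{Regex.Escape(pair.Key)}\b", pair.Value);
        }

        // Reduce [text](url) links to their text, then drop bold and italic markers.
        result = Regex.Replace(result, @"\[([^\]]+)\]\([^)]*\)", "$1");
        result = Regex.Replace(result, @"\*\*(.+?)\*\*", "$1");
        result = Regex.Replace(result, @"\*(.+?)\*", "$1");

        return result.Trim();
    }
}
```

With these rules, `AggressiveSketch.Compress("Configure the Database and the cache")` yields `Configure DB & cache`. The rewrite is lossy by design: a blanket article rule also removes a literal "a" or "A" used as a label, so this style of compression suits text read by an LLM rather than text that must round-trip.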

Compression Results

JSON Encoding

| Source | Result | Original | Encoded | Compression | Tokens Saved/1K Requests* |
|---|---|---|---|---|---|
| simple.json | simple.toon | 9,629 bytes | 9,180 bytes | 4.66% | 112,250 |
| medium.json | medium.toon | 70,032 bytes | 55,403 bytes | 20.89% | 3,657,250 |
| complex.json | complex.toon | 112,026 bytes | 64,301 bytes | 42.60% | 11,931,250 |

Average JSON compression: 22.72%

YAML Encoding

| Source | Result | Original (bytes) | Normal | Aggressive | Tokens Saved/1K (Normal)* | Tokens Saved/1K (Aggressive)* |
|---|---|---|---|---|---|---|
| simple.yaml | simple.toon | 13,538 | 18.26% | 19.94% | 618,000 | 675,000 |
| medium.yaml | medium.toon | 91,346 | 3.44% | 4.10% | 784,750 | 936,750 |
| complex.yaml | complex.toon | 123,906 | 11.46% | 12.35% | 3,551,000 | 3,825,000 |

Average YAML compression: 11.05% (Normal), 12.13% (Aggressive)

Markdown Encoding

| Source | Result | Original | Normal | Aggressive | Tokens Saved/1K (Normal)* | Tokens Saved/1K (Aggressive)* |
|---|---|---|---|---|---|---|
| simple.md | simple.toon | 13,024 bytes | 6.56% | 9.71% | 213,500 | 316,000 |
| medium.md | medium.toon | 19,673 bytes | 0.95% | 3.50% | 46,750 | 172,250 |
| complex.md | complex.toon | 40,799 bytes | -0.46% | 0.81% | -46,750 | 82,750 |

Average Markdown compression: 2.35% (Normal), 4.67% (Aggressive)

*Tokens Saved per 1,000 Requests = ((Original bytes - Encoded bytes) / 4) × 1,000, assuming 1 token ≈ 4 characters on average.
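
To make the footnote formula concrete, here is a short sketch of the arithmetic. The class and method names are invented for this example and are not part of ToonMaker; the 4-bytes-per-token ratio is the same rough assumption the footnote states, not a measured tokenizer result.

```csharp
using System;

// Hypothetical helper that reproduces the table arithmetic above.
public static class TokenSavings
{
    // Assumption from the footnote: 1 token is roughly 4 bytes/characters.
    private const double BytesPerToken = 4.0;

    // Size reduction as a percentage of the original size.
    public static double CompressionPercent(long originalBytes, long encodedBytes)
        => (originalBytes - encodedBytes) * 100.0 / originalBytes;

    // Estimated tokens saved when the same payload is sent 1,000 times.
    public static double TokensSavedPerThousandRequests(long originalBytes, long encodedBytes)
        => (originalBytes - encodedBytes) / BytesPerToken * 1000;
}
```

For the complex.json row, `TokenSavings.TokensSavedPerThousandRequests(112026, 64301)` gives 11,931,250 and `CompressionPercent` gives about 42.60%, matching the table. A negative result, as in the complex.md Normal column, means the encoded output was larger than the source.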

Understanding Compression Differences

Why JSON Compresses Best (21-43% on Medium and Complex Files)

JSON achieves the highest compression because:

  1. Structural Redundancy - JSON uses braces {}, brackets [], quotes "", and colons : extensively. TOON replaces these with indentation-based structure.

  2. Repetitive Key Names - Arrays of objects repeat field names. TOON uses tabular format:

    [{"id":1,"name":"Alice"},{"id":2,"name":"Bob"}]

    Becomes:

    [2]{id,name}:
      1,Alice
      2,Bob
    
  3. Quote Elimination - TOON only quotes strings when necessary (containing special characters).

  4. Complexity Correlation - More complex/nested JSON has more redundancy to eliminate, hence higher compression for complex files.
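
Points 2 and 3 can be sketched with System.Text.Json alone. The following is a simplified illustration, not ToonMaker's encoder: it assumes a top-level array of flat objects that all share the same keys, uses a comma delimiter to match the example above, and applies one possible "quote only when necessary" rule. `TabularSketch` and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.Json;

// Illustrative sketch only; the real encoder handles many more cases.
public static class TabularSketch
{
    public static string EncodeUniformArray(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Array || root.GetArrayLength() == 0)
            throw new ArgumentException("Expected a non-empty JSON array.", nameof(json));

        // Field names are read from the first object and written once in the header.
        List<string> fields = root[0].EnumerateObject().Select(p => p.Name).ToList();

        var sb = new StringBuilder();
        sb.Append('[').Append(root.GetArrayLength()).Append("]{")
          .Append(string.Join(",", fields)).Append("}:");

        // Each object becomes one indented, comma-separated row.
        foreach (JsonElement row in root.EnumerateArray())
        {
            IEnumerable<string> cells = fields.Select(f => FormatCell(row.GetProperty(f)));
            sb.Append('\n').Append("  ").Append(string.Join(",", cells));
        }
        return sb.ToString();
    }

    // Assumed quoting rule: quote a string only if it would be ambiguous inside a row.
    private static string FormatCell(JsonElement value)
    {
        if (value.ValueKind != JsonValueKind.String)
            return value.GetRawText();   // numbers, true, false, null as written

        string text = value.GetString() ?? "";
        bool needsQuotes = text.Length == 0
            || text != text.Trim()
            || text.IndexOfAny(new[] { ',', ':', '"', '\n' }) >= 0;
        return needsQuotes ? JsonSerializer.Serialize(text) : text;
    }
}
```

The saving comes from the header: `id` and `name` appear once instead of once per element, so the benefit grows with the array length. That is consistent with the larger reductions reported for the medium and complex files.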

Why Simpler JSON Files Compress Less (4-5%)

Simple JSON files have:

  • Fewer nested structures (less brace/bracket overhead)
  • Shorter arrays (less key repetition benefit)
  • More unique values relative to structure
  • Higher data-to-syntax ratio already

Why YAML/Markdown Compress Less

YAML already uses indentation-based structure similar to TOON, but still achieves good compression:

  • Simple array syntax conversion saves significant space
  • Multiline strings are collapsed efficiently
  • Key-value syntax is streamlined
  • Average 11-12% compression achieved

Markdown is prose-heavy, which leaves little to remove:

  • Text content cannot be compressed without semantic loss
  • Minimal structural overhead to remove
  • Headers and lists are already compact

The savings that remain come from structure and formatting:

  • Tables provide some compression opportunity
  • List items are converted to compact TOON arrays
  • Bold/italic markers are removed for cleaner output
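
Because prose-heavy input can come out slightly larger than it went in (the complex.md row shows -0.46% in Normal mode), a caller may want to keep the original text whenever encoding does not help. The guard below is one possible way to do that; `EncodeGuard` is a hypothetical helper, not part of ToonMaker, and it accepts any string-to-string encoder as a delegate.

```csharp
using System;
using System.Text;

// Hypothetical wrapper: keeps the encoder output only when it is actually smaller.
public static class EncodeGuard
{
    public static string EncodeIfSmaller(string source, Func<string, string> encode)
    {
        if (source is null) throw new ArgumentNullException(nameof(source));
        if (encode is null) throw new ArgumentNullException(nameof(encode));

        string encoded = encode(source);

        // Compare UTF-8 byte counts, the same unit the tables above report.
        int originalBytes = Encoding.UTF8.GetByteCount(source);
        int encodedBytes = Encoding.UTF8.GetByteCount(encoded);

        return encodedBytes < originalBytes ? encoded : source;
    }
}
```

A call would look like `EncodeGuard.EncodeIfSmaller(markdown, md => encoder.EncodeMarkdown(md))`. The lambda is needed because a method with an optional header parameter does not convert directly to `Func<string, string>`. With this guard the result may be either TOON or the untouched source, so it only fits consumers that accept both.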

Solution: Aggressive Mode

Aggressive mode addresses these limits by applying semantic compression:

  • Comment removal saves space on verbose YAML
  • Article/phrase compression saves ~3-8% on English text
  • Key abbreviations for common terms

Advanced Configuration

Encoding Options

var options = new ToonEncoder.EncodeOptions
{
    // Indentation
    Indent = 2,                           // Spaces per level (default: 2)
    
    // Array delimiters
    Delimiter = ToonEncoder.Delimiter.Pipe,  // Pipe (default), Comma, or Tab
    
    // Key folding - collapse wrapper chains
    KeyFolding = true,                    // data.metadata.items: instead of nested
    FlattenDepth = 3,                     // Max depth to fold
    
    // Output style
    DslStyle = true,                      // Section headers for arrays
    YamlTables = true,                    // YAML-style for tabular data
    ExpandTableRows = false,              // Single-line vs expanded rows
    
    // Nested array extraction
    ExtractNestedArrays = 3               // Extract nested arrays as sections
};

var encoder = new ToonEncoder(options);

Static Methods

// Quick encoding with defaults
string toon = ToonEncoder.ToToon(json);

// Tight encoding (tab delimiter + key folding)
string tight = ToonEncoder.ToToonTight(json);

Optional Headers

Add context headers for LLM consumption:

string toonJson = encoder.Encode(json, "# API Response Data\n");
string toonYaml = encoder.EncodeYaml(yaml, "## Configuration Settings");
string toonMarkdown = encoder.EncodeMarkdown(md, "# Document Summary");

API Reference

IToonEncoder Interface

public interface IToonEncoder
{
    // JSON encoding
    string Encode(string json, string? markdownHeader = null);
    string Encode(JsonElement element, string? markdownHeader = null);
    
    // YAML encoding
    string EncodeYaml(string yaml, string? markdownHeader = null);
    
    // Markdown encoding
    string EncodeMarkdown(string markdown, string? markdownHeader = null);
}

ToonEncoder Class

| Method | Description |
|---|---|
| Encode(string json) | Encode JSON string to TOON |
| Encode(JsonElement element) | Encode JsonElement to TOON |
| EncodeYaml(string yaml) | Encode YAML with structure optimization |
| EncodeYamlAggressive(string yaml) | Encode YAML with semantic compression |
| EncodeMarkdown(string markdown) | Encode Markdown with formatting cleanup |
| EncodeMarkdownAggressive(string markdown) | Encode Markdown with semantic compression |

TOON Format Specification

TOON (Token-Oriented Object Notation) achieves compression through:

  1. Indentation-based structure - No braces or brackets
  2. Tabular arrays - [N]{fields}: header with CSV-like rows
  3. Minimal quoting - Only when necessary
  4. Key folding - data.config.items: instead of nested levels

Example transformation:

{
  "users": [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"}
  ]
}

Becomes:

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

Based on TOON Specification v2.0
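
Key folding (item 4 above) is not shown in the example transformation, so here is a minimal sketch of the idea. It is an assumption about how folding could work, not ToonMaker's code: a "wrapper" is taken to be an object with exactly one property, and `maxDepth` is a hypothetical stand-in for a fold limit such as `FlattenDepth`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Illustrative sketch; names and the exact folding rule are assumptions.
public static class KeyFoldingSketch
{
    // Follows a chain of single-property objects starting at `value` and returns
    // the dotted key path plus the first value that is not such a wrapper.
    public static (string Path, JsonElement Value) Fold(string key, JsonElement value, int maxDepth)
    {
        var parts = new List<string> { key };
        while (parts.Count < maxDepth && value.ValueKind == JsonValueKind.Object)
        {
            // Look at no more than two properties: exactly one means "wrapper".
            List<JsonProperty> properties = value.EnumerateObject().Take(2).ToList();
            if (properties.Count != 1) break;

            parts.Add(properties[0].Name);
            value = properties[0].Value;
        }
        return (string.Join(".", parts), value);
    }
}
```

Given `{"data":{"metadata":{"items":[1,2]}}}`, calling `Fold("data", root.GetProperty("data"), 3)` returns the path `data.metadata.items` together with the array, so the encoder can emit one key line instead of three nested levels. The returned `JsonElement` is only valid while its `JsonDocument` has not been disposed.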

Requirements

  • .NET Standard 2.1 or higher
  • System.Text.Json 8.0.5+

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Author

Tyler Jensen - GitHub


ToonMaker - Compress your data, expand your context window.
