diff --git a/docs.json b/docs.json index b19bf8b..cd6ad57 100644 --- a/docs.json +++ b/docs.json @@ -81,8 +81,7 @@ "pages": [ "features/overview", "features/token-compression", - "features/observability", - "features/automatic-model-selection" + "features/observability" ] }, { diff --git a/features/automatic-model-selection.mdx b/features/automatic-model-selection.mdx index b58bd66..067f5ad 100644 --- a/features/automatic-model-selection.mdx +++ b/features/automatic-model-selection.mdx @@ -6,6 +6,10 @@ icon: circuit-board Edgee's automatic model selection routes requests to the optimal model based on your priorities. Combined with token compression, it can reduce total AI costs by 60-70%. + +This feature is under active development. Some routing strategies and configuration options may be added in future releases. + + ## Cost-Aware Routing Let Edgee automatically select the cheapest model that meets your quality requirements: @@ -19,8 +23,9 @@ const response = await edgee.send({ }); console.log(`Model used: ${response.model}`); // e.g., "gpt-5.2" -console.log(`Cost: $${response.cost.toFixed(4)}`); -console.log(`Tokens saved (compression): ${response.usage.saved_tokens}`); +if (response.compression) { + console.log(`Tokens saved: ${response.compression.saved_tokens}`); +} ``` **How it works:** @@ -201,6 +206,4 @@ await edgee.routing.addRule({ - -This feature is under active development. Some routing strategies and configuration options may be added in future releases. - + diff --git a/features/observability.mdx b/features/observability.mdx index 7cdad18..886c89d 100644 --- a/features/observability.mdx +++ b/features/observability.mdx @@ -6,9 +6,9 @@ icon: eye Edgee provides complete visibility into your AI infrastructure with real-time metrics on costs, token usage, compression savings, performance, and errors. Every request is tracked and exportable for analysis, budgeting, and optimization. -## Cost Tracking +## Token Usage Tracking -Every Edgee response includes detailed cost information so you can track spending in real-time: +Every Edgee response includes detailed token usage information for tracking and cost analysis: ```typescript const response = await edgee.send({ @@ -16,13 +16,19 @@ const response = await edgee.send({ input: 'Your prompt here', }); -console.log(response.cost); // Total cost in USD (e.g., 0.0234) console.log(response.usage.prompt_tokens); // Compressed input tokens console.log(response.usage.completion_tokens); // Output tokens console.log(response.usage.total_tokens); // Total for billing + +// Compression savings (when applied) +if (response.compression) { + console.log(response.compression.input_tokens); // Original tokens + console.log(response.compression.saved_tokens); // Tokens saved + console.log(`${(response.compression.rate * 100).toFixed(1)}%`); // Compression rate +} ``` -**Track spending by:** +**Track usage by:** - Model (GPT-4o vs Claude vs Gemini) - Project or application - Environment (production vs staging) @@ -30,7 +36,7 @@ console.log(response.usage.total_tokens); // Total for billing - Time period (daily, weekly, monthly) - Costs are calculated using real-time provider pricing. Edgee automatically handles rate changes and updates your historical data accordingly. + Use token usage data with provider pricing to calculate costs. The Edgee dashboard automatically calculates costs based on real-time provider pricing. 
## Request Tags for Analytics @@ -143,13 +149,13 @@ If you're using the OpenAI or Anthropic SDKs with Edgee, add tags via the `x-edg **Common tagging strategies:** - + **Environment tagging** Tag by environment: `production`, `staging`, `development` - + **Feature tagging** Tag by feature: `chat`, `summarization`, `code-generation`, `rag-qa` @@ -180,13 +186,16 @@ See exactly how much token compression is saving you on every request: const response = await edgee.send({ model: 'gpt-4o', input: 'Long prompt with lots of context...', + enable_compression: true, }); // Compression details -console.log(response.usage.prompt_tokens_original); // Original token count -console.log(response.usage.prompt_tokens); // After compression -console.log(response.usage.saved_tokens); // Tokens saved -console.log(response.usage.compression_ratio); // Percentage reduction (e.g., 45%) +if (response.compression) { + console.log(response.compression.input_tokens); // Original token count + console.log(response.usage.prompt_tokens); // After compression + console.log(response.compression.saved_tokens); // Tokens saved + console.log(`${(response.compression.rate * 100).toFixed(1)}%`); // Compression rate (e.g., 61.0%) +} ``` **Analyze compression effectiveness:** @@ -208,13 +217,13 @@ console.log(response.usage.compression_ratio); // Percentage reduction (e.g., 45 Track compression ratios over time to identify optimization opportunities - + **By use case** Compare compression effectiveness across different prompt types - + **Top savers** Identify which requests generate the highest savings @@ -265,7 +274,7 @@ Understand how your AI infrastructure is being used: - Cost per model over time - Model switching patterns -## Alerts & Budgets +## Alerts & Budgets (Coming Soon) Stay in control with proactive alerts: @@ -324,32 +333,6 @@ const data = await edgee.analytics.export({ }); ``` -## Dashboard Views - -The Edgee dashboard provides pre-built views for common use cases: - - - - Track spending trends, compare models, and identify cost optimization opportunities. - - - - Monitor token savings, compression ratios, and cumulative cost reductions. - - - - Analyze latency, throughput, error rates, and provider health across regions. - - - - Understand request volume, model distribution, and usage trends over time. - - - - - Dashboard access is included with all Edgee plans. Enterprise customers can customize dashboards and create team-specific views. - - ## What's Next diff --git a/features/token-compression.mdx b/features/token-compression.mdx index 2080833..9eeb1f9 100644 --- a/features/token-compression.mdx +++ b/features/token-compression.mdx @@ -40,6 +40,106 @@ Token compression happens automatically on every request through a four-step pro Compression is most effective for prompts with repeated context (RAG), long system instructions, or verbose multi-turn histories. Simple queries may see minimal compression. +## Enabling Token Compression + +Token compression can be enabled in three ways, giving you flexibility to control compression at the request, API key, or organization level: + +### 1. 
Per Request (SDK) + +Enable compression for specific requests using the SDK: + + + + ```typescript + const response = await edgee.send({ + model: 'gpt-4o', + input: { + "messages": [ + {"role": "user", "content": "Your prompt here"} + ], + "enable_compression": true, + "compression_rate": 0.8 // Target 80% compression (optional) + } + }); + ``` + + + + ```python + response = edgee.send( + model="gpt-4o", + input={ + "messages": [ + {"role": "user", "content": "Your prompt here"} + ], + "enable_compression": True, + "compression_rate": 0.8 # Target 80% compression (optional) + } + ) + ``` + + + + ```go + response, err := client.Send("gpt-4o", edgee.InputObject{ + Messages: []edgee.Message{ + {Role: "user", Content: "Your prompt here"}, + }, + EnableCompression: true, + CompressionRate: 0.8, // Target 80% compression (optional) + }) + ``` + + + + ```rust + let input = InputObject::new(vec![Message::user("Your prompt here")]) + .with_compression(true) + .with_compression_rate(0.8); // Target 80% compression (optional) + + let response = client.send("gpt-4o", input).await?; + ``` + + + +### 2. Per API Key (Console) + +Enable compression for specific API keys in your organization settings. This is useful when you want different compression settings for different applications or environments. + + +Enable compression for specific API keys +Enable compression for specific API keys + + +In the **Tools** section of your console: +1. Toggle **Enable token compression** on +2. Set your target **Compression rate** (0.7-0.9, default 0.75) +3. Under **Scope**, select **Apply to specific API keys** +4. Choose which API keys should use compression + +### 3. Organization-Wide (Console) + +Enable compression for all requests across your entire organization. This is the recommended setting for most users to maximize savings automatically. + + +Enable compression organization-wide +Enable compression organization-wide + + +In the **Tools** section of your console: +1. Toggle **Enable token compression** on +2. Set your target **Compression rate** (0.7-0.9, default 0.75) +3. Under **Scope**, select **Apply to all org requests** +4. All API keys will now use compression by default + + + **Compression rate** controls how aggressively Edgee compresses prompts. A higher rate (e.g., 0.9) attempts more compression but may be less effective, while a lower rate (e.g., 0.7) is more conservative. The default of 0.75 provides a good balance for most use cases. + + + + SDK-level configuration takes precedence over console settings. If you enable compression in your code with `enable_compression: true`, it will override the console configuration for that specific request. 
+ + ## When It Works Best Token compression delivers the highest savings for these common use cases: @@ -89,16 +189,19 @@ const documents = [ const response = await edgee.send({ model: 'gpt-4o', input: `Answer the question based on these documents:\n\n${documents.join('\n\n')}\n\nQuestion: What is the main topic?`, + enable_compression: true, // Enable compression for this request + compression_rate: 0.8, // Target compression ratio (0-1, e.g., 0.8 = 80%) }); console.log(response.text); // Compression metrics -console.log(`Original tokens: ${response.usage.prompt_tokens_original}`); -console.log(`Compressed tokens: ${response.usage.prompt_tokens}`); -console.log(`Tokens saved: ${response.usage.saved_tokens}`); -console.log(`Compression ratio: ${response.usage.compression_ratio}%`); -console.log(`Request cost: $${response.cost.toFixed(4)}`); +if (response.compression) { + console.log(`Original tokens: ${response.compression.input_tokens}`); + console.log(`Compressed tokens: ${response.usage.prompt_tokens}`); + console.log(`Tokens saved: ${response.compression.saved_tokens}`); + console.log(`Compression rate: ${(response.compression.rate * 100).toFixed(1)}%`); +} ``` **Example output:** @@ -107,7 +210,6 @@ Original tokens: 2,450 Compressed tokens: 1,225 Tokens saved: 1,225 Compression ratio: 50% -Request cost: $0.0184 ``` ## Real-World Savings @@ -145,7 +247,7 @@ Here's what token compression means for your monthly AI bill: - Enable compression by default for all requests - Compression happens automatically without configuration - - Track `compression_ratio` to understand effectiveness + - Track `compression.rate` to understand effectiveness - Use response metrics to optimize prompt design @@ -162,14 +264,15 @@ Here's what token compression means for your monthly AI bill: Every Edgee response includes detailed compression metrics: ```typescript +// Usage information response.usage.prompt_tokens // Compressed token count (billed) -response.usage.prompt_tokens_original // Original token count (before compression) -response.usage.saved_tokens // Tokens saved by compression -response.usage.compression_ratio // Percentage reduction response.usage.completion_tokens // Output tokens (unchanged) response.usage.total_tokens // Total for billing calculation -response.cost // Total request cost in USD +// Compression information (when applied) +response.compression.input_tokens // Original token count (before compression) +response.compression.saved_tokens // Tokens saved by compression +response.compression.rate // Compression rate (0-1, e.g., 0.61 = 61%) ``` Use these fields to: diff --git a/images/compression-enabled-by-tag-dark.png b/images/compression-enabled-by-tag-dark.png new file mode 100644 index 0000000..5687134 Binary files /dev/null and b/images/compression-enabled-by-tag-dark.png differ diff --git a/images/compression-enabled-by-tag-light.png b/images/compression-enabled-by-tag-light.png new file mode 100644 index 0000000..3ed6f53 Binary files /dev/null and b/images/compression-enabled-by-tag-light.png differ diff --git a/images/compression-enabled-org-dark.png b/images/compression-enabled-org-dark.png new file mode 100644 index 0000000..957011f Binary files /dev/null and b/images/compression-enabled-org-dark.png differ diff --git a/images/compression-enabled-org-light.png b/images/compression-enabled-org-light.png new file mode 100644 index 0000000..7a1e36a Binary files /dev/null and b/images/compression-enabled-org-light.png differ diff --git a/integrations/anthropic-sdk.mdx 
b/integrations/anthropic-sdk.mdx index e19169c..8c719c3 100644 --- a/integrations/anthropic-sdk.mdx +++ b/integrations/anthropic-sdk.mdx @@ -138,7 +138,7 @@ Stream responses for real-time token delivery: ## Cost Tracking & Compression -Every Edgee response includes token compression metrics through the Anthropic API's `usage` field: +Every Edgee response includes token compression metrics in a dedicated `compression` field: @@ -158,15 +158,13 @@ Every Edgee response includes token compression metrics through the Anthropic AP print(message.content[0].text) - # Compression metrics - usage = message.usage - tokens_saved = usage.input_tokens_original - usage.input_tokens - compression_ratio = (tokens_saved / usage.input_tokens_original) * 100 - - print(f"Original input tokens: {usage.input_tokens_original}") - print(f"Compressed input tokens: {usage.input_tokens}") - print(f"Tokens saved: {tokens_saved}") - print(f"Compression ratio: {compression_ratio:.1f}%") + # Compression metrics (if compression was applied) + if hasattr(message, 'compression') and message.compression: + compression = message.compression + print(f"Original input tokens: {compression.input_tokens}") + print(f"Compressed input tokens: {message.usage.input_tokens}") + print(f"Tokens saved: {compression.saved_tokens}") + print(f"Compression rate: {compression.rate * 100:.1f}%") ``` @@ -187,21 +185,20 @@ Every Edgee response includes token compression metrics through the Anthropic AP console.log(message.content[0].text); - // Compression metrics - const usage = message.usage; - const tokensSaved = usage.input_tokens_original - usage.input_tokens; - const compressionRatio = (tokensSaved / usage.input_tokens_original) * 100; - - console.log(`Original input tokens: ${usage.input_tokens_original}`); - console.log(`Compressed input tokens: ${usage.input_tokens}`); - console.log(`Tokens saved: ${tokensSaved}`); - console.log(`Compression ratio: ${compressionRatio.toFixed(1)}%`); + // Compression metrics (if compression was applied) + if (message.compression) { + const compression = message.compression; + console.log(`Original input tokens: ${compression.input_tokens}`); + console.log(`Compressed input tokens: ${message.usage.input_tokens}`); + console.log(`Tokens saved: ${compression.saved_tokens}`); + console.log(`Compression rate: ${(compression.rate * 100).toFixed(1)}%`); + } ``` - Edgee extends the Anthropic API response with `input_tokens_original` to show the token count before compression. All other fields remain standard Anthropic format. + Edgee extends the Anthropic API response with a `compression` field containing compression metrics (`input_tokens`, `saved_tokens`, `rate`). All standard Anthropic fields remain unchanged. 
## Multi-Provider Access diff --git a/integrations/openai-sdk.mdx b/integrations/openai-sdk.mdx index 68ef2b7..88e040d 100644 --- a/integrations/openai-sdk.mdx +++ b/integrations/openai-sdk.mdx @@ -110,11 +110,12 @@ const completion = await openai.chat.completions.create({ console.log(completion.choices[0].message.content); -// Access compression metrics -const usage = completion.usage; -console.log(`Tokens saved: ${usage.prompt_tokens_original - usage.prompt_tokens}`); -console.log(`Compression ratio: ${((usage.prompt_tokens_original - usage.prompt_tokens) / usage.prompt_tokens_original * 100).toFixed(1)}%`); -console.log(`Total tokens: ${usage.total_tokens}`); +// Access compression metrics (if compression was applied) +if (completion.compression) { + console.log(`Tokens saved: ${completion.compression.saved_tokens}`); + console.log(`Compression rate: ${(completion.compression.rate * 100).toFixed(1)}%`); +} +console.log(`Total tokens: ${completion.usage.total_tokens}`); ``` ```python title="Python" @@ -135,20 +136,17 @@ completion = client.chat.completions.create( print(completion.choices[0].message.content) -# Access compression metrics -usage = completion.usage -tokens_saved = usage.prompt_tokens_original - usage.prompt_tokens -compression_ratio = (tokens_saved / usage.prompt_tokens_original) * 100 - -print(f"Tokens saved: {tokens_saved}") -print(f"Compression ratio: {compression_ratio:.1f}%") -print(f"Total tokens: {usage.total_tokens}") +# Access compression metrics (if compression was applied) +if hasattr(completion, 'compression') and completion.compression: + print(f"Tokens saved: {completion.compression.saved_tokens}") + print(f"Compression rate: {completion.compression.rate * 100:.1f}%") +print(f"Total tokens: {completion.usage.total_tokens}") ``` - Edgee extends the OpenAI API response with `prompt_tokens_original` to show the token count before compression. All other fields remain standard OpenAI format. + Edgee extends the OpenAI API response with a `compression` field containing compression metrics (`input_tokens`, `saved_tokens`, `rate`). All standard OpenAI fields remain unchanged. 
## Advanced Usage diff --git a/introduction.mdx b/introduction.mdx index 6fdc24d..07b6d52 100644 --- a/introduction.mdx +++ b/introduction.mdx @@ -22,8 +22,9 @@ Edgee is an **AI Gateway** that reduces LLM costs by up to 50% through intellige }); console.log(response.text); - console.log(`Tokens saved: ${response.usage.saved_tokens}`); - console.log(`Cost: $${response.cost.toFixed(4)}`); + if (response.compression) { + console.log(`Tokens saved: ${response.compression.saved_tokens}`); + } ``` @@ -39,8 +40,8 @@ Edgee is an **AI Gateway** that reduces LLM costs by up to 50% through intellige ) print(response.text) - print(f"Tokens saved: {response.usage.saved_tokens}") - print(f"Cost: ${response.cost:.4f}") + if response.compression: + print(f"Tokens saved: {response.compression.saved_tokens}") ``` @@ -63,8 +64,9 @@ Edgee is an **AI Gateway** that reduces LLM costs by up to 50% through intellige } fmt.Println(response.Text()) - fmt.Printf("Tokens saved: %d\n", response.Usage.SavedTokens) - fmt.Printf("Cost: $%.4f\n", response.Cost) + if response.Compression != nil { + fmt.Printf("Tokens saved: %d\n", response.Compression.SavedTokens) + } } ``` @@ -77,8 +79,9 @@ Edgee is an **AI Gateway** that reduces LLM costs by up to 50% through intellige let response = client.send("gpt-4o", "What is the capital of France?").await.unwrap(); println!("{}", response.text().unwrap_or("")); - println!("Tokens saved: {}", response.usage.saved_tokens); - println!("Cost: ${:.4}", response.cost); + if let Some(compression) = &response.compression { + println!("Tokens saved: {}", compression.saved_tokens); + } ``` diff --git a/introduction/faq.mdx b/introduction/faq.mdx index 278fb09..c9e49c0 100644 --- a/introduction/faq.mdx +++ b/introduction/faq.mdx @@ -35,7 +35,7 @@ icon: message-circle-question-mark - Multi-turn conversations with growing history - Document analysis with redundant information - Every response includes compression metrics (`saved_tokens`, `compression_ratio`) so you can track your savings in real-time. + Every response includes a `compression` field with metrics (`input_tokens`, `saved_tokens`, `rate`) so you can track your savings in real-time. 
diff --git a/package-lock.json b/package-lock.json index 87fdbb6..1a08bb4 100644 --- a/package-lock.json +++ b/package-lock.json @@ -5,7 +5,7 @@ "packages": { "": { "dependencies": { - "mintlify": "^4.2.310" + "mintlify": "^4.2.314" } }, "node_modules/@alcalzone/ansi-tokenize": { @@ -85,9 +85,9 @@ } }, "node_modules/@babel/code-frame": { - "version": "7.28.6", - "resolved": "https://registry.npmjs.org/@babel/code-frame/-/code-frame-7.28.6.tgz", - "integrity": "sha512-JYgintcMjRiCvS8mMECzaEn+m3PfoQiyqukOMCCVQtoJGYJw8j/8LBJEiqkHLkfwCcs74E3pbAUFNg7d9VNJ+Q==", + "version": "7.29.0", + "resolved": "https://registry.npmjs.org/@babel/code-frame/-/code-frame-7.29.0.tgz", + "integrity": "sha512-9NhCeYjq9+3uxgdtp20LSiJXJvN0FeCtNGpJxuMFZ1Kv3cWUNb6DOhJwUvcVCzKGR66cw4njwM6hrJLqgOwbcw==", "license": "MIT", "dependencies": { "@babel/helper-validator-identifier": "^7.28.5", @@ -982,18 +982,18 @@ } }, "node_modules/@mintlify/cli": { - "version": "4.0.914", - "resolved": "https://registry.npmjs.org/@mintlify/cli/-/cli-4.0.914.tgz", - "integrity": "sha512-L6Ls4qOedK0SkyZIBrfy8utQIt/fCyAX0sJO1yFoZjIPmGMYCSWHftIz9xXU2nWjV4924EoSg/b81+CBccZg6w==", + "version": "4.0.918", + "resolved": "https://registry.npmjs.org/@mintlify/cli/-/cli-4.0.918.tgz", + "integrity": "sha512-JwHE7Uhhog4xqbQmduuETZyWth/avASbGdDB1h9RhsprgIb7W8FR80YRRzUCCLUwz10M/dgqEaBEyr3/+RvKvg==", "license": "Elastic-2.0", "dependencies": { "@inquirer/prompts": "7.9.0", - "@mintlify/common": "1.0.694", - "@mintlify/link-rot": "3.0.853", - "@mintlify/models": "0.0.260", - "@mintlify/prebuild": "1.0.830", - "@mintlify/previewing": "4.0.886", - "@mintlify/validation": "0.1.574", + "@mintlify/common": "1.0.697", + "@mintlify/link-rot": "3.0.856", + "@mintlify/models": "0.0.262", + "@mintlify/prebuild": "1.0.833", + "@mintlify/previewing": "4.0.889", + "@mintlify/validation": "0.1.576", "adm-zip": "0.5.16", "chalk": "5.2.0", "color": "4.2.3", @@ -1018,16 +1018,16 @@ } }, "node_modules/@mintlify/common": { - "version": "1.0.694", - "resolved": "https://registry.npmjs.org/@mintlify/common/-/common-1.0.694.tgz", - "integrity": "sha512-HiIL3+tZlFtrwcNHuIpjXQuBXWKIoVtIzPeFl08Pk88pLAHAphfCH2T4jBdEiaUPnx5BhDRko1HQ2T0uexRFeQ==", + "version": "1.0.697", + "resolved": "https://registry.npmjs.org/@mintlify/common/-/common-1.0.697.tgz", + "integrity": "sha512-xZ/arB2O60ncw+VPQg4jHqaY8huY2fhSTWvbSKSoJZyK+P7asMWXzNBCt0H6vcRf1rine4D2srlCA4ymCHnDHg==", "license": "ISC", "dependencies": { "@asyncapi/parser": "3.4.0", "@mintlify/mdx": "^3.0.4", - "@mintlify/models": "0.0.260", + "@mintlify/models": "0.0.262", "@mintlify/openapi-parser": "^0.0.8", - "@mintlify/validation": "0.1.574", + "@mintlify/validation": "0.1.576", "@sindresorhus/slugify": "2.2.0", "@types/mdast": "4.0.4", "acorn": "8.11.2", @@ -1456,16 +1456,16 @@ } }, "node_modules/@mintlify/link-rot": { - "version": "3.0.853", - "resolved": "https://registry.npmjs.org/@mintlify/link-rot/-/link-rot-3.0.853.tgz", - "integrity": "sha512-zu6gr6RK7tY6WpD5/KJ3Q6zOgXlHfsJNHQp4F82cIaHDH8l4gnNGurnOW2RZcaLbDk+cj7UF3TrDbWT13RZgjg==", + "version": "3.0.856", + "resolved": "https://registry.npmjs.org/@mintlify/link-rot/-/link-rot-3.0.856.tgz", + "integrity": "sha512-4BFxaEJSqJtPM31zV+BD+bbXMnGWxYrtSa8ve3qFx7uSj1WYMVJ4Ag//xybmuzxPGlAN8vla8WXoLkz/6/gp3A==", "license": "Elastic-2.0", "dependencies": { - "@mintlify/common": "1.0.694", - "@mintlify/prebuild": "1.0.830", - "@mintlify/previewing": "4.0.886", + "@mintlify/common": "1.0.697", + "@mintlify/prebuild": "1.0.833", + "@mintlify/previewing": "4.0.889", "@mintlify/scraping": 
"4.0.522", - "@mintlify/validation": "0.1.574", + "@mintlify/validation": "0.1.576", "fs-extra": "11.1.0", "unist-util-visit": "4.1.2" }, @@ -1536,9 +1536,9 @@ } }, "node_modules/@mintlify/models": { - "version": "0.0.260", - "resolved": "https://registry.npmjs.org/@mintlify/models/-/models-0.0.260.tgz", - "integrity": "sha512-M7WpKC4ysrrc5M16fUPFBLbhmdxfOm3LsMeurhQJ7Jc4V8o8DCdqLKkGTs0PZEFPSKx34X1wCBp4YrDx3kBDNQ==", + "version": "0.0.262", + "resolved": "https://registry.npmjs.org/@mintlify/models/-/models-0.0.262.tgz", + "integrity": "sha512-9JNwnx1AtasQi3eP3yh/ffNgAB5ZS17jSE0IPa38QzBn4eMXoLvNQscPlhBp9krAYHpnOWm8VN0G5rV9lsgXvA==", "license": "Elastic-2.0", "dependencies": { "axios": "1.13.2", @@ -1583,15 +1583,15 @@ } }, "node_modules/@mintlify/prebuild": { - "version": "1.0.830", - "resolved": "https://registry.npmjs.org/@mintlify/prebuild/-/prebuild-1.0.830.tgz", - "integrity": "sha512-q696zAc5TvhKFddaIuygM8W20y/3J9D2R/EVsmzKmio9+IfWkHfY/h6WwfqVavHpYdgnFxh/sAZtA4pSb609QQ==", + "version": "1.0.833", + "resolved": "https://registry.npmjs.org/@mintlify/prebuild/-/prebuild-1.0.833.tgz", + "integrity": "sha512-RAnnVDplb1pdY1VzZDoJPd+unRKs0QJo/rrzMPw3ytf69iOS+Z3Ao00CSLPJm3ZQ65HOj0XFZ5qIfJHaoT6jpA==", "license": "Elastic-2.0", "dependencies": { - "@mintlify/common": "1.0.694", + "@mintlify/common": "1.0.697", "@mintlify/openapi-parser": "^0.0.8", - "@mintlify/scraping": "4.0.555", - "@mintlify/validation": "0.1.574", + "@mintlify/scraping": "4.0.558", + "@mintlify/validation": "0.1.576", "chalk": "5.3.0", "favicons": "7.2.0", "front-matter": "4.0.2", @@ -1605,12 +1605,12 @@ } }, "node_modules/@mintlify/prebuild/node_modules/@mintlify/scraping": { - "version": "4.0.555", - "resolved": "https://registry.npmjs.org/@mintlify/scraping/-/scraping-4.0.555.tgz", - "integrity": "sha512-YhxnlyirsKy4huUdUVBcPuPrIymbnu+hR9a9x0sullh7VKGEwPgxo0a0bqGSPdkoGJMBlXlJDOGLb2Ud0/gsdQ==", + "version": "4.0.558", + "resolved": "https://registry.npmjs.org/@mintlify/scraping/-/scraping-4.0.558.tgz", + "integrity": "sha512-CR8CBwrdcr4pQ3EHCLjuK0oet0Ag4Samwqpha3fGZ3FYad2Kaq1ZON7x89GosO4DiZTGQksBv2A9gpOVs0vpzg==", "license": "Elastic-2.0", "dependencies": { - "@mintlify/common": "1.0.694", + "@mintlify/common": "1.0.697", "@mintlify/openapi-parser": "^0.0.8", "fs-extra": "11.1.1", "hast-util-to-mdast": "10.1.0", @@ -1786,14 +1786,14 @@ } }, "node_modules/@mintlify/previewing": { - "version": "4.0.886", - "resolved": "https://registry.npmjs.org/@mintlify/previewing/-/previewing-4.0.886.tgz", - "integrity": "sha512-WScIiiw/6QYm7CMSdfKqpW9skKDl/rw5eAk1FOzBlUdOWpVZhQ77OuWyltWaD2MepWw4Cj7UBbcYyDvsvPtyEA==", + "version": "4.0.889", + "resolved": "https://registry.npmjs.org/@mintlify/previewing/-/previewing-4.0.889.tgz", + "integrity": "sha512-SvVM+lzkDK2LM6G/d5mKMpIA1Kf3nlLzRhwJLDLdI01+Hq2uzKmyQvmWGWc4vqTTIZwTSNiYumbEzaMbzJx9uA==", "license": "Elastic-2.0", "dependencies": { - "@mintlify/common": "1.0.694", - "@mintlify/prebuild": "1.0.830", - "@mintlify/validation": "0.1.574", + "@mintlify/common": "1.0.697", + "@mintlify/prebuild": "1.0.833", + "@mintlify/validation": "0.1.576", "better-opn": "3.0.2", "chalk": "5.2.0", "chokidar": "3.5.3", @@ -2433,13 +2433,13 @@ } }, "node_modules/@mintlify/validation": { - "version": "0.1.574", - "resolved": "https://registry.npmjs.org/@mintlify/validation/-/validation-0.1.574.tgz", - "integrity": "sha512-yqBiZJmP+7iHPiJ2h40MWO94f02Ajo/7PDve+V+mrWcw6OwetJErUdlYFsveiiPYk66HIM6BvFQBsQZt1NfSqQ==", + "version": "0.1.576", + "resolved": 
"https://registry.npmjs.org/@mintlify/validation/-/validation-0.1.576.tgz", + "integrity": "sha512-w3QWe2X2gj6oqA0jCTQialxIQz/ki+6ud0V0Y+gKAlLH5UsoIqBFCrXjFYZZLkBt/Md6wnNmADH71gjLh4481w==", "license": "Elastic-2.0", "dependencies": { "@mintlify/mdx": "^3.0.4", - "@mintlify/models": "0.0.260", + "@mintlify/models": "0.0.262", "arktype": "2.1.27", "js-yaml": "4.1.0", "lcm": "0.0.3", @@ -3144,12 +3144,12 @@ "license": "MIT" }, "node_modules/@shikijs/core": { - "version": "3.21.0", - "resolved": "https://registry.npmjs.org/@shikijs/core/-/core-3.21.0.tgz", - "integrity": "sha512-AXSQu/2n1UIQekY8euBJlvFYZIw0PHY63jUzGbrOma4wPxzznJXTXkri+QcHeBNaFxiiOljKxxJkVSoB3PjbyA==", + "version": "3.22.0", + "resolved": "https://registry.npmjs.org/@shikijs/core/-/core-3.22.0.tgz", + "integrity": "sha512-iAlTtSDDbJiRpvgL5ugKEATDtHdUVkqgHDm/gbD2ZS9c88mx7G1zSYjjOxp5Qa0eaW0MAQosFRmJSk354PRoQA==", "license": "MIT", "dependencies": { - "@shikijs/types": "3.21.0", + "@shikijs/types": "3.22.0", "@shikijs/vscode-textmate": "^10.0.2", "@types/hast": "^3.0.4", "hast-util-to-html": "^9.0.5" @@ -3179,62 +3179,62 @@ } }, "node_modules/@shikijs/engine-javascript": { - "version": "3.21.0", - "resolved": "https://registry.npmjs.org/@shikijs/engine-javascript/-/engine-javascript-3.21.0.tgz", - "integrity": "sha512-ATwv86xlbmfD9n9gKRiwuPpWgPENAWCLwYCGz9ugTJlsO2kOzhOkvoyV/UD+tJ0uT7YRyD530x6ugNSffmvIiQ==", + "version": "3.22.0", + "resolved": "https://registry.npmjs.org/@shikijs/engine-javascript/-/engine-javascript-3.22.0.tgz", + "integrity": "sha512-jdKhfgW9CRtj3Tor0L7+yPwdG3CgP7W+ZEqSsojrMzCjD1e0IxIbwUMDDpYlVBlC08TACg4puwFGkZfLS+56Tw==", "license": "MIT", "dependencies": { - "@shikijs/types": "3.21.0", + "@shikijs/types": "3.22.0", "@shikijs/vscode-textmate": "^10.0.2", "oniguruma-to-es": "^4.3.4" } }, "node_modules/@shikijs/engine-oniguruma": { - "version": "3.21.0", - "resolved": "https://registry.npmjs.org/@shikijs/engine-oniguruma/-/engine-oniguruma-3.21.0.tgz", - "integrity": "sha512-OYknTCct6qiwpQDqDdf3iedRdzj6hFlOPv5hMvI+hkWfCKs5mlJ4TXziBG9nyabLwGulrUjHiCq3xCspSzErYQ==", + "version": "3.22.0", + "resolved": "https://registry.npmjs.org/@shikijs/engine-oniguruma/-/engine-oniguruma-3.22.0.tgz", + "integrity": "sha512-DyXsOG0vGtNtl7ygvabHd7Mt5EY8gCNqR9Y7Lpbbd/PbJvgWrqaKzH1JW6H6qFkuUa8aCxoiYVv8/YfFljiQxA==", "license": "MIT", "dependencies": { - "@shikijs/types": "3.21.0", + "@shikijs/types": "3.22.0", "@shikijs/vscode-textmate": "^10.0.2" } }, "node_modules/@shikijs/langs": { - "version": "3.21.0", - "resolved": "https://registry.npmjs.org/@shikijs/langs/-/langs-3.21.0.tgz", - "integrity": "sha512-g6mn5m+Y6GBJ4wxmBYqalK9Sp0CFkUqfNzUy2pJglUginz6ZpWbaWjDB4fbQ/8SHzFjYbtU6Ddlp1pc+PPNDVA==", + "version": "3.22.0", + "resolved": "https://registry.npmjs.org/@shikijs/langs/-/langs-3.22.0.tgz", + "integrity": "sha512-x/42TfhWmp6H00T6uwVrdTJGKgNdFbrEdhaDwSR5fd5zhQ1Q46bHq9EO61SCEWJR0HY7z2HNDMaBZp8JRmKiIA==", "license": "MIT", "dependencies": { - "@shikijs/types": "3.21.0" + "@shikijs/types": "3.22.0" } }, "node_modules/@shikijs/themes": { - "version": "3.21.0", - "resolved": "https://registry.npmjs.org/@shikijs/themes/-/themes-3.21.0.tgz", - "integrity": "sha512-BAE4cr9EDiZyYzwIHEk7JTBJ9CzlPuM4PchfcA5ao1dWXb25nv6hYsoDiBq2aZK9E3dlt3WB78uI96UESD+8Mw==", + "version": "3.22.0", + "resolved": "https://registry.npmjs.org/@shikijs/themes/-/themes-3.22.0.tgz", + "integrity": "sha512-o+tlOKqsr6FE4+mYJG08tfCFDS+3CG20HbldXeVoyP+cYSUxDhrFf3GPjE60U55iOkkjbpY2uC3It/eeja35/g==", "license": "MIT", "dependencies": { - "@shikijs/types": "3.21.0" + 
"@shikijs/types": "3.22.0" } }, "node_modules/@shikijs/transformers": { - "version": "3.21.0", - "resolved": "https://registry.npmjs.org/@shikijs/transformers/-/transformers-3.21.0.tgz", - "integrity": "sha512-CZwvCWWIiRRiFk9/JKzdEooakAP8mQDtBOQ1TKiCaS2E1bYtyBCOkUzS8akO34/7ufICQ29oeSfkb3tT5KtrhA==", + "version": "3.22.0", + "resolved": "https://registry.npmjs.org/@shikijs/transformers/-/transformers-3.22.0.tgz", + "integrity": "sha512-E7eRV7mwDBjueLF6852n2oYeJYxBq3NSsDk+uyruYAXONv4U8holGmIrT+mPRJQ1J1SNOH6L8G19KRzmBawrFw==", "license": "MIT", "dependencies": { - "@shikijs/core": "3.21.0", - "@shikijs/types": "3.21.0" + "@shikijs/core": "3.22.0", + "@shikijs/types": "3.22.0" } }, "node_modules/@shikijs/twoslash": { - "version": "3.21.0", - "resolved": "https://registry.npmjs.org/@shikijs/twoslash/-/twoslash-3.21.0.tgz", - "integrity": "sha512-iH360udAYON2JwfIldoCiMZr9MljuQA5QRBivKLpEuEpmVCSwrR+0WTQ0eS1ptgGBdH9weFiIsA5wJDzsEzTYg==", + "version": "3.22.0", + "resolved": "https://registry.npmjs.org/@shikijs/twoslash/-/twoslash-3.22.0.tgz", + "integrity": "sha512-GO27UPN+kegOMQvC+4XcLt0Mttyg+n16XKjmoKjdaNZoW+sOJV7FLdv2QKauqUDws6nE3EQPD+TFHEdyyoUBDw==", "license": "MIT", "dependencies": { - "@shikijs/core": "3.21.0", - "@shikijs/types": "3.21.0", + "@shikijs/core": "3.22.0", + "@shikijs/types": "3.22.0", "twoslash": "^0.3.6" }, "peerDependencies": { @@ -3242,9 +3242,9 @@ } }, "node_modules/@shikijs/types": { - "version": "3.21.0", - "resolved": "https://registry.npmjs.org/@shikijs/types/-/types-3.21.0.tgz", - "integrity": "sha512-zGrWOxZ0/+0ovPY7PvBU2gIS9tmhSUUt30jAcNV0Bq0gb2S98gwfjIs1vxlmH5zM7/4YxLamT6ChlqqAJmPPjA==", + "version": "3.22.0", + "resolved": "https://registry.npmjs.org/@shikijs/types/-/types-3.22.0.tgz", + "integrity": "sha512-491iAekgKDBFE67z70Ok5a8KBMsQ2IJwOWw3us/7ffQkIBCyOQfm/aNwVMBUriP02QshIfgHCBSIYAl3u2eWjg==", "license": "MIT", "dependencies": { "@shikijs/vscode-textmate": "^10.0.2", @@ -3798,9 +3798,9 @@ } }, "node_modules/@types/node": { - "version": "25.1.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-25.1.0.tgz", - "integrity": "sha512-t7frlewr6+cbx+9Ohpl0NOTKXZNV9xHRmNOvql47BFJKcEG1CxtxlPEEe+gR9uhVWM4DwhnvTF110mIL4yP9RA==", + "version": "25.2.0", + "resolved": "https://registry.npmjs.org/@types/node/-/node-25.2.0.tgz", + "integrity": "sha512-DZ8VwRFUNzuqJ5khrvwMXHmvPe+zGayJhr2CDNiKB1WBE1ST8Djl00D0IC4vvNmHMdj6DlbYRIaFE7WHjlDl5w==", "license": "MIT", "peer": true, "dependencies": { @@ -9557,12 +9557,12 @@ } }, "node_modules/mintlify": { - "version": "4.2.310", - "resolved": "https://registry.npmjs.org/mintlify/-/mintlify-4.2.310.tgz", - "integrity": "sha512-1FBt5x1BSk0z+Shk4owGfDpEZhXsN68me1qNSZ8F6LyEXFuflADNGpQUYX36GGeGrbZoDbY76WLtiQLv7XPmtg==", + "version": "4.2.314", + "resolved": "https://registry.npmjs.org/mintlify/-/mintlify-4.2.314.tgz", + "integrity": "sha512-G7KSCSkPyNq377ooSSQomggCDWsDoEeG+SyB1K2eKW5sH0SnSKyaCdPYxMLJF05+lGgYGSmPJ3mGqgqQC64/Rg==", "license": "Elastic-2.0", "dependencies": { - "@mintlify/cli": "4.0.914" + "@mintlify/cli": "4.0.918" }, "bin": { "mint": "index.js", @@ -11528,17 +11528,17 @@ } }, "node_modules/shiki": { - "version": "3.21.0", - "resolved": "https://registry.npmjs.org/shiki/-/shiki-3.21.0.tgz", - "integrity": "sha512-N65B/3bqL/TI2crrXr+4UivctrAGEjmsib5rPMMPpFp1xAx/w03v8WZ9RDDFYteXoEgY7qZ4HGgl5KBIu1153w==", + "version": "3.22.0", + "resolved": "https://registry.npmjs.org/shiki/-/shiki-3.22.0.tgz", + "integrity": "sha512-LBnhsoYEe0Eou4e1VgJACes+O6S6QC0w71fCSp5Oya79inkwkm15gQ1UF6VtQ8j/taMDh79hAB49WUk8ALQW3g==", "license": 
"MIT", "dependencies": { - "@shikijs/core": "3.21.0", - "@shikijs/engine-javascript": "3.21.0", - "@shikijs/engine-oniguruma": "3.21.0", - "@shikijs/langs": "3.21.0", - "@shikijs/themes": "3.21.0", - "@shikijs/types": "3.21.0", + "@shikijs/core": "3.22.0", + "@shikijs/engine-javascript": "3.22.0", + "@shikijs/engine-oniguruma": "3.22.0", + "@shikijs/langs": "3.22.0", + "@shikijs/themes": "3.22.0", + "@shikijs/types": "3.22.0", "@shikijs/vscode-textmate": "^10.0.2", "@types/hast": "^3.0.4" } diff --git a/package.json b/package.json index b42ef1b..45ccb81 100644 --- a/package.json +++ b/package.json @@ -4,6 +4,6 @@ "links": "mintlify broken-links" }, "dependencies": { - "mintlify": "^4.2.310" + "mintlify": "^4.2.314" } } diff --git a/sdk/go/send.mdx b/sdk/go/send.mdx index ea3198a..91f8b9f 100644 --- a/sdk/go/send.mdx +++ b/sdk/go/send.mdx @@ -43,6 +43,8 @@ When `input` is an `InputObject`, you have full control over the conversation: | `Tools` | `[]Tool` | Array of function tools available to the model | | `ToolChoice` | `any` | Controls which tool (if any) the model should call. Can be `string` (`"auto"`, `"none"`) or `map[string]interface{}`. See [Tools documentation](/sdk/go/tools) for details | | `Tags` | `[]string` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) | +| `EnableCompression` | `bool` | Enable token compression for this request. If `true`, the request will be compressed to the compression rate specified in the API key settings. If `false`, the request will not be compressed. | +| `CompressionRate` | `float64` | The compression rate to use for this request. If `EnableCompression` is `true`, this value will be used to compress the request. The value should be between 0.0 and 1.0. The default value is 0.75. | **Example with InputObject:** @@ -151,6 +153,7 @@ The `Send()` method returns `(SendResponse, error)`. On success, the `SendRespon | `Model` | `string` | Model identifier used for the completion | | `Choices` | `[]Choice` | Array of completion choices (typically one) | | `Usage` | `*Usage` | Token usage information (if provided by the API) | +| `Compression` | `*Compression` | Token compression metrics (if compression was applied) | ### Choice Object @@ -195,7 +198,7 @@ Token usage information (when available): | Property | Type | Description | |----------|------|-------------| -| `PromptTokens` | `int` | Number of tokens in the prompt | +| `PromptTokens` | `int` | Number of tokens in the prompt (after compression if applied) | | `CompletionTokens` | `int` | Number of tokens in the completion | | `TotalTokens` | `int` | Total tokens used (prompt + completion) | @@ -214,6 +217,41 @@ if response.Usage != nil { } ``` +### Compression Object + +Token compression metrics (when compression is applied): + +| Property | Type | Description | +|----------|------|-------------| +| `InputTokens` | `int` | Original number of input tokens before compression | +| `SavedTokens` | `int` | Number of tokens saved by compression | +| `Rate` | `float64` | Compression rate as a decimal (0-1). 
For example, `0.61` means 61% compression | + +**Example - Accessing Compression Metrics:** + +```go +response, err := client.Send("gpt-4o", edgee.InputObject{ + Messages: []edgee.Message{ + {Role: "user", Content: "Analyze this long document with lots of context..."}, + }, + EnableCompression: true, + CompressionRate: 0.8, +}) +if err != nil { + log.Fatal(err) +} + +if response.Compression != nil { + fmt.Printf("Original input tokens: %d\n", response.Compression.InputTokens) + fmt.Printf("Tokens saved: %d\n", response.Compression.SavedTokens) + fmt.Printf("Compression rate: %.1f%%\n", response.Compression.Rate * 100) +} +``` + + + The `Compression` object is only present when token compression is applied to the request. Simple queries may not trigger compression. + + ## Convenience Methods The `SendResponse` struct provides convenience methods for easier access: diff --git a/sdk/go/stream.mdx b/sdk/go/stream.mdx index e6d0690..61f07e1 100644 --- a/sdk/go/stream.mdx +++ b/sdk/go/stream.mdx @@ -57,6 +57,8 @@ When `input` is an `InputObject` or `map[string]interface{}`, you have full cont | `Tools` | `[]Tool` | Array of function tools available to the model | | `ToolChoice` | `any` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/go/tools) for details | | `Tags` | `[]string` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) | +| `EnableCompression` | `bool` | Enable token compression for this request. If `true`, the request will be compressed to the compression rate specified in the API key settings. If `false`, the request will not be compressed. | +| `CompressionRate` | `float64` | The compression rate to use for this request. If `EnableCompression` is `true`, this value will be used to compress the request. The value should be between 0.0 and 1.0. The default value is 0.75. | For details about `Message` type, see the [Send Method documentation](/sdk/go/send#message-object). For details about `Tool` and `ToolChoice` types, see the [Tools documentation](/sdk/go/tools). @@ -110,6 +112,7 @@ Each chunk received from the channel has the following structure: | `Created` | `int64` | Unix timestamp of when the chunk was created | | `Model` | `string` | Model identifier used for the completion | | `Choices` | `[]StreamChoice` | Array of streaming choices (typically one) | +| `Compression` | `*Compression` | Token compression metrics (if compression was applied) | ### StreamChoice Object diff --git a/sdk/index.mdx b/sdk/index.mdx index 60e4262..f8d71d6 100644 --- a/sdk/index.mdx +++ b/sdk/index.mdx @@ -4,7 +4,7 @@ description: Choose your SDK and start using Edgee. icon: boxes --- -Edgee provides official SDKs for TypeScript, Python, Go, and Rust. All SDKs offer a consistent, type-safe interface to interact with the Edgee AI Gateway, supporting OpenAI-compatible chat completions and function calling. +Edgee provides official SDKs for TypeScript, Python, Go, and Rust. All SDKs offer a consistent, type-safe interface to interact with the Edgee AI Gateway, supporting OpenAI-compatible chat completions and function calling. All SDKs include built-in support for **token compression** and **cost tracking**, automatically reducing your LLM spend by up to 50%. ## Quick Start @@ -27,7 +27,9 @@ Choose your language and get started in minutes: }); console.log(response.text); - // "The capital of France is Paris." 
+ if (response.compression) { + console.log(`Tokens saved: ${response.compression.saved_tokens}`); + } ``` @@ -47,7 +49,8 @@ Choose your language and get started in minutes: ) print(response.text) - # "The capital of France is Paris." + if response.compression: + print(f"Tokens saved: {response.compression.saved_tokens}") ``` @@ -74,7 +77,9 @@ Choose your language and get started in minutes: } fmt.Println(response.Text()) - // "The capital of France is Paris." + if response.Compression != nil { + fmt.Printf("Tokens saved: %d\n", response.Compression.SavedTokens) + } } ``` @@ -91,7 +96,9 @@ Choose your language and get started in minutes: let response = client.send("gpt-4o", "What is the capital of France?").await.unwrap(); println!("{}", response.text().unwrap_or("")); - // "The capital of France is Paris." + if let Some(compression) = &response.compression { + println!("Tokens saved: {}", compression.saved_tokens); + } ``` @@ -100,12 +107,14 @@ Choose your language and get started in minutes: All SDKs provide consistent functionality: +- **Token Compression**: Automatic prompt compression with savings reporting +- **Compression Metrics**: Real-time token savings and compression rate for every request - **OpenAI-compatible API**: Use familiar patterns across all languages - **Function Calling**: Full support for tool/function calling - **Type Safety**: Strong typing and autocomplete support - **Error Handling**: Comprehensive error handling and validation - **Environment Variables**: Support for `EDGEE_API_KEY` -- **Token Usage**: Access to prompt, completion, and total token counts +- **Token Usage**: Access to prompt, completion, saved tokens, and compression ratios To learn more about the SDKs, see the individual SDK pages: diff --git a/sdk/python/send.mdx b/sdk/python/send.mdx index 20c71cd..9e28595 100644 --- a/sdk/python/send.mdx +++ b/sdk/python/send.mdx @@ -42,6 +42,8 @@ When `input` is an `InputObject` or dictionary, you have full control over the c | `tools` | `list[dict] \| None` | Array of function tools available to the model | | `tool_choice` | `str \| dict \| None` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/python/tools) for details | | `tags` | `list[str] \| None` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) | +| `enable_compression` | `bool` | Enable token compression for this request. If `true`, the request will be compressed to the compression rate specified in the API key settings. If `false`, the request will not be compressed. | +| `compression_rate` | `float` | The compression rate to use for this request. If `enable_compression` is `true`, this value will be used to compress the request. The value should be between 0.0 and 1.0. The default value is 0.75. 
| **Example with Dictionary Input:** @@ -124,6 +126,7 @@ The `send()` method returns a `SendResponse` object when `stream=False` (default |----------|------|-------------| | `choices` | `list[Choice]` | Array of completion choices (typically one) | | `usage` | `Usage \| None` | Token usage information (if provided by the API) | +| `compression` | `Compression \| None` | Token compression metrics (if compression was applied) | ### Choice Object @@ -165,7 +168,7 @@ Token usage information (when available): | Property | Type | Description | |----------|------|-------------| -| `prompt_tokens` | `int` | Number of tokens in the prompt | +| `prompt_tokens` | `int` | Number of tokens in the prompt (after compression if applied) | | `completion_tokens` | `int` | Number of tokens in the completion | | `total_tokens` | `int` | Total tokens used (prompt + completion) | @@ -183,6 +186,40 @@ if response.usage: print(f"Total tokens: {response.usage.total_tokens}") ``` +### Compression Object + +Token compression metrics (when compression is applied): + +| Property | Type | Description | +|----------|------|-------------| +| `input_tokens` | `int` | Original number of input tokens before compression | +| `saved_tokens` | `int` | Number of tokens saved by compression | +| `rate` | `float` | Compression rate as a decimal (0-1). For example, `0.61` means 61% compression | + +**Example - Accessing Compression Metrics:** + +```python +response = edgee.send( + model="gpt-4o", + input={ + "messages": [ + {"role": "user", "content": "Analyze this long document with lots of context..."} + ], + "enable_compression": True, + "compression_rate": 0.8 + } +) + +if response.compression: + print(f"Original input tokens: {response.compression.input_tokens}") + print(f"Tokens saved: {response.compression.saved_tokens}") + print(f"Compression rate: {response.compression.rate * 100:.1f}%") +``` + + + The `compression` object is only present when token compression is applied to the request. Simple queries may not trigger compression. + + ## Convenience Properties The `SendResponse` class provides convenience properties for easier access: diff --git a/sdk/python/stream.mdx b/sdk/python/stream.mdx index 752b2b4..ec7f3d6 100644 --- a/sdk/python/stream.mdx +++ b/sdk/python/stream.mdx @@ -40,6 +40,8 @@ When `input` is an `InputObject` or dictionary, you have full control over the c | `tools` | `list[dict] \| None` | Array of function tools available to the model | | `tool_choice` | `str \| dict \| None` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/python/tools) for details | | `tags` | `list[str] \| None` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) | +| `enable_compression` | `bool` | Enable token compression for this request. If `true`, the request will be compressed to the compression rate specified in the API key settings. If `false`, the request will not be compressed. | +| `compression_rate` | `float` | The compression rate to use for this request. If `enable_compression` is `true`, this value will be used to compress the request. The value should be between 0.0 and 1.0. The default value is 0.75. | For details about `Message` type, see the [Send Method documentation](/sdk/python/send#message-object). For details about `Tool` and `ToolChoice` types, see the [Tools documentation](/sdk/python/tools). 
@@ -68,6 +70,7 @@ Each chunk yielded by the generator has the following structure: | Property | Type | Description | |----------|------|-------------| | `choices` | `list[StreamChoice]` | Array of streaming choices (typically one) | +| `compression` | `Compression \| None` | Token compression metrics (if compression was applied) | ### StreamChoice Object diff --git a/sdk/rust/send.mdx b/sdk/rust/send.mdx index 703c9e1..99f7ad4 100644 --- a/sdk/rust/send.mdx +++ b/sdk/rust/send.mdx @@ -57,6 +57,8 @@ When `input` is an `InputObject`, you have full control over the conversation: | `tools` | `Option>` | Array of function tools available to the model | | `tool_choice` | `Option` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/rust/tools) for details | | `tags` | `Option>` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) | +| `enable_compression` | `Option` | Enable token compression for this request (overrides console settings). If not set, uses the configuration from your API key or organization settings | +| `compression_rate` | `Option` | Target compression rate (0.0-1.0, default 0.75). Only used if compression is enabled. Higher values attempt more aggressive compression | **Example with InputObject:** @@ -142,6 +144,7 @@ The `send()` method returns a `Result`. On success, it contains: | `model` | `String` | Model identifier used for the completion | | `choices` | `Vec` | Array of completion choices (typically one) | | `usage` | `Option` | Token usage information (if provided by the API) | +| `compression` | `Option` | Token compression metrics (if compression was applied) | ### Choice Object @@ -181,7 +184,7 @@ Token usage information (when available): | Property | Type | Description | |----------|------|-------------| -| `prompt_tokens` | `u32` | Number of tokens in the prompt | +| `prompt_tokens` | `u32` | Number of tokens in the prompt (after compression if applied) | | `completion_tokens` | `u32` | Number of tokens in the completion | | `total_tokens` | `u32` | Total tokens used (prompt + completion) | @@ -197,6 +200,39 @@ if let Some(usage) = &response.usage { } ``` +### Compression Object + +Token compression metrics (when compression is applied): + +| Property | Type | Description | +|----------|------|-------------| +| `input_tokens` | `u32` | Original number of input tokens before compression | +| `saved_tokens` | `u32` | Number of tokens saved by compression | +| `rate` | `f64` | Compression rate as a decimal (0-1). For example, `0.61` means 61% compression | + +**Example - Accessing Compression Metrics:** + +```rust +let input = InputObject::new(vec![ + Message::user("Analyze this long document with lots of context...") +]) +.with_enable_compression(true) +.with_compression_rate(0.8); // Target 80% compression + +let response = client.send("gpt-4o", input).await?; +println!("{}", response.text().unwrap_or("")); + +if let Some(compression) = &response.compression { + println!("Original input tokens: {}", compression.input_tokens); + println!("Tokens saved: {}", compression.saved_tokens); + println!("Compression rate: {:.1}%", compression.rate * 100.0); +} +``` + + + The `compression` object is only present when token compression is applied to the request. Simple queries may not trigger compression. 
+ + ## Convenience Methods The `SendResponse` struct provides convenience methods for easier access: diff --git a/sdk/rust/stream.mdx b/sdk/rust/stream.mdx index f4a1ba1..4581761 100644 --- a/sdk/rust/stream.mdx +++ b/sdk/rust/stream.mdx @@ -55,6 +55,8 @@ When `input` is a `Vec` or `InputObject`, you have full control over th | `tools` | `Option>` | Array of function tools available to the model | | `tool_choice` | `Option` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/rust/tools) for details | | `tags` | `Option>` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) | +| `enable_compression` | `Option` | Enable token compression for this request (overrides console settings). If not set, uses the configuration from your API key or organization settings | +| `compression_rate` | `Option` | Target compression rate (0.0-1.0, default 0.75). Only used if compression is enabled. Higher values attempt more aggressive compression | For details about `Message` type, see the [Send Method documentation](/sdk/rust/send#message-object). For details about `Tool` and `ToolChoice` types, see the [Tools documentation](/sdk/rust/tools). @@ -96,6 +98,7 @@ Each chunk yielded by the stream has the following structure: | `created` | `u64` | Unix timestamp of when the chunk was created | | `model` | `String` | Model identifier used for the completion | | `choices` | `Vec` | Array of streaming choices (typically one) | +| `compression` | `Option` | Token compression metrics (if compression was applied) | ### StreamChoice Object diff --git a/sdk/typescript/send.mdx b/sdk/typescript/send.mdx index f3a04f8..40cdb90 100644 --- a/sdk/typescript/send.mdx +++ b/sdk/typescript/send.mdx @@ -45,6 +45,8 @@ When `input` is an `InputObject`, you have full control over the conversation: | `tools` | `Tool[]` | Array of function tools available to the model | | `tool_choice` | `ToolChoice` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/typescript/tools) for details | | `tags` | `string[]` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) | +| `enable_compression` | `boolean` | Enable token compression for this request. If `true`, the request will be compressed to the compression rate specified in the API key settings. If `false`, the request will not be compressed. | +| `compression_rate` | `number` | The compression rate to use for this request. If `enable_compression` is `true`, this value will be used to compress the request. The value should be between 0.0 and 1.0. The default value is 0.75.
| **Example with InputObject:** @@ -127,6 +129,7 @@ The `send()` method returns a `Promise` with the following structu |----------|------|-------------| | `choices` | `Choice[]` | Array of completion choices (typically one) | | `usage` | `Usage \| undefined` | Token usage information (if provided by the API) | +| `compression` | `Compression \| undefined` | Token compression metrics (if compression was applied) | ### Choice Object @@ -169,7 +172,7 @@ Token usage information (when available): | Property | Type | Description | |----------|------|-------------| -| `prompt_tokens` | `number` | Number of tokens in the prompt | +| `prompt_tokens` | `number` | Number of tokens in the prompt (after compression if applied) | | `completion_tokens` | `number` | Number of tokens in the completion | | `total_tokens` | `number` | Total tokens used (prompt + completion) | @@ -188,6 +191,41 @@ if (response.usage) { } } +### Compression Object + +Token compression metrics (when compression is applied): + +| Property | Type | Description | +|----------|------|-------------| +| `input_tokens` | `number` | Original number of input tokens before compression | +| `saved_tokens` | `number` | Number of tokens saved by compression | +| `rate` | `number` | Compression rate as a decimal (0-1). For example, `0.61` means 61% compression | + +**Example - Accessing Compression Metrics:** + +```typescript +const response = await edgee.send({ + model: 'gpt-4o', + input: { + messages: [ + { role: 'user', content: 'Analyze this long document with lots of context...' } + ], + enable_compression: true, + compression_rate: 0.8 + } +}); + +if (response.compression) { + console.log(`Original input tokens: ${response.compression.input_tokens}`); + console.log(`Tokens saved: ${response.compression.saved_tokens}`); + console.log(`Compression rate: ${(response.compression.rate * 100).toFixed(1)}%`); +} +``` + + + The `compression` object is only present when token compression is applied to the request. Simple queries may not trigger compression. + + ## Convenience Properties The `SendResponse` class provides convenience getters for easier access: diff --git a/sdk/typescript/stream.mdx b/sdk/typescript/stream.mdx index 8c925de..5361802 100644 --- a/sdk/typescript/stream.mdx +++ b/sdk/typescript/stream.mdx @@ -45,6 +45,8 @@ When `input` is an `InputObject`, you have full control over the conversation: | `tools` | `Tool[]` | Array of function tools available to the model | | `tool_choice` | `ToolChoice` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/typescript/tools) for details | | `tags` | `string[]` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) | +| `enable_compression` | `boolean` | Enable token compression for this request. If `true`, the request will be compressed to the compression rate specified in the API key settings. If `false`, the request will not be compressed. | +| `compression_rate` | `number` | The compression rate to use for this request. If `enable_compression` is `true`, this value will be used to compress the request. The value should be between 0.0 and 1.0. The default value is 0.75. | For details about `Message` type, see the [Send Method documentation](/sdk/typescript/send#message-object). For details about `Tool` and `ToolChoice` types, see the [Tools documentation](/sdk/typescript/tools). 
@@ -76,6 +78,7 @@ Each chunk yielded by the generator has the following structure: | Property | Type | Description | |----------|------|-------------| | `choices` | `StreamChoice[]` | Array of streaming choices (typically one) | +| `compression` | `Compression \| null` | Token compression metrics (if compression was applied) | ### StreamChoice Object