Hypothesize-Tech/costkatana-core

Cost Katana 🥷

Cut your AI costs in half. Without cutting corners.

Cost Katana is a drop-in SDK that wraps your AI calls with automatic cost tracking, smart caching, and optimization, all in one line of code.


🚀 Get Started in 60 Seconds

Step 1: Install

npm install cost-katana

Step 2: Make Your First AI Call

import { ai, OPENAI } from 'cost-katana';

const response = await ai(OPENAI.GPT_4, 'Explain quantum computing in one sentence');

console.log(response.text);   // "Quantum computing uses qubits to perform..."
console.log(response.cost);   // 0.0012
console.log(response.tokens); // 47

That's it. No configuration files. No complex setup. Just results.
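Under the hood, a per-call cost like the one above is just token counts multiplied by per-token prices. A minimal sketch of the arithmetic, with illustrative prices (real rates vary by model and change over time; this is not the SDK's pricing table):

```typescript
// Illustrative per-1K-token prices -- real rates differ per model and over time.
const PRICES: Record<string, { input: number; output: number }> = {
  'gpt-4': { input: 0.03, output: 0.06 },
  'gpt-3.5-turbo': { input: 0.0005, output: 0.0015 },
};

// Cost = input tokens and output tokens, each billed at their per-1K rate.
function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const p = PRICES[model];
  if (!p) throw new Error(`Unknown model: ${model}`);
  return (inputTokens / 1000) * p.input + (outputTokens / 1000) * p.output;
}

// e.g. 10 prompt tokens + 37 completion tokens on gpt-4
console.log(estimateCost('gpt-4', 10, 37).toFixed(4)); // → 0.0025
```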


🌍 Provider-Independent by Design

Cost Katana is completely provider-agnostic. Never lock yourself into a single vendor.

✅ Use Capability-Based Routing

import { ai, ModelCapability } from 'cost-katana';

// Automatically selects best model for each task
const code = await ai(ModelCapability.CODE_GENERATION, 'Write a React component');
const chat = await ai(ModelCapability.CONVERSATION, 'Hello!');
const vision = await ai(ModelCapability.VISION, 'Describe this image', { image });

✅ Optimize by Performance Characteristics

import { ai } from 'cost-katana';

// Fastest model available
const fast = await ai({ speed: 'fastest' }, prompt);

// Cheapest model available
const cheap = await ai({ cost: 'cheapest' }, prompt);

// Best quality model
const best = await ai({ quality: 'best' }, prompt);

// Balanced approach
const balanced = await ai({ speed: 'fast', cost: 'cheap' }, prompt);
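One way to picture this selection: score candidate models against the requested characteristics and pick the best match. The catalog and scoring below are illustrative assumptions, not the SDK's actual routing logic:

```typescript
// Hypothetical catalog: each model ranked per dimension (1 = best).
interface ModelInfo { id: string; speed: number; cost: number; quality: number }

const CATALOG: ModelInfo[] = [
  { id: 'gpt-4',         speed: 3, cost: 3, quality: 1 },
  { id: 'gpt-3.5-turbo', speed: 2, cost: 2, quality: 3 },
  { id: 'gemini-flash',  speed: 1, cost: 1, quality: 2 },
];

type Pref = Partial<{ speed: boolean; cost: boolean; quality: boolean }>;

// Sum the ranks of the requested dimensions; lowest total wins.
function pickModel(prefs: Pref): string {
  const dims = (Object.keys(prefs) as (keyof Pref)[]).filter(k => prefs[k]);
  const score = (m: ModelInfo) => dims.reduce((s, d) => s + m[d], 0);
  return CATALOG.reduce((a, b) => (score(b) < score(a) ? b : a)).id;
}

console.log(pickModel({ cost: true }));              // cheapest → gemini-flash
console.log(pickModel({ quality: true }));           // best → gpt-4
console.log(pickModel({ speed: true, cost: true })); // balanced → gemini-flash
```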

Benefits:

  • 🔄 Automatic Failover - Seamlessly switch providers if one goes down
  • 💰 Cost Optimization - Routes to the cheapest provider automatically
  • 🚀 Future-Proof - New providers added without code changes
  • 🔓 Zero Lock-In - Switch providers anytime, no refactoring needed

Read the full Provider-Agnostic Guide →


📖 Tutorial: Build a Cost-Aware Chatbot

Let's build something real. In this tutorial, you'll create a chatbot that:

  • ✅ Tracks every dollar spent
  • ✅ Caches repeated questions (saving 100% on duplicates)
  • ✅ Optimizes long responses (40-75% savings)

Part 1: Basic Chat Session

import { chat, OPENAI } from 'cost-katana';

// Create a persistent chat session
const session = chat(OPENAI.GPT_4);

// Send messages and track costs
await session.send('Hello! What can you help me with?');
await session.send('Tell me a programming joke');
await session.send('Now explain it');

// See exactly what you spent
console.log(`💰 Total cost: $${session.totalCost.toFixed(4)}`);
console.log(`📊 Messages: ${session.messages.length}`);
console.log(`🎯 Tokens used: ${session.totalTokens}`);

Part 2: Add Smart Caching

Cache identical questions to avoid paying twice:

import { ai, OPENAI } from 'cost-katana';

// First call - hits the API
const response1 = await ai(OPENAI.GPT_4, 'What is 2+2?', { cache: true });
console.log(`Cached: ${response1.cached}`);  // false
console.log(`Cost: $${response1.cost}`);     // $0.0008

// Second call - served from cache (FREE!)
const response2 = await ai(OPENAI.GPT_4, 'What is 2+2?', { cache: true });
console.log(`Cached: ${response2.cached}`);  // true
console.log(`Cost: $${response2.cost}`);     // $0.0000 🎉
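Conceptually, the cache keys on the (model, prompt) pair, so only byte-identical requests hit it. A minimal sketch of the idea (not the SDK's internals):

```typescript
type Result = { text: string; cost: number; cached: boolean };

const cache = new Map<string, { text: string; cost: number }>();

// Wrap any model call: identical (model, prompt) pairs are served from memory for free.
async function cachedCall(
  model: string,
  prompt: string,
  call: (m: string, p: string) => Promise<{ text: string; cost: number }>
): Promise<Result> {
  const key = `${model}\u0000${prompt}`; // NUL separator avoids key collisions
  const hit = cache.get(key);
  if (hit) return { text: hit.text, cost: 0, cached: true };
  const fresh = await call(model, prompt);
  cache.set(key, fresh);
  return { ...fresh, cached: false };
}
```

Note the flip side: a rephrased question ('what is 2 + 2?' vs 'What is 2+2?') is a different key and misses the cache.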

Part 3: Enable Cortex Optimization

For long-form content, Cortex compresses prompts intelligently:

import { ai, OPENAI } from 'cost-katana';

const response = await ai(
  OPENAI.GPT_4,
  'Write a comprehensive guide to machine learning for beginners',
  { 
    cortex: true,      // Enable 40-75% cost reduction
    maxTokens: 2000 
  }
);

console.log(`Optimized: ${response.optimized}`);
console.log(`Saved: $${response.savedAmount}`);

Part 4: Compare Models Side-by-Side

Find the best price-to-quality ratio for your use case:

import { ai, OPENAI, ANTHROPIC, GOOGLE } from 'cost-katana';

const prompt = 'Summarize the theory of relativity in 50 words';

const models = [
  { name: 'GPT-4', id: OPENAI.GPT_4 },
  { name: 'Claude 3.5 Sonnet', id: ANTHROPIC.CLAUDE_3_5_SONNET_20241022 },
  { name: 'Gemini 2.5 Pro', id: GOOGLE.GEMINI_2_5_PRO },
  { name: 'GPT-3.5 Turbo', id: OPENAI.GPT_3_5_TURBO }
];

console.log('📊 Model Cost Comparison\n');

for (const model of models) {
  const response = await ai(model.id, prompt);
  console.log(`${model.name.padEnd(20)} $${response.cost.toFixed(6)}`);
}

Sample Output:

📊 Model Cost Comparison

GPT-4                $0.001200
Claude 3.5 Sonnet    $0.000900
Gemini 2.5 Pro       $0.000150
GPT-3.5 Turbo        $0.000080

🎯 Type-Safe Model Selection

Stop guessing model names. Get autocomplete and catch typos at compile time:

import { OPENAI, ANTHROPIC, GOOGLE, AWS_BEDROCK, XAI, DEEPSEEK } from 'cost-katana';

// OpenAI
OPENAI.GPT_5
OPENAI.GPT_4
OPENAI.GPT_4O
OPENAI.GPT_3_5_TURBO
OPENAI.O1
OPENAI.O3

// Anthropic
ANTHROPIC.CLAUDE_SONNET_4_5
ANTHROPIC.CLAUDE_3_5_SONNET_20241022
ANTHROPIC.CLAUDE_3_5_HAIKU_20241022

// Google
GOOGLE.GEMINI_2_5_PRO
GOOGLE.GEMINI_2_5_FLASH
GOOGLE.GEMINI_1_5_PRO

// AWS Bedrock
AWS_BEDROCK.NOVA_PRO
AWS_BEDROCK.NOVA_LITE
AWS_BEDROCK.CLAUDE_SONNET_4_5

// Others
XAI.GROK_2_1212
DEEPSEEK.DEEPSEEK_CHAT

Why constants over strings?

Feature            String 'gpt-4'   Constant OPENAI.GPT_4
Autocomplete       ❌                ✅
Typo protection    ❌                ✅
Refactor safely    ❌                ✅
Self-documenting   ❌                ✅
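The constants are plain string values frozen behind a typed object, so a typo fails at compile time instead of at the API. A self-contained sketch of the pattern (the actual exported values may differ from these):

```typescript
// `as const` narrows each value to a literal type, enabling exact autocomplete.
const OPENAI = {
  GPT_4: 'gpt-4',
  GPT_3_5_TURBO: 'gpt-3.5-turbo',
} as const;

// Union of the literal values: 'gpt-4' | 'gpt-3.5-turbo'
type OpenAIModel = (typeof OPENAI)[keyof typeof OPENAI];

function callModel(model: OpenAIModel): string {
  return `calling ${model}`;
}

callModel(OPENAI.GPT_4);   // OK
// callModel('gpt-for');   // compile error: not assignable to OpenAIModel
```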

βš™οΈ Configuration

Environment Variables

# Recommended: Use Cost Katana API key for all features
COST_KATANA_API_KEY=dak_your_key_here

# Or use provider keys directly
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=...

# For AWS Bedrock
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1

Programmatic Configuration

import { configure } from 'cost-katana';

await configure({
  apiKey: 'dak_your_key',
  cortex: true,     // 40-75% cost savings
  cache: true,      // Smart caching
  firewall: true    // Block prompt injections
});

Request Options

const response = await ai(OPENAI.GPT_4, 'Your prompt', {
  temperature: 0.7,                        // Creativity (0-2)
  maxTokens: 500,                          // Response limit
  systemMessage: 'You are a helpful AI',   // System prompt
  cache: true,                             // Enable caching
  cortex: true,                            // Enable optimization
  retry: true                              // Auto-retry on failures
});

🔌 Framework Integration

Next.js App Router

// app/api/chat/route.ts
import { ai, OPENAI } from 'cost-katana';

export async function POST(request: Request) {
  const { prompt } = await request.json();
  const response = await ai(OPENAI.GPT_4, prompt);
  return Response.json(response);
}

Express.js

import express from 'express';
import { ai, OPENAI } from 'cost-katana';

const app = express();
app.use(express.json());

app.post('/api/chat', async (req, res) => {
  const response = await ai(OPENAI.GPT_4, req.body.prompt);
  res.json(response);
});

app.listen(3000);

Fastify

import fastify from 'fastify';
import { ai, OPENAI } from 'cost-katana';

const app = fastify();

app.post('/api/chat', async (request) => {
  const { prompt } = request.body as { prompt: string };
  return await ai(OPENAI.GPT_4, prompt);
});

app.listen({ port: 3000 });

NestJS

import { Controller, Post, Body } from '@nestjs/common';
import { ai, OPENAI } from 'cost-katana';

@Controller('api')
export class ChatController {
  @Post('chat')
  async chat(@Body() body: { prompt: string }) {
    return await ai(OPENAI.GPT_4, body.prompt);
  }
}

πŸ›‘οΈ Built-in Security

Firewall Protection

Block prompt injection attacks automatically:

import { configure, ai, OPENAI } from 'cost-katana';

await configure({ firewall: true });

try {
  await ai(OPENAI.GPT_4, 'Ignore all previous instructions and...');
} catch (error) {
  console.log('πŸ›‘οΈ Blocked:', error.message);
}

Protects against:

  • Prompt injection attacks
  • Jailbreak attempts
  • Data exfiltration
  • Malicious content generation
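Production firewalls combine ML classifiers with pattern rules; purely as an illustration of the blocking behavior above, a naive keyword screen might look like this (patterns here are examples, not the SDK's rule set):

```typescript
// Naive pattern screen -- illustrative only; real firewalls use trained classifiers.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore (all )?previous instructions/i,
  /disregard (your|the) system prompt/i,
  /reveal (your|the) hidden prompt/i,
];

// Throws before the prompt ever reaches a provider.
function screenPrompt(prompt: string): void {
  for (const p of INJECTION_PATTERNS) {
    if (p.test(prompt)) throw new Error(`Blocked: prompt matched ${p}`);
  }
}
```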

🔄 Auto-Failover

Never let provider outages break your app:

import { ai, OPENAI } from 'cost-katana';

// If OpenAI is down, automatically switches to Claude or Gemini
const response = await ai(OPENAI.GPT_4, 'Hello');

console.log(`Provider used: ${response.provider}`);
// Could be 'openai', 'anthropic', or 'google' depending on availability
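The failover pattern itself is simple: try providers in preference order and return the first success. A generic sketch (provider names and errors here are placeholders, not the SDK's internals):

```typescript
type Provider = { name: string; call: (prompt: string) => Promise<string> };

// Try each provider in order; fall through to the next on failure.
async function withFailover(providers: Provider[], prompt: string) {
  const errors: Error[] = [];
  for (const p of providers) {
    try {
      return { provider: p.name, text: await p.call(prompt) };
    } catch (e) {
      errors.push(e as Error); // remember why this provider failed
    }
  }
  throw new Error('All providers failed: ' + errors.map(e => e.message).join('; '));
}
```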

📊 Comprehensive Usage Tracking & Analytics

Real-time Performance Monitoring

Cost Katana can track every request comprehensively, capturing network performance, client environment, and optimization opportunities:

import { AICostTracker, OPENAI } from 'cost-katana';

const tracker = new AICostTracker({
  apiKey: process.env.COST_KATANA_API_KEY,
  // Enable comprehensive tracking
  comprehensiveTracking: true,
  // Optional: configure tracking endpoints
  trackingEndpoint: 'https://api.costkatana.com/usage/track-comprehensive'
});

const response = await tracker.chat(OPENAI.GPT_4, 'Explain quantum computing');

console.log('Response:', response.text);
console.log('Cost:', response.cost);
console.log('Tokens:', response.tokens);
console.log('Response Time:', response.responseTime);

// Comprehensive tracking data is automatically sent to your dashboard
// Including network metrics, client environment, and optimization suggestions

View Analytics in Dashboard

Once tracking is enabled, you can view detailed analytics at your dashboard:

  • Network Performance: DNS lookup time, TCP connection time, total response time
  • Client Environment: User agent, platform, IP geolocation
  • Request/Response Data: Full request and response payloads (sanitized)
  • Optimization Opportunities: AI-powered suggestions to reduce costs
  • Performance Metrics: Real-time monitoring with anomaly detection

Manual Usage Tracking

For custom implementations or additional tracking:

import { AICostTracker } from 'cost-katana';

const tracker = new AICostTracker({
  apiKey: process.env.COST_KATANA_API_KEY
});

// Manually track usage with additional metadata
await tracker.trackUsage({
  model: 'gpt-4',
  provider: 'openai',
  prompt: 'Hello, world!',
  completion: 'Hello! How can I help you today?',
  promptTokens: 3,
  completionTokens: 9,
  totalTokens: 12,
  cost: 0.00036,
  responseTime: 850,
  userId: 'user_123',
  sessionId: 'session_abc',
  tags: ['chat', 'greeting'],
  // Additional metadata for comprehensive tracking
  requestMetadata: {
    userAgent: typeof navigator !== 'undefined' ? navigator.userAgent : undefined, // browser only
    clientIP: await fetch('https://api.ipify.org').then(r => r.text()),
    feature: 'chat-interface'
  }
});

Session Replay & Distributed Tracing

import { AICostTracker } from 'cost-katana';

const tracker = new AICostTracker({
  apiKey: process.env.COST_KATANA_API_KEY,
  sessionReplay: true,
  distributedTracing: true
});

// Start a traced session
const sessionId = tracker.startSession({
  userId: 'user_123',
  feature: 'customer-support',
  metadata: {
    source: 'web-app',
    version: '1.2.3'
  }
});

// All requests in this session will be automatically traced
const response = await tracker.chat(OPENAI.GPT_4, 'How can I cancel my subscription?', {
  sessionId,
  tags: ['support', 'billing']
});

// End session and get analytics
const sessionStats = await tracker.endSession(sessionId);
console.log('Session cost:', sessionStats.totalCost);
console.log('Session duration:', sessionStats.duration);
console.log('Requests made:', sessionStats.requestCount);

💡 Cost Optimization Cheatsheet

Strategy                  Savings         When to Use
Use GPT-3.5 over GPT-4    90%             Simple tasks, translations
Enable caching            100% on hits    Repeated queries, FAQs
Enable Cortex             40-75%          Long-form content
Batch in sessions         10-20%          Related queries
Use Gemini Flash          95% vs GPT-4    High-volume, cost-sensitive

Quick Wins

// ❌ Expensive: Using GPT-4 for everything
await ai(OPENAI.GPT_4, 'What is 2+2?');  // $0.001

// ✅ Smart: Match model to task
await ai(OPENAI.GPT_3_5_TURBO, 'What is 2+2?');  // $0.0001

// ✅ Smarter: Cache common queries
await ai(OPENAI.GPT_3_5_TURBO, 'What is 2+2?', { cache: true });  // $0 on repeat

// ✅ Smartest: Cortex for long content
await ai(OPENAI.GPT_4, 'Write a 2000-word essay', { cortex: true });  // 40-75% off

🔧 Error Handling

import { ai, OPENAI } from 'cost-katana';

try {
  const response = await ai(OPENAI.GPT_4, 'Hello');
  console.log(response.text);
} catch (error) {
  switch (error.code) {
    case 'NO_API_KEY':
      console.log('Set COST_KATANA_API_KEY or OPENAI_API_KEY');
      break;
    case 'RATE_LIMIT':
      console.log('Rate limited. Retrying...');
      break;
    case 'INVALID_MODEL':
      console.log('Model not found. Available:', error.availableModels);
      break;
    default:
      console.log('Error:', error.message);
  }
}
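For transient failures like RATE_LIMIT, the usual remedy is retry with exponential backoff. A minimal generic sketch (attempt counts and delays are arbitrary choices, not SDK defaults):

```typescript
// Retry an async operation with exponentially growing delays: base, 2x base, 4x base...
async function withRetry<T>(fn: () => Promise<T>, attempts = 3, baseMs = 100): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (e) {
      lastErr = e;
      // Sleep before the next attempt, doubling the delay each time.
      if (i < attempts - 1) await new Promise(r => setTimeout(r, baseMs * 2 ** i));
    }
  }
  throw lastErr;
}
```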

📚 More Examples

Explore 45+ complete examples in our examples repository:

🔗 github.com/Hypothesize-Tech/costkatana-examples

Category        Examples
Cost Tracking   Basic tracking, budgets, alerts
Gateway         Routing, load balancing, failover
Optimization    Cortex, caching, compression
Observability   OpenTelemetry, tracing, metrics
Security        Firewall, rate limiting, moderation
Workflows       Multi-step AI orchestration
Frameworks      Express, Next.js, Fastify, NestJS, FastAPI

🔄 Migration Guides

From OpenAI SDK

// Before
import OpenAI from 'openai';
const openai = new OpenAI({ apiKey: 'sk-...' });
const completion = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello' }]
});
console.log(completion.choices[0].message.content);

// After
import { ai, OPENAI } from 'cost-katana';
const response = await ai(OPENAI.GPT_4, 'Hello');
console.log(response.text);
console.log(`Cost: $${response.cost}`);  // Bonus: cost tracking!

From Anthropic SDK

// Before
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({ apiKey: 'sk-ant-...' });
const message = await anthropic.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello' }]
});

// After
import { ai, ANTHROPIC } from 'cost-katana';
const response = await ai(ANTHROPIC.CLAUDE_3_5_SONNET_20241022, 'Hello');

From LangChain

// Before
import { ChatOpenAI } from '@langchain/openai';
const model = new ChatOpenAI({ model: 'gpt-4' });
const response = await model.invoke('Hello');

// After
import { ai, OPENAI } from 'cost-katana';
const response = await ai(OPENAI.GPT_4, 'Hello');

🤝 Contributing

We welcome contributions! See our Contributing Guide.

git clone https://github.com/Hypothesize-Tech/costkatana-core.git
cd costkatana-core
npm install

npm run lint        # Check code style
npm run lint:fix    # Auto-fix issues
npm run format      # Format code
npm test            # Run tests
npm run build       # Build

📞 Support

Channel         Link
Dashboard       costkatana.com
Documentation   docs.costkatana.com
GitHub          github.com/Hypothesize-Tech
Discord         discord.gg/D8nDArmKbY
Email           support@costkatana.com

📄 License

MIT © Cost Katana


Start cutting AI costs today 🥷

npm install cost-katana
import { ai, OPENAI } from 'cost-katana';
await ai(OPENAI.GPT_4, 'Hello, world!');
