Cost Katana Backend 🥷

AI-powered cost intelligence that learns, predicts, and optimizes.

The backend powering Cost Katana—featuring intelligent monitoring, personalized coaching, predictive analytics, and seamless integrations.

🚀 Get Started in 5 Minutes

Step 1: Clone & Install

git clone https://github.com/cost-katana/costkatana-backend.git
cd costkatana-backend
npm install

Step 2: Configure Environment

cp env.example .env

Edit .env with your credentials:

# Database
MONGODB_URI=mongodb://localhost:27017/cost-katana

# Security
JWT_SECRET=your-super-secure-jwt-secret-key-min-32-chars
ENCRYPTION_KEY=your-32-char-encryption-key-here

# AI Providers (at least one required)
OPENAI_API_KEY=sk-...           # For GPT models
GEMINI_API_KEY=...              # For Gemini models
AWS_ACCESS_KEY_ID=...           # For Claude via Bedrock
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1

Step 3: Run

npm run build
npm run dev

Server starts at http://localhost:8000 ✅

📚 Documentation

Prompt Caching Guide - Complete guide to true KV-pair caching
API Documentation - Full API reference
Cost Performance Tradeoffs - Architecture decisions

📖 What Makes This Special

🧠 AI Intelligence Engine

Feature	Description
User Profiling	Automatically learns usage patterns
Predictive Analytics	Forecasts spending and limit hits
Personalized Coaching	Tailored recommendations per user
Pattern Recognition	Detects inefficient usage automatically

🔗 Seamless Integrations

Integration	Description
ChatGPT Custom GPT	Direct tips inside ChatGPT
Magic Link Onboarding	1-click zero-friction setup
Multi-Provider Support	OpenAI, Google Gemini, and Claude or Nova via AWS Bedrock
Smart Routing	Auto-selects optimal provider

⚡ True Prompt Caching

Feature	Description
KV-Pair Caching	Cache LLM internal computations, not just responses
30-70% Cost Savings	Automatic optimization for repeated static content
Multi-Provider Support	Anthropic, OpenAI, Google Gemini implementations
Smart Structure Optimization	Automatically reorder prompts for maximum cache hits
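
To illustrate the idea behind Smart Structure Optimization: provider-side prompt caches generally match on a shared prompt prefix, so static content placed first is more likely to be reused across requests. A minimal sketch, assuming a hypothetical `PromptSegment` shape and `orderForCaching` helper (neither is taken from the Cost Katana codebase):

```typescript
// Hypothetical segment shape: `isStatic` marks content that repeats across requests.
interface PromptSegment {
  text: string;
  isStatic: boolean;
}

// Stable partition: static segments first, dynamic segments after,
// with the original order preserved inside each group.
function orderForCaching(segments: PromptSegment[]): PromptSegment[] {
  const staticParts = segments.filter((s) => s.isStatic);
  const dynamicParts = segments.filter((s) => !s.isStatic);
  return staticParts.concat(dynamicParts);
}
```

Reordering is only safe when segment order does not change the prompt's meaning; the actual optimizer may apply further rules.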

📊 Real-Time Analytics

Feature	Description
Live Dashboards	Usage monitoring with AI insights
Cost Reports	AI-generated savings opportunities
Efficiency Scoring	AI-calculated scores with improvements
Cache Performance	Real-time prompt caching metrics and savings

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│                   Cost Katana Backend                        │
├─────────────────┬─────────────────┬─────────────────────────┤
│   🤖 AI Layer   │  🌐 API Layer   │     📊 Data Layer       │
├─────────────────┼─────────────────┼─────────────────────────┤
│ • AWS Bedrock   │ • Express.js    │ • MongoDB               │
│ • OpenAI SDK    │ • REST APIs     │ • Redis Cache           │
│ • Gemini SDK    │ • WebSockets    │ • Usage Analytics       │
│ • Forecasting   │ • Rate Limiting │ • User Profiles         │
└─────────────────┴─────────────────┴─────────────────────────┘

🔑 Environment Setup

Required Variables

# Database
MONGODB_URI=mongodb://localhost:27017/cost-katana

# Security
JWT_SECRET=your-super-secure-jwt-secret-key-min-32-chars
ENCRYPTION_KEY=your-32-char-encryption-key-here

# Frontend
FRONTEND_URL=http://localhost:3000

AI Provider Keys

⚠️ You must provide your own API keys. Cost Katana does not include OpenAI or Google keys.

# OpenAI (for GPT models)
OPENAI_API_KEY=sk-...

# Google (for Gemini models)
GEMINI_API_KEY=...

# AWS Bedrock (for Claude, Nova)
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
AWS_BEDROCK_MODEL_ID=anthropic.claude-3-sonnet-20240229-v1:0

Email (Optional)

SMTP_HOST=smtp.gmail.com
SMTP_PORT=587
SMTP_USER=your-email@gmail.com
SMTP_PASS=your-app-specific-password
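
The requirements above (core variables plus at least one AI provider) can be checked once at startup. The following is a sketch only: `validateEnv` is a hypothetical helper, not part of the codebase, and the 32-character minimum for `JWT_SECRET` is inferred from the placeholder text rather than from a documented rule.

```typescript
// Hypothetical startup check; names are illustrative.
type Env = Record<string, string | undefined>;

function validateEnv(env: Env): string[] {
  const problems: string[] = [];

  // Variables listed under "Required Variables".
  for (const key of ['MONGODB_URI', 'JWT_SECRET', 'ENCRYPTION_KEY', 'FRONTEND_URL']) {
    if (!env[key]) problems.push(`${key} is missing`);
  }

  // Assumption: the "min-32-chars" placeholder implies a 32-character minimum.
  const jwtSecret = env['JWT_SECRET'];
  if (jwtSecret && jwtSecret.length < 32) {
    problems.push('JWT_SECRET is shorter than 32 characters');
  }

  // At least one AI provider must be configured.
  const hasOpenAI = Boolean(env['OPENAI_API_KEY']);
  const hasGemini = Boolean(env['GEMINI_API_KEY']);
  const hasBedrock = Boolean(
    env['AWS_ACCESS_KEY_ID'] && env['AWS_SECRET_ACCESS_KEY'] && env['AWS_REGION'],
  );
  if (!hasOpenAI && !hasGemini && !hasBedrock) {
    problems.push('No AI provider configured (OpenAI, Gemini, or AWS Bedrock)');
  }

  return problems;
}
```

In a Node process this would typically be called as `validateEnv(process.env)` before the server starts, exiting if the returned list is non-empty.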

🛣️ API Endpoints

AI Intelligence

POST /api/monitoring/analyze           # Trigger AI analysis
GET  /api/monitoring/status            # Usage status & predictions
GET  /api/monitoring/recommendations   # Personalized recommendations

ChatGPT Integration

POST /api/chatgpt/action               # Custom GPT actions
GET  /api/chatgpt/health               # Health check

Magic Link Onboarding

POST /api/onboarding/generate-magic-link   # Generate link
GET  /api/onboarding/verify/:token         # Verify & complete

Core Features

POST /api/usage/track                  # Track usage with AI analysis
GET  /api/projects                     # Projects with AI insights
GET  /api/analytics/intelligent        # AI-powered analytics

Prompt Caching

# Automatic prompt caching (enabled by default for supported providers)
POST /v1/chat/completions               # Claude, GPT, Gemini with KV caching
POST /v1/completions                   # OpenAI completions with caching

# Headers for prompt caching control
CostKatana-Prompt-Caching: true        # Enable prompt caching
CostKatana-Prompt-Caching: false       # Disable prompt caching

# Response headers with cache metrics
CostKatana-Prompt-Caching-Enabled: true
CostKatana-Prompt-Caching-Type: explicit
CostKatana-Prompt-Caching-Estimated-Savings: 0.0027
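
As a client-side illustration of the headers above, here is a small sketch that builds the control header and reads the cache metrics back from a response. The header names come from this README; the helper names and the `CacheMetrics` shape are hypothetical.

```typescript
// Hypothetical shape for the metrics carried in the response headers.
interface CacheMetrics {
  enabled: boolean;
  type: string | null;
  estimatedSavings: number | null;
}

// Request header that turns prompt caching on or off for one call.
function cachingRequestHeaders(enabled: boolean): Record<string, string> {
  return { 'CostKatana-Prompt-Caching': String(enabled) };
}

// Reads cache metrics from response headers; names are matched case-insensitively.
function parseCacheMetrics(headers: Record<string, string>): CacheMetrics {
  const lower: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    lower[name.toLowerCase()] = headers[name];
  }
  const savings = Number(lower['costkatana-prompt-caching-estimated-savings']);
  return {
    enabled: lower['costkatana-prompt-caching-enabled'] === 'true',
    type: lower['costkatana-prompt-caching-type'] ?? null,
    estimatedSavings: Number.isFinite(savings) ? savings : null,
  };
}
```

A caller would spread `cachingRequestHeaders(true)` into the headers of a `POST /v1/chat/completions` request and pass the response headers, converted to a plain object, to `parseCacheMetrics`.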

Telemetry & Observability

GET  /api/telemetry                    # Query telemetry data
GET  /api/telemetry/traces/:id         # Full trace details
GET  /api/telemetry/metrics            # Aggregated metrics
GET  /api/telemetry/dashboard          # Dashboard data
GET  /api/telemetry/dependencies       # Service dependency map
GET  /api/telemetry/health             # System health

📊 OpenTelemetry Integration

Quick Setup

# Install collector
npm run otel:install

# Configure in .env
OTEL_SERVICE_NAME=cost-katana-api
OTLP_HTTP_TRACES_URL=http://localhost:4318/v1/traces
OTLP_HTTP_METRICS_URL=http://localhost:4318/v1/metrics

# Start collector
npm run otel:run

# Verify
npm run telemetry:verify

Vendor Integrations

Grafana Cloud:

OTLP_HTTP_TRACES_URL=https://otlp-gateway.grafana.net/otlp/v1/traces
OTEL_EXPORTER_OTLP_HEADERS={"Authorization":"Basic base64_credentials"}

Datadog:

OTLP_HTTP_TRACES_URL=https://trace.agent.datadoghq.com:4318/v1/traces
OTEL_EXPORTER_OTLP_HEADERS={"DD-API-KEY":"your_api_key"}

New Relic:

OTLP_HTTP_TRACES_URL=https://otlp.nr-data.net:4318/v1/traces
OTEL_EXPORTER_OTLP_HEADERS={"Api-Key":"your_license_key"}
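
The vendor examples above write `OTEL_EXPORTER_OTLP_HEADERS` as a JSON object. A sketch of how such a value could be turned into exporter headers, assuming the backend parses it as JSON (the `parseOtlpHeaders` helper is hypothetical):

```typescript
// Assumption: OTEL_EXPORTER_OTLP_HEADERS holds a JSON object, as in the examples above.
function parseOtlpHeaders(raw: string | undefined): Record<string, string> {
  if (!raw) return {};
  const parsed: unknown = JSON.parse(raw); // throws SyntaxError on malformed JSON
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    throw new TypeError('OTEL_EXPORTER_OTLP_HEADERS must be a JSON object');
  }
  const headers: Record<string, string> = {};
  for (const [name, value] of Object.entries(parsed)) {
    headers[name] = String(value); // header values are always sent as strings
  }
  return headers;
}
```

Note that the OpenTelemetry specification defines this variable as comma-separated `key=value` pairs, so standard SDK tooling reading the same variable may expect that form instead.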

What You Get

Request Tracing — End-to-end visibility (API → DB → AI → Response)
Cost Attribution — Exact cost per request, model, and user
Performance Metrics — RPM, latency percentiles, error rates
Service Dependencies — Auto-generated dependency maps

🛠️ Development

Scripts

npm run dev          # Development server with hot reload
npm run build        # Build TypeScript
npm start            # Production server
npm test             # Run tests
npm run lint         # Check code style
npm run lint:fix     # Fix code style

Project Structure

src/
├── services/
│   ├── intelligentMonitoring.service.ts   # AI monitoring
│   ├── aiRouter.service.ts                # Provider routing
│   ├── providers/                         # Native SDKs
│   │   ├── openai.provider.ts
│   │   ├── gemini.provider.ts
│   │   └── base.provider.ts
│   ├── bedrock.service.ts                 # AWS Bedrock
│   └── email.service.ts                   # AI-enhanced emails
├── controllers/
│   ├── chatgpt.controller.ts              # ChatGPT integration
│   ├── onboarding.controller.ts           # Magic links
│   └── monitoring.controller.ts           # AI monitoring
├── models/
│   ├── User.ts
│   ├── Usage.ts
│   └── Project.ts
├── routes/
│   └── monitoring.routes.ts
└── utils/
    ├── cronJobs.ts
    └── logger.ts

🚨 Alert System

Threshold	Description
50%	Early warning with optimization tips
80%	Detailed analysis with cost-saving recommendations
90%	Urgent notification with immediate alternatives
Predictive	AI forecasts problems before they occur
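
The fixed thresholds in the table can be expressed as a simple lookup. This sketch assumes the percentages refer to usage against a user's limit; the level names and the `alertLevelFor` helper are hypothetical, and the predictive alerts are not modeled here.

```typescript
// Level names are hypothetical; the 50/80/90 thresholds come from the table above.
type AlertLevel = 'early-warning' | 'detailed-analysis' | 'urgent';

function alertLevelFor(used: number, limit: number): AlertLevel | null {
  if (limit <= 0) throw new RangeError('limit must be positive');
  const percent = (used / limit) * 100;
  // Check the highest threshold first so each usage level maps to one alert.
  if (percent >= 90) return 'urgent';
  if (percent >= 80) return 'detailed-analysis';
  if (percent >= 50) return 'early-warning';
  return null; // below every threshold: no alert
}
```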

🐳 Docker Deployment

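FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
# Install all dependencies; the TypeScript build typically needs devDependencies
RUN npm ci
COPY . .
# Compile, then drop devDependencies from the final image
RUN npm run build && npm prune --omit=dev
EXPOSE 8000
CMD ["node", "dist/server.js"]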

# Production environment
NODE_ENV=production
MONGODB_URI=mongodb://your-production-db
FRONTEND_URL=https://costkatana.com

🔒 Security

Feature	Description
JWT + Refresh Tokens	Secure session management
Role-Based Access	Different features per user type
API Key Encryption	Secure storage of user keys
Rate Limiting	Prevents abuse
Data Minimization	Only necessary data sent to AI
Audit Logging	Comprehensive interaction logs

🧪 Testing

# Test AI monitoring
curl -X POST "http://localhost:8000/api/monitoring/analyze" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"userId": "your-user-id"}'

# Test ChatGPT integration
curl -X POST "http://localhost:8000/api/chatgpt/action" \
  -H "Content-Type: application/json" \
  -d '{
    "action": "track_usage",
    "conversation_data": {
      "prompt": "Help me debug this React component",
      "response": "Here is how to debug...",
      "model": "gpt-4"
    }
  }'

# Test magic link
curl -X POST "http://localhost:8000/api/onboarding/generate-magic-link" \
  -H "Content-Type: application/json" \
  -d '{"email": "test@example.com", "name": "Test User"}'

🔗 Google Integration

Cost Katana integrates with Google Workspace using non-sensitive OAuth scopes only, avoiding the need for expensive CASA security assessments.

OAuth Scopes Used

// Non-sensitive scopes (No CASA required)
const SCOPES = [
  'openid',
  'https://www.googleapis.com/auth/userinfo.email',
  'https://www.googleapis.com/auth/userinfo.profile',
  'https://www.googleapis.com/auth/drive.file'  // Access only to files created by or selected via picker
];

What You Can Do

Feature	Description	Status
File Picker	Select any Docs/Sheets from Drive	✅ Available
Create Documents	Create new Google Docs	✅ Available
Create Spreadsheets	Create new Google Sheets	✅ Available
Export Cost Data	Export to new Sheets	✅ Available
Access Selected Files	Read/write files selected via picker	✅ Available
Chat Integration	Use @docs and @sheets mentions	✅ Available

Setup

Create Google OAuth 2.0 Credentials:
- Go to Google Cloud Console
- Create a new project or select existing
- Enable Google Drive API
- Create OAuth 2.0 credentials
- Add authorized redirect URIs

Configure Environment Variables:

# Google OAuth
GOOGLE_CLIENT_ID=your-client-id.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=your-client-secret
GOOGLE_REDIRECT_URI=http://localhost:3000/oauth/callback/google

# Google API (for File Picker)
GOOGLE_API_KEY=your-google-api-key

# Grounding Confidence Layer (GCL) - Hallucination Prevention
# Core Principle: "Generation is a privilege, not a default"

# Phase 1: Shadow Mode (observe only, no blocking)
# ENABLE_GCL_SHADOW=true
# ENABLE_GCL_BLOCKING=false

# Phase 2: Partial Blocking (ASK_CLARIFY enforced, REFUSE still shadow) ← CURRENT
ENABLE_GCL_SHADOW=false
ENABLE_GCL_BLOCKING=true

# Phase 3: Full Enforcement (all decisions enforced)
# ENABLE_GCL_STRICT_REFUSAL=false  # Set to true for Phase 3

# Logging and emergency controls
ENABLE_GCL_LOGGING=true             # Always log for metrics
ENABLE_GCL_EMERGENCY_BYPASS=false   # Emergency override (use only in incidents)

Grounding Confidence Layer (GCL)

The GCL is a pre-generation decision gate that prevents hallucinations by evaluating whether the system has sufficient, fresh, and relevant information before generating responses.

Core Principle: "Generation is a privilege, not a default."

Rollout Phases:

Phase 1 (Shadow): Logs decisions without affecting UX; safe to enable in production
Phase 2 (Partial Blocking): Enforces ASK_CLARIFY decisions only; REFUSE decisions are still only logged
Phase 3 (Full Enforcement): Enforces all decisions, including refusals

Configuration Flags:

ENABLE_GCL_SHADOW: Enable shadow mode (observe without blocking)
ENABLE_GCL_BLOCKING: Enable decision enforcement
ENABLE_GCL_STRICT_REFUSAL: Enable REFUSE decision enforcement
ENABLE_GCL_LOGGING: Enable decision logging for metrics
ENABLE_GCL_EMERGENCY_BYPASS: Emergency override (page on-call if used)

Safeguards:

✅ Loop breakers (max 2 clarifications, max 2 search retries)
✅ Decision stickiness (prevents retry gaming)
✅ Context drift detection (prevents hallucinations on topic shifts)
✅ Domain risk multipliers (stricter thresholds for finance/security/legal)
✅ Memory write gating (prevents poisoning long-term memory)
✅ Redundant safety checks (fail-safe if GCL bypassed)
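
One way to read the rollout flags together is sketched below. The flag names and the ASK_CLARIFY and REFUSE decisions come from this README; the precedence between flags, the phase labels, and the helper names are assumptions made for illustration, not the actual GCL implementation.

```typescript
// Phase labels are hypothetical; they mirror the rollout phases described above.
type GclPhase = 'bypassed' | 'disabled' | 'shadow' | 'partial-blocking' | 'full-enforcement';
type GclDecision = 'ASK_CLARIFY' | 'REFUSE';
type Flags = Record<string, string | undefined>;

function isOn(env: Flags, flag: string): boolean {
  return env[flag] === 'true';
}

// Assumed precedence: emergency bypass > blocking (with or without strict refusal) > shadow.
function resolveGclPhase(env: Flags): GclPhase {
  if (isOn(env, 'ENABLE_GCL_EMERGENCY_BYPASS')) return 'bypassed';
  if (isOn(env, 'ENABLE_GCL_BLOCKING')) {
    return isOn(env, 'ENABLE_GCL_STRICT_REFUSAL') ? 'full-enforcement' : 'partial-blocking';
  }
  return isOn(env, 'ENABLE_GCL_SHADOW') ? 'shadow' : 'disabled';
}

// Whether a decision changes the response or is only logged.
function isEnforced(decision: GclDecision, phase: GclPhase): boolean {
  if (phase === 'full-enforcement') return true;
  if (phase === 'partial-blocking') return decision === 'ASK_CLARIFY';
  return false; // shadow, disabled, and bypassed never alter the response
}
```

With the Phase 2 values shown in the environment block (`ENABLE_GCL_SHADOW=false`, `ENABLE_GCL_BLOCKING=true`), this sketch resolves to partial blocking: clarification requests are enforced and refusals are only logged.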

Configure Authorized Origins (for Picker):
- Add http://localhost:3000 to Authorized JavaScript origins
- Add http://localhost:8000 to Authorized JavaScript origins

API Endpoints

// OAuth Flow
POST   /api/oauth/initiate/google          // Start OAuth flow
GET    /api/oauth/callback/google          // OAuth callback
POST   /api/oauth/refresh                  // Refresh access token

// File Picker
GET    /api/google/picker/token            // Get picker token
POST   /api/google/picker/cache-selection  // Cache selected files
GET    /api/google/accessible-files        // List accessible files

// Documents & Sheets
POST   /api/google/docs/create             // Create new Doc
POST   /api/google/sheets/create           // Create new Sheet
GET    /api/google/docs/list               // List accessible Docs
GET    /api/google/sheets/list             // List accessible Sheets

// Export
POST   /api/google/export/cost-data        // Export cost data to Sheets
POST   /api/google/export/cost-report      // Create cost report in Docs

Frontend Integration

import { useGooglePicker } from '@/hooks/useGooglePicker';

// Example wrapper component (the name is illustrative);
// connectionId identifies the user's Google connection
export function DriveFilePicker({ connectionId }: { connectionId: string }) {
  const { openPicker, selectedFiles } = useGooglePicker({
    viewType: 'DOCS',  // or 'SPREADSHEETS', 'DOCS_IMAGES_AND_VIDEOS'
    multiselect: true,
    callback: (data) => {
      console.log('Files selected:', data.docs);
    }
  });

  // Open picker
  return (
    <button onClick={() => openPicker(connectionId)}>
      Select Files from Drive
    </button>
  );
}

File Access Model

With drive.file scope, the app can only access:

Files created by the app (e.g., exported cost reports)
Files explicitly selected by user via File Picker

The rest of the user's Drive stays inaccessible to the app, while every feature listed above remains available.
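
The access rule above can be mirrored in application code as a simple predicate. This is a sketch under assumptions: the `DriveFileRecord` shape and helper names are hypothetical, and Google enforces the `drive.file` restriction on its side regardless of what the app tracks.

```typescript
// Hypothetical record shape; Cost Katana's actual bookkeeping may differ.
interface DriveFileRecord {
  fileId: string;
  createdByApp: boolean;      // e.g. an exported cost report
  selectedViaPicker: boolean; // the user chose it in the File Picker
}

// A file is reachable under drive.file if the app created it or the user picked it.
function canAccess(file: DriveFileRecord): boolean {
  return file.createdByApp || file.selectedViaPicker;
}

function accessibleFiles(files: DriveFileRecord[]): DriveFileRecord[] {
  return files.filter(canAccess);
}
```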

No CASA Assessment Needed

By using only non-sensitive scopes, Cost Katana avoids:

✅ $15,000 - $75,000+ CASA assessment fees
✅ Lengthy security review process
✅ Complex compliance requirements
✅ Annual re-assessments

📞 Support

Channel	Link
Email	support@costkatana.com
GitHub Issues	github.com/cost-katana/costkatana-backend/issues
Discord	discord.gg/D8nDArmKbY

📄 License

Transform your AI cost management with intelligent coaching 🥷
