Call Mind: AI Telephony Based Agent Platform

A comprehensive, production-ready platform for creating, managing, and deploying AI-powered voice agents with support for multiple LLM providers, real-time voice synthesis, knowledge base integration, and intelligent call routing.

Quick Start

Prerequisites

Python 3.8+
MongoDB (local or cloud)
Redis (for call queuing and caching)
API Keys for:
- OpenAI
- Google Gemini
- Anthropic Claude
- ElevenLabs
- Cartesia
- Deepgram
- Twilio
- Serper API

Installation

git clone https://github.com/ajitashwath/callmind.git
cd callmind

python -m venv venv
source venv/bin/activate

pip install -r requirements.txt

cp .env.example .env

Environment Configuration

Edit the .env file in the project root (copied from .env.example during installation) and set the following variables:

# Application
APP_NAME=AI Agent Platform
DEBUG=false
SECRET_KEY=your-secret-key-here
FASTAPI_PORT=3000

# Database
MONGODB_URL=mongodb://localhost:27017/ai_agents
REDIS_URL=redis://localhost:6379
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=

# LLM Providers
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=AIzaSy...
CLAUDE_API_KEY=sk-ant-...

# Voice Providers
ELEVENLABS_API_KEY=...
# OpenAI TTS reuses OPENAI_API_KEY from the LLM Providers section above
CARTESIA_API_KEY=...

# Deepgram (Speech-to-Text & Pricing)
DEEPGRAM_API_KEY=...

# Telephony
TWILIO_ACCOUNT_SID=AC...
TWILIO_AUTH_TOKEN=...
WEBHOOK_BASE_URL=https://your-domain.com/agent

# ChromaDB (Vector Storage)
CHROMA_CLOUD_API_KEY=  # Optional - uses local by default
CHROMA_CLOUD_TENANT=
CHROMA_CLOUD_DATABASE=
CHROMA_PERSIST_DIR=./chroma_data

# JWT Authentication
JWT_ACCESS_SECRET=your-access-secret
JWT_REFRESH_SECRET=your-refresh-secret
JWT_ALGORITHM=HS256
JWT_ISSUER=jesty-crm
JWT_AUDIENCE=jesty-crm-users
BACKEND_API_URL=http://localhost:3000

# Search API
SERPER_API_KEY=...
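
A startup check along these lines can catch missing settings before the first request is served. This is a minimal sketch, not the project's own code: `missing_settings` is a hypothetical helper, and the set of variables treated as required is an assumption made for illustration.

```python
import os

# Assumption: these variables are mandatory; the project may require a different set.
REQUIRED_VARS = (
    "SECRET_KEY",
    "MONGODB_URL",
    "REDIS_URL",
    "JWT_ACCESS_SECRET",
    "JWT_REFRESH_SECRET",
)


def missing_settings(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not str(env.get(name, "")).strip()]


# Example with an explicit mapping instead of the real environment.
example_env = {"SECRET_KEY": "x", "MONGODB_URL": "mongodb://localhost:27017/ai_agents"}
print(missing_settings(example_env))
```

Passing a plain dict keeps the check easy to test; calling `missing_settings()` with no argument inspects the real process environment.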

Running the Application

# Development
python app/main.py

# Production
uvicorn app.main:app --host 0.0.0.0 --port 3000 --workers 4

The API will be available at http://localhost:3000 with interactive documentation at /agent/docs.

Architecture Overview

AI Agent Platform
├── Authentication & Authorization
├── Agent Management
│   ├── Multi-LLM Support (OpenAI, Gemini, Claude)
│   ├── Voice Integration
│   └── Template Management
├── Conversation Management
│   ├── Message Tracking
│   ├── Cost Calculation
│   └── Summarization
├── Knowledge Base (RAG)
│   ├── Multi-format File Support
│   ├── Semantic Search
│   └── ChromaDB Vector Storage
├── Voice Services
│   ├── Text-to-Speech (ElevenLabs, OpenAI, Cartesia)
│   ├── Voice Cloning
│   └── Voice Management
├── Telephony System
│   ├── Twilio Integration
│   ├── Deepgram Real-time Transcription
│   ├── Call Routing & Queuing
│   └── WebSocket Streaming
└── Dashboard & Analytics
    ├── Call Metrics
    ├── Cost Tracking
    ├── Performance Analysis
    └── Agent Performance

Core Modules

1. Agents (`app/agents/`)

Create and manage AI agents with customizable configurations.

Key Features:

Multi-LLM provider support
Industry-specific templates
Voice configuration
Knowledge base integration
Auto-shift scheduling

Main Endpoints:

POST /api/agents - Create agent
GET /api/agents - List agents
PUT /api/agents/{agent_id} - Update agent
POST /api/agents/{agent_id}/test - Test agent
GET /api/agents/templates/industries - Browse templates

Example - Create an Agent:

curl -X POST http://localhost:3000/api/agents \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Sales Assistant",
    "description": "AI-powered sales representative",
    "config": {
      "llm_provider": "openai",
      "model": "gpt-4o-mini",
      "temperature": 0.7,
      "max_tokens": 1000,
      "system_prompt": "You are a friendly sales representative...",
      "first_message": "Hello! How can I help you today?",
      "voice_provider": "elevenlabs",
      "voice_id": "rachel",
      "max_conversation_turns": 10
    }
  }'
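
The same request can be assembled from Python with only the standard library. This is a sketch that mirrors the curl example: `build_create_agent_request` is a hypothetical helper, and it only builds the request so it can be inspected without a running server.

```python
import json
import urllib.request


def build_create_agent_request(base_url, token, name, description, config):
    """Build (but do not send) a POST /api/agents request."""
    body = json.dumps({"name": name, "description": description, "config": config})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/agents",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )


request = build_create_agent_request(
    "http://localhost:3000",
    "<token>",
    "Sales Assistant",
    "AI-powered sales representative",
    {"llm_provider": "openai", "model": "gpt-4o-mini", "temperature": 0.7},
)
# To send it against a running server: urllib.request.urlopen(request)
```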

2. Conversations (`app/conversations/`)

Manage conversation lifecycle, messages, and analytics.

Key Features:

Session-based conversation management
Message and event tracking
Multi-provider cost calculation
AI-powered summarization
Advanced filtering

Main Endpoints:

POST /api/conversations - Create conversation
GET /api/conversations - List conversations
GET /api/conversations/{id} - Get conversation details
GET /api/conversations/{id}/summary - Get AI summary
PATCH /api/conversations/{id}/metrics - Update metrics

Example - Create Conversation:

curl -X POST http://localhost:3000/api/conversations \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "agent_abc123",
    "session_id": "session_xyz789",
    "from_number": "+1234567890",
    "to_number": "+0987654321"
  }'

3. Knowledge Base (`app/knowledge/`)

RAG-powered knowledge management with semantic search.

Supported Formats:

PDF documents
Word documents (.docx, .doc)
Excel spreadsheets (.xlsx, .xls)
CSV files
Plain text files
Website content (URLs)

Main Endpoints:

POST /api/knowledge/create - Create knowledge base
GET /api/knowledge/ - List knowledge bases
POST /api/knowledge/{kb_id}/add-text - Add text
POST /api/knowledge/{kb_id}/add-file - Upload file
POST /api/knowledge/{kb_id}/add-website - Add website
GET /api/knowledge/{kb_id}/search - Search knowledge base

Example - Create & Search Knowledge Base:

# Create knowledge base
KB_ID=$(curl -s -X POST http://localhost:3000/api/knowledge/create \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"name":"Company Docs","description":"Internal knowledge"}' \
  | jq -r '.kb_id')

# Add document
curl -X POST http://localhost:3000/api/knowledge/$KB_ID/add-text \
  -H "Authorization: Bearer <token>" \
  -F "content=Our business hours are 9 AM to 5 PM EST"

# Search
curl -X GET "http://localhost:3000/api/knowledge/$KB_ID/search?query=business%20hours&top_k=5" \
  -H "Authorization: Bearer <token>"
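
When the search is issued from code, the query has to be URL-encoded, as the `%20` in the curl example shows. The sketch below builds the search URL with the standard library. `kb_search_url` is a hypothetical helper, and the default `top_k=5` is taken from the example rather than from the API's documented default.

```python
from urllib.parse import quote, urlencode


def kb_search_url(base_url, kb_id, query, top_k=5):
    """Return the GET URL for searching one knowledge base."""
    path = "/api/knowledge/" + quote(kb_id, safe="") + "/search"
    params = urlencode({"query": query, "top_k": top_k})
    return base_url.rstrip("/") + path + "?" + params


print(kb_search_url("http://localhost:3000", "kb_123", "business hours"))
```

`urlencode` writes spaces as `+` instead of `%20`; both forms are conventional in query strings.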

4. Voice Services (`app/voice/`)

Multi-provider text-to-speech and voice cloning.

Supported Providers:

ElevenLabs (advanced TTS with cloning)
OpenAI TTS (standard and HD)
Cartesia (Sonic models)

Main Endpoints:

POST /api/voices/synthesize - Convert text to speech
POST /api/voices/clone - Create cloned voice
GET /api/voices - List voices
GET /api/voices/search - Search voices
PUT /api/voices/{voice_id}/settings - Update settings
GET /api/voices/test - Test provider connections

Example - Synthesize Speech:

curl -X POST http://localhost:3000/api/voices/synthesize \
  -H "Authorization: Bearer <token>" \
  -F "text=Hello, this is a test" \
  -F "voice_id=rachel" \
  -F "provider=elevenlabs" \
  -F "stability=0.7" \
  -F "similarity_boost=0.8" \
  --output speech.mp3

5. Telephony (`app/telephony/`)

Complete voice call management with real-time transcription.

Key Features:

Twilio integration for call routing
Deepgram real-time transcription
WebSocket streaming
Call queuing with scheduling
Metrics tracking and cost calculation

Main Endpoints:

POST /api/telephony/calls/outbound - Make outbound call
POST /api/telephony/calls/outbound/streaming - Stream call
GET /api/telephony/calls/{call_sid} - Get call status
POST /api/telephony/calls/{call_sid}/hangup - End call
WS /api/telephony/twilio/stream/{agent_id} - WebSocket stream

Example - Make Outbound Call:

curl -X POST http://localhost:3000/api/telephony/calls/outbound \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "agent_abc123",
    "to_number": "+1234567890",
    "from_number": "+0987654321"
  }'
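
Twilio expects phone numbers in E.164 format, so checking them before placing a call avoids a failed round trip. The sketch below is an illustrative client-side check, not the platform's own validation. `is_e164` is a hypothetical helper.

```python
import re

# E.164: a plus sign, a non-zero leading digit, and at most 15 digits in total.
E164_PATTERN = re.compile(r"\+[1-9][0-9]{1,14}")


def is_e164(number):
    """Return True if the string is shaped like an E.164 phone number."""
    return E164_PATTERN.fullmatch(number) is not None


print(is_e164("+1234567890"))
print(is_e164("1234567890"))
```

A strict check like this rejects a leading zero after the plus sign, so a placeholder such as +0987654321 would need to be replaced with a real number.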

6. Dashboard (`app/dashboard/`)

Comprehensive analytics and performance monitoring.

Main Endpoints:

GET /api/dashboard/stats - Aggregate statistics
GET /api/dashboard/calls/analytics - Call details
GET /api/dashboard/agents/{agent_id}/metrics - Agent metrics
GET /api/dashboard/costs/breakdown - Cost analysis
GET /api/dashboard/performance/trends - Performance trends
GET /api/dashboard/calls/{call_id}/summary - Call summary

Query Parameters (most endpoints):

time_range: 24h, 7d, 30d (default), 90d, custom
from_date, to_date: ISO format dates for custom range
agent_ids: Comma-separated agent IDs
call_status: Filter by status
min_duration, max_duration: Duration filters
min_rating: Minimum satisfaction rating

Example - Get Dashboard Stats:

curl "http://localhost:3000/api/dashboard/stats?time_range=7d" \
  -H "Authorization: Bearer <token>"
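
The query parameters listed above can be combined programmatically. The sketch below builds a query string and rejects unknown ranges. `dashboard_query` is a hypothetical helper, and the rule that a custom range needs both dates is an assumption about how the server behaves.

```python
from datetime import date
from urllib.parse import urlencode

PRESET_RANGES = {"24h", "7d", "30d", "90d"}


def dashboard_query(time_range="30d", from_date=None, to_date=None, agent_ids=()):
    """Build the query string for a dashboard endpoint."""
    params = {"time_range": time_range}
    if time_range == "custom":
        # Assumption: the server needs both ends of a custom range.
        if from_date is None or to_date is None:
            raise ValueError("custom range needs from_date and to_date")
        params["from_date"] = from_date.isoformat()
        params["to_date"] = to_date.isoformat()
    elif time_range not in PRESET_RANGES:
        raise ValueError("unknown time_range: " + time_range)
    if agent_ids:
        params["agent_ids"] = ",".join(agent_ids)
    return urlencode(params)


print(dashboard_query("custom", date(2024, 1, 1), date(2024, 1, 31), ["a1", "a2"]))
```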

🌐 API Endpoints

Authentication

POST   /api/auth/login              - Login and get tokens
POST   /api/auth/refresh            - Refresh access token
POST   /api/auth/logout             - Logout
GET    /api/auth/me                 - Get current user info

Agents

GET    /api/agents                  - List agents
POST   /api/agents                  - Create agent
GET    /api/agents/{agent_id}       - Get agent details
PUT    /api/agents/{agent_id}       - Update agent
DELETE /api/agents/{agent_id}       - Delete agent
POST   /api/agents/{agent_id}/test  - Test agent
GET    /api/agents/{agent_id}/start - Start conversation
GET    /api/agents/templates/*      - Template management
GET    /api/agents/models           - Available models
GET    /api/agents/models/pricing   - Model pricing

Conversations

GET    /api/conversations           - List conversations
POST   /api/conversations           - Create conversation
GET    /api/conversations/{id}      - Get conversation
GET    /api/conversations/{id}/summary
GET    /api/conversations/{id}/metadata
GET    /api/conversations/{id}/stats
PATCH  /api/conversations/{id}/metrics
POST   /api/conversations/{id}/calculate-costs
POST   /api/conversations/{id}/events

Knowledge Base

POST   /api/knowledge/create        - Create KB
GET    /api/knowledge/              - List KBs
DELETE /api/knowledge/{kb_id}       - Delete KB
POST   /api/knowledge/{kb_id}/add-text
POST   /api/knowledge/{kb_id}/add-file
POST   /api/knowledge/{kb_id}/add-website
GET    /api/knowledge/{kb_id}/search
POST   /api/knowledge/{kb_id}/associate-agents
GET    /api/knowledge/{kb_id}/agents

Voice Services

GET    /api/voices                  - List voices
GET    /api/voices/search           - Search voices
GET    /api/voices/trending         - Trending voices
GET    /api/voices/{voice_id}       - Get voice details
POST   /api/voices/synthesize       - Text to speech
POST   /api/voices/clone            - Clone voice
DELETE /api/voices/{voice_id}       - Delete voice
GET    /api/voices/test             - Test providers

Telephony

POST   /api/telephony/calls/outbound
POST   /api/telephony/calls/outbound/streaming
GET    /api/telephony/calls/{call_sid}
POST   /api/telephony/calls/{call_sid}/hangup
GET    /api/telephony/calls/{call_sid}/status
GET    /api/telephony/phone-numbers/available
POST   /api/telephony/phone-numbers/buy
WS     /api/telephony/twilio/stream/{agent_id}

Dashboard

GET    /api/dashboard/stats
GET    /api/dashboard/calls/analytics
GET    /api/dashboard/agents/{agent_id}/metrics
GET    /api/dashboard/costs/breakdown
GET    /api/dashboard/performance/trends
GET    /api/dashboard/calls/{call_id}/summary
GET    /api/dashboard/calls/summary

System

GET    /                            - API info
GET    /health                      - Health check
GET    /config                      - Configuration info

Features

Multi-LLM Support

Seamlessly switch between multiple AI providers:

OpenAI: GPT-4, GPT-4 Turbo, GPT-3.5 Turbo
Google Gemini: Gemini 1.5 Pro, Gemini 1.5 Flash
Anthropic Claude: Claude 3 Sonnet, Claude 3 Opus

Voice Integration

Support for leading voice synthesis providers:

ElevenLabs: Professional TTS with voice cloning
OpenAI: Fast, reliable TTS
Cartesia: Advanced voice synthesis with Sonic models

Real-Time Transcription

Deepgram integration for speech-to-text
Real-time streaming via WebSocket
Multiple language support
Confidence scoring and interim results

Knowledge Base Management

Upload multiple file formats (PDF, Word, Excel, CSV, Text)
Fetch content from websites
Semantic search with ChromaDB
Automatic text chunking and embedding
Agent-specific knowledge base associations
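
To make the chunking step concrete, here is a minimal sketch of fixed-size character chunking with overlap. It illustrates the general technique only. The platform's actual chunking strategy, chunk size, and overlap are not described here, so `chunk_text` and its default values are assumptions.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into character chunks where neighbours share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the chunk just added already reaches the end of the text
        start += step
    return chunks


print(chunk_text("abcdefghij", chunk_size=5, overlap=2))
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing some text twice.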

Call Management

Outbound call initiation via Twilio
Real-time audio streaming
Automatic call queuing
Working hours scheduling
Call recording and metadata tracking
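
Working-hours scheduling comes down to deciding whether a call may go out now or should stay queued. The sketch below shows one way to express that decision. The weekday 9:00 to 17:00 window is an assumption, and `within_working_hours` is a hypothetical helper, not the platform's scheduler.

```python
from datetime import datetime, time


def within_working_hours(moment, start=time(9, 0), end=time(17, 0)):
    """Return True if moment falls on a weekday at or after start and before end."""
    if moment.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    return start <= moment.time() < end


print(within_working_hours(datetime(2024, 1, 8, 10, 0)))  # a Monday morning
```

The function takes a naive `datetime`, so the caller is responsible for converting to the agent's local time zone first.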

Analytics & Monitoring

Comprehensive call metrics
Cost tracking per call, agent, and time period
Performance trends and patterns
Customer satisfaction ratings
AI-powered conversation summaries

Industry Templates

Pre-built templates for:

Sales and Customer Service
Healthcare and Medical
Real Estate
Education
Financial Services
Technical Support
And more...

Each template includes customizable system prompts, personality traits, and operational guardrails.

Database Schema

MongoDB Collections

agents

{
  _id: ObjectId,
  name: String,
  description: String,
  userId: String,
  organizationId: String,
  phone_number: String,
  status: String,
  config: {
    llm_provider: String,
    model: String,
    temperature: Number,
    max_tokens: Number,
    system_prompt: String,
    voice_provider: String,
    voice_id: String,
    // ... more config fields
  },
  analytics: {
    total_calls: Number,
    successful_calls: Number,
    total_duration: Number,
    total_cost: Number
  },
  created_at: Date,
  updated_at: Date
}

conversations

{
  _id: ObjectId,
  agent_id: ObjectId|String,
  session_id: String,
  userId: String,
  messages: [{
    role: String,
    content: String,
    timestamp: Date,
    metadata: Object
  }],
  call_metadata: {
    call_sid: String,
    from_number: String,
    to_number: String,
    duration: Number,
    status: String,
    recording_url: String,
    costs: {
      llm_cost: Number,
      voice_cost: Number,
      telephony_cost: Number,
      total_cost: Number
    }
  },
  summary: String,
  evaluation_score: Number,
  created_at: Date
}
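
The `costs` sub-document suggests that `total_cost` is the sum of the three per-service costs. The sketch below computes it under that assumption. `total_call_cost` is a hypothetical helper, and the rounding precision is chosen for illustration.

```python
def total_call_cost(costs):
    """Sum the per-service costs of one call, rounded to 6 decimal places."""
    parts = ("llm_cost", "voice_cost", "telephony_cost")
    # Missing parts count as zero so partially populated documents still work.
    return round(sum(float(costs.get(part, 0.0)) for part in parts), 6)


example = {"llm_cost": 0.012, "voice_cost": 0.03, "telephony_cost": 0.0085}
print(total_call_cost(example))
```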

knowledge_bases

{
  _id: ObjectId,
  name: String,
  description: String,
  owner_id: String,
  associated_agents: [String],
  document_count: Number,
  collection_name: String,
  created_at: Date
}

voices

{
  _id: ObjectId,
  name: String,
  voice_id: String,
  description: String,
  category: String,
  gender: String,
  language: String,
  is_custom: Boolean,
  userId: String,
  provider: String,
  settings: {
    stability: Number,
    similarity_boost: Number,
    style: Number
  },
  usage_statistics: {
    usage_count: Number,
    total_characters: Number
  },
  created_at: Date
}

ChromaDB Collections

Knowledge bases are stored as ChromaDB collections with:

Document embeddings (vector format)
Text content chunks
Metadata (source, file type, chunk index)
Similarity scores for retrieval
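
For intuition about similarity scores, the sketch below computes cosine similarity between two embedding vectors in plain Python. It is illustrative only: ChromaDB performs this comparison internally and supports more than one distance metric, and which metric the project configures is not stated here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Vectors pointing in the same direction score close to 1, and orthogonal vectors score 0.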

Deployment

Docker Deployment

FROM python:3.10-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "3000"]

Build and run:

docker build -t ai-agent-platform .
docker run -p 3000:3000 --env-file .env ai-agent-platform

Docker Compose

version: '3.8'

services:
  api:
    build: .
    ports:
      - "3000:3000"
    environment:
      - MONGODB_URL=mongodb://mongo:27017/ai_agents
      - REDIS_URL=redis://redis:6379
    depends_on:
      - mongo
      - redis
    
  mongo:
    image: mongo:5.0
    ports:
      - "27017:27017"
    volumes:
      - mongo_data:/data/db
    
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

volumes:
  mongo_data:
  redis_data:

Run with:

docker-compose up -d

Environment-Specific Configurations

Development:

DEBUG=true
FASTAPI_PORT=3000
JWT_ACCESS_SECRET=dev-secret

Production:

DEBUG=false
FASTAPI_PORT=3000
# Use strong secrets and secure URLs
JWT_ACCESS_SECRET=<generate-strong-secret>
JWT_REFRESH_SECRET=<generate-strong-secret>
WEBHOOK_BASE_URL=https://your-production-domain.com

Security Best Practices

API Keys: Store in environment variables, never commit to version control
CORS: Configure appropriately for your domain
JWT Secrets: Use strong, randomly generated secrets
HTTPS: Always use HTTPS in production
Rate Limiting: Implement at load balancer level
Database: Use authentication and run in private network
Logging: Avoid logging sensitive information
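
One way to produce the strong, randomly generated secrets recommended above is Python's standard `secrets` module. This is a sketch: `generate_secret` is a hypothetical helper, and the 48-byte length is an assumption that comfortably exceeds the 256 bits HS256 signs with.

```python
import secrets


def generate_secret(num_bytes=48):
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


# Print lines that can be pasted into .env; each run gives different values.
print("JWT_ACCESS_SECRET=" + generate_secret())
print("JWT_REFRESH_SECRET=" + generate_secret())
```

Use separate values for the access and refresh secrets so that a leak of one does not compromise the other.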

👨‍💻 Development

Project Structure

app/
├── __init__.py
├── config.py              # Configuration management
├── database.py            # Database initialization
├── main.py               # FastAPI app setup
├── agents/               # Agent management module
├── conversations/        # Conversation management
├── knowledge/            # Knowledge base (RAG)
├── voice/                # Voice services
├── telephony/            # Telephony system
├── dashboard/            # Analytics dashboard
├── auth/                 # Authentication
└── data/                 # Data files (pricing, etc.)

Running Tests

pip install pytest pytest-asyncio pytest-cov httpx

pytest tests/ -v

pytest tests/ --cov=app --cov-report=html

Contributing

Create a feature branch: git checkout -b feature/my-feature
Make changes and commit: git commit -am 'Add feature'
Push to branch: git push origin feature/my-feature
Submit a pull request

Code Style

Use Black for formatting: black app/
Use isort for imports: isort app/
Lint with Flake8: flake8 app/

Debugging

Enable verbose logging:

import logging

logging.basicConfig(level=logging.DEBUG)

Access debug endpoints:

/api/telephony/debug/agent/{agent_id}
/api/telephony/debug/validate-numbers
/api/telephony/debug/call-state/{call_sid}

Troubleshooting

MongoDB Connection Issues

mongosh --eval "db.adminCommand('ping')"

# Verify connection string in .env
# Default: mongodb://localhost:27017/ai_agents

Redis Connection Issues

redis-cli ping

# Verify connection string
# Default: redis://localhost:6379

ChromaDB Issues

ls -la ./chroma_data

# For Chroma Cloud, verify credentials
# Ensure CHROMA_CLOUD_API_KEY, CHROMA_CLOUD_TENANT, CHROMA_CLOUD_DATABASE are set

Twilio Integration Issues

Verify TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN
Ensure phone numbers are verified in Twilio Console
Check that the URL set in WEBHOOK_BASE_URL is publicly reachable
Review Twilio logs for error details

API Key Issues

curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"

Common Errors

Error                          Solution
"API key not configured"       Check environment variables
"Connection refused"           Verify MongoDB/Redis running
"Phone number not verified"    Add to Twilio verified caller IDs
"Knowledge base not found"     Verify KB ID and user ownership
"Voice not found"              Check voice provider configuration

Additional Resources

Support

For issues, questions, or feature requests:

Check existing GitHub issues
Review documentation and troubleshooting guide
Enable debug logging and check application logs
Contact the development team

Acknowledgments

Built with:

FastAPI
MongoDB & Motor
ChromaDB
Twilio
Deepgram
OpenAI, Google Gemini, Anthropic Claude
ElevenLabs, Cartesia
And many more open-source libraries
