devhims/opengpt
OpenGPT

Experiment with open-source AI models

Next.js React TypeScript Cloudflare Workers AI SDK Tailwind CSS License


A modern AI playground that combines the development experience of Next.js with the performance of the Cloudflare Workers platform. Experiment with 50+ open-source AI models, including GPT-OSS, Leonardo, Llama, Qwen, Gemma, DeepSeek, and more. Features text-to-speech with multiple voices and real-time speech-to-text transcription.

OpenGPT Demo

Click to watch the demo video

Why OpenGPT?

πŸ† Best of Both Worlds

Development Experience πŸ’» + Deployment Performance ⚑

OpenGPT leverages three core technologies to deliver an exceptional AI development experience:

🔧 Core Technologies

| Technology | What it brings | Why it matters |
| --- | --- | --- |
| 🔗 OpenNext | Seamless Next.js → Cloudflare Workers deployment | Deploy Next.js apps globally with the most affordable edge compute offering |
| 🤖 AI SDK v5 | Universal AI framework with streaming support | Connect to any AI provider with type-safe, streaming APIs |
| ☁️ Cloudflare Workers AI | Global AI inference | Sub-100ms latency worldwide with 50+ open-source models |

🌟 Features

💬 Multi-Modal AI Interface

  • Chat Mode: Conversational AI with 30+ text generation models
  • Image Mode: High-quality image generation with 5+ image models
  • Text-to-Speech (TTS): Voice synthesis with multiple speaker options
  • Speech-to-Text (STT): Real-time audio transcription with visual feedback
  • Seamless Switching: Toggle between modes without losing context

🧠 Advanced Reasoning Support

  • Thinking Process Visualization: See how AI models reason through problems
  • Collapsible Reasoning: Clean UI that shows/hides reasoning on demand
  • Universal Compatibility: Works with any AI model that supports reasoning tokens
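Reasoning support hinges on separating a model's chain of thought from its final answer. A minimal sketch of that split, assuming the model wraps its reasoning in `<think>...</think>` tags (one common convention for reasoning tokens; the actual parser in this repo may handle other formats):

```typescript
// Hypothetical sketch: split a response into reasoning and answer parts.
// The <think>...</think> convention is an assumption, not necessarily
// what every supported model emits.
interface ParsedResponse {
  reasoning: string | null;
  answer: string;
}

function parseReasoning(text: string): ParsedResponse {
  const match = text.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) {
    // No reasoning tokens: the whole response is the answer.
    return { reasoning: null, answer: text.trim() };
  }
  // Strip the reasoning block out of the visible answer.
  const answer = text.replace(match[0], '').trim();
  return { reasoning: match[1].trim(), answer };
}
```

The UI can then render `reasoning` inside the collapsible panel and `answer` in the main chat bubble.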

🎨 Modern User Experience

  • AI Elements UI: Professional, accessible components built with AI Elements
  • Responsive Design: Mobile-first responsive interactions
  • Real-time Streaming: See responses as they're generated

🔧 Developer Experience

  • Type Safety: Full TypeScript support
  • One-Command Deploy: pnpm deploy to Cloudflare Workers globally

🚀 Getting Started

Installation

# Clone the repository
git clone https://github.com/devhims/opengpt.git
cd opengpt

# Install dependencies
pnpm install

# Start development server
pnpm dev

Visit http://localhost:3000 to see OpenGPT in action! 🎉

Environment Setup

  1. Create .dev.vars for local development:
# .dev.vars (not committed to git)
NEXTJS_ENV=development
  2. For production secrets:
wrangler secret put NEXTJS_ENV

πŸ› οΈ Available Scripts

| Command | Description |
| --- | --- |
| `pnpm dev` | Start development server with Turbopack |
| `pnpm build` | Build the Next.js application |
| `pnpm preview` | Preview the Cloudflare Workers build locally |
| `pnpm deploy` | Build and deploy to Cloudflare Workers globally |
| `pnpm lint` | Run ESLint with TypeScript rules |
| `pnpm format` | Format code with Prettier |
| `pnpm cf-typegen` | Generate Cloudflare binding types |

🤖 Supported AI Models

Text Generation (30+ Models)

  • GPT-OSS: OpenAI-compatible 20B and 120B variants
  • Meta Llama: 4 Scout 17B, 3.3 70B, 3.1 family (6 variants), 3.2 family (3 variants), 3.0 family (3 variants)
  • Google Gemma: 3 12B IT, 7B IT, and LoRA variants (4 total)
  • Mistral: Small 3.1 24B, 7B v0.1/v0.2 variants (5 total)
  • Qwen: QWQ 32B, 2.5 Coder 32B, and 1.5 family variants (6 total)
  • DeepSeek: R1 Distill Qwen 32B, Math 7B, Coder variants (4 total)

Image Generation (5+ Models)

  • Black Forest Labs: FLUX-1-Schnell (fast, high-quality text-to-image)
  • Leonardo AI: Lucid Origin and Phoenix 1.0
  • Stability AI: Stable Diffusion XL Base 1.0
  • ByteDance: Stable Diffusion XL Lightning (ultra-fast generation)

Speech & Audio (3+ Models)

  • Text-to-Speech (TTS):
    • Deepgram Aura-1: 12+ expressive voices (Luna, Athena, Zeus, Angus, etc.)
    • MeloTTS: Multi-language support (EN, ES, FR, ZH, JP, KR) with regional accents
  • Speech-to-Text (STT):
    • Deepgram Nova-3: High-accuracy real-time transcription with punctuation

πŸ—οΈ Architecture

OpenGPT showcases a modern, production-ready architecture with comprehensive request handling:

flowchart TD
    User[👤 User] --> UI[🎨 Next.js Frontend]
    UI --> ModeToggle{Mode Selection}

    ModeToggle -->|💬 Chat| ChatPath[Chat Request Path]
    ModeToggle -->|🖼️ Image| ImagePath[Image Request Path]
    ModeToggle -->|🗣️ Speech| SpeechPath[Speech Request Path]

    ChatPath --> ChatAPI[📡 /api/chat]
    ImagePath --> ImageAPI[📡 /api/image]
    SpeechPath --> SpeechAPI["📡 /api/speech-to-text | /api/text-to-speech"]

    ChatAPI --> RateLimit1[🚫 Rate Limiter]
    ImageAPI --> RateLimit2[🚫 Rate Limiter]
    SpeechAPI --> RateLimit3[🚫 Rate Limiter]

    RateLimit1 --> RateCheck1{Rate OK?}
    RateLimit2 --> RateCheck2{Rate OK?}
    RateLimit3 --> RateCheck3{Rate OK?}

    RateCheck1 -->|❌| RateError1[429 Error]
    RateCheck1 -->|✅| ChatProcessing[🤖 Chat Processing]

    RateCheck2 -->|❌| RateError2[429 Error]
    RateCheck2 -->|✅| ImageProcessing[🎨 Image Processing]

    RateCheck3 -->|❌| RateError3[429 Error]
    RateCheck3 -->|✅| SpeechProcessing[🗣️ Speech Processing]

    ChatProcessing --> ModelType{Model Type}
    ModelType -->|Standard| AISDKPath[🔧 AI SDK v5 + workers-ai-provider]
    ModelType -->|GPT-OSS| DirectPath[🎯 Direct env.AI.run]

    ImageProcessing --> ImageAI[🎨 Direct env.AI.run]
    SpeechProcessing --> SpeechAI[🗣️ Direct env.AI.run]

    AISDKPath --> WorkersAI1[☁️ Cloudflare Workers AI]
    DirectPath --> WorkersAI2[☁️ Cloudflare Workers AI]
    ImageAI --> WorkersAI3[☁️ Cloudflare Workers AI]
    SpeechAI --> WorkersAI4[☁️ Cloudflare Workers AI]

    WorkersAI1 --> Streaming[🌊 Real-time Streaming]
    WorkersAI2 --> Batch[📋 Batch Processing + Emulated Stream]
    WorkersAI3 --> ImageResponse[📸 Generated Image]
    WorkersAI4 --> SpeechResponse[🔊 Audio/Text Response]

    Streaming --> ParseReasoning[🧠 Parse Reasoning]
    Batch --> ParseReasoning

    ParseReasoning --> ChatSuccess[✅ Chat Response]
    ImageResponse --> ImageSuccess[✅ Image Response]
    SpeechResponse --> SpeechSuccess[✅ Speech Response]

    RateError1 --> ErrorUI[🎨 Error Display]
    RateError2 --> ErrorUI
    RateError3 --> ErrorUI

    ChatSuccess --> ResponseUI[📥 Response Display]
    ImageSuccess --> ResponseUI
    SpeechSuccess --> ResponseUI

Key Implementation Details

Chat Route Processing:

  • Standard Models: Uses AI SDK v5 with workers-ai-provider wrapper for streaming
  • GPT-OSS Models: Direct env.AI.run call with emulated streaming via createUIMessageStream
  • All models: Connect to the same Cloudflare Workers AI backend
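The emulated-streaming idea for batch-only models can be sketched in isolation. This hypothetical generator re-chunks a complete response so the UI can render it incrementally; the real route wires this through AI SDK v5's createUIMessageStream rather than a bare generator:

```typescript
// Hypothetical sketch: emulate streaming for a model that returns its whole
// completion in one batch, by yielding fixed-size slices of the text.
function* emulateStream(fullText: string, chunkSize = 16): Generator<string> {
  for (let i = 0; i < fullText.length; i += chunkSize) {
    yield fullText.slice(i, i + chunkSize);
  }
}

// Collecting every chunk reproduces the original text exactly.
const chunks = [...emulateStream('Hello from a batch-only model!', 8)];
```

In the route, each yielded chunk would be written to the response stream with a small delay so the client perceives token-by-token output.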

Image Route Processing:

  • All Image Models: Direct env.AI.run call (no AI SDK wrapper needed)
  • Response Handling: Supports both base64 and binary stream responses
  • Format Conversion: Automatic conversion to both base64 and Uint8Array for frontend compatibility
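The dual-format handling above can be sketched as a pair of helpers. This is an illustrative sketch, not the repo's exact code; `atob`/`btoa` are available in both Workers and modern Node:

```typescript
// Hypothetical sketch: convert between the two image encodings the frontend
// may need. A production route would avoid materializing very large images
// in memory like this.
function base64ToBytes(b64: string): Uint8Array {
  const binary = atob(b64); // decode base64 to a byte-per-char string
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

function bytesToBase64(bytes: Uint8Array): string {
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary); // encode the byte string as base64
}
```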

Speech Route Processing:

  • Speech-to-Text: Direct env.AI.run call with @cf/deepgram/nova-3 model
  • Text-to-Speech: Direct env.AI.run call with @cf/deepgram/aura-1 or @cf/myshell-ai/melotts models
  • Audio Processing: WebM/MP4 audio file handling with automatic format detection
  • Voice Options: 12+ Aura-1 speakers, multi-language MeloTTS with regional accents
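Voice selection can be guarded with a simple fallback. A hypothetical sketch using the speaker names mentioned in the feature list (the full Aura-1 roster and the repo's actual validation logic may differ):

```typescript
// Hypothetical sketch: validate a requested TTS speaker, falling back to a
// default. The speaker list here is a partial assumption.
const AURA_SPEAKERS = ['luna', 'athena', 'zeus', 'angus'] as const;

function resolveSpeaker(requested: string | undefined): string {
  const name = requested?.toLowerCase() ?? '';
  return (AURA_SPEAKERS as readonly string[]).includes(name) ? name : 'luna';
}
```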

Rate Limiting:

  • Shared Infrastructure: Both routes use the same checkRateLimit utility
  • Per-endpoint Limits: Separate daily limits for chat (20), image (5), and speech (10) requests
  • Storage: Hybrid Upstash Redis + Cloudflare KV fallback
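The limiter's core logic can be sketched independently of Redis or KV by injecting the counter store. Names and shapes here are assumptions, not the repo's actual checkRateLimit signature:

```typescript
// Hypothetical sketch: fixed-window daily rate limiting with a pluggable
// counter store, mirroring the Redis-with-KV-fallback idea.
interface CounterStore {
  incr(key: string): Promise<number>;
}

async function checkRateLimit(
  store: CounterStore,
  ip: string,
  endpoint: string,
  dailyLimit: number,
): Promise<{ allowed: boolean; remaining: number }> {
  // Key by endpoint, client IP, and UTC day so counts reset daily.
  const day = new Date().toISOString().slice(0, 10);
  const count = await store.incr(`rl:${endpoint}:${ip}:${day}`);
  return { allowed: count <= dailyLimit, remaining: Math.max(0, dailyLimit - count) };
}

// In-memory store standing in for Redis or KV during tests.
function memoryStore(): CounterStore {
  const counts = new Map<string, number>();
  return {
    async incr(key) {
      const next = (counts.get(key) ?? 0) + 1;
      counts.set(key, next);
      return next;
    },
  };
}
```

In production the store would wrap Upstash Redis, falling back to a Cloudflare KV-backed counter when Redis is unavailable.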

Request Processing Flow

  1. Frontend Validation: Client-side input validation and optional rate limit pre-checking
  2. Rate Limiting: IP-based daily limits (20 chat, 5 image, 10 speech requests) with Redis/KV storage
  3. Model Routing: Smart routing between Standard Models (streaming) and GPT-OSS Models (batch)
  4. AI Processing: Direct Cloudflare Workers AI integration with optimized parameters
  5. Response Handling: Reasoning token parsing, format conversion, and UI display
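The model-routing step reduces to a predicate on the model ID. A hypothetical sketch (the exact IDs and matching rule in the repo may differ):

```typescript
// Hypothetical sketch: choose a processing path from the model ID.
// GPT-OSS models take the direct batch path; everything else streams
// through the AI SDK.
type ChatPath = 'ai-sdk-stream' | 'direct-batch';

function routeModel(modelId: string): ChatPath {
  return modelId.includes('gpt-oss') ? 'direct-batch' : 'ai-sdk-stream';
}
```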

🚀 Deployment

# Build and deploy in one command
pnpm deploy

# Or step by step
pnpm build
npx wrangler deploy
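Deployment assumes a Workers AI binding is declared in the Wrangler config so `env.AI` exists at runtime. A minimal fragment might look like the following (the entry point, compatibility date, and other values are assumptions; check the repo's actual config):

```toml
# wrangler.toml (illustrative fragment, values assumed)
name = "opengpt"
main = ".open-next/worker.js"
compatibility_date = "2024-09-23"

[ai]
binding = "AI"
```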

Environment Variables

| Variable | Description |
| --- | --- |
| `UPSTASH_REDIS_REST_URL` | Upstash Redis URL (optional) |
| `UPSTASH_REDIS_REST_TOKEN` | Upstash Redis token (optional) |

Adding New Models

  1. Add model to constants:
// src/constants/index.ts
export const CLOUDFLARE_AI_MODELS = {
  textGeneration: [
    // Add your new model here
    '@cf/vendor/new-model',
    // ... existing models
  ] as const,
  imageGeneration: [
    // For image models
  ] as const,
  speech: [
    // For speech-to-text models
  ] as const,
  textToSpeech: [
    // For text-to-speech models
  ] as const,
};
  2. Update utility functions:
// src/constants/index.ts
export function getTextGenerationModels(): readonly string[] {
  return CLOUDFLARE_AI_MODELS.textGeneration;
}

export function getSpeechModels(): readonly string[] {
  return CLOUDFLARE_AI_MODELS.speech;
}

export function getTextToSpeechModels(): readonly string[] {
  return CLOUDFLARE_AI_MODELS.textToSpeech;
}
  3. Test the integration:
pnpm dev
# Test the new model in the UI

🤝 Contributing

We welcome contributions!

Quick Start for Contributors

# Fork the repo and clone your fork
git clone https://github.com/<your-username>/opengpt.git

# Create a feature branch
git checkout -b feature/new-feature

# Make your changes and test
pnpm dev

# Run linting and formatting
pnpm lint
pnpm format

# Commit using conventional commits
git commit -m "feat: add new feature"

# Push and create a PR
git push origin feature/new-feature

Code Style

  • TypeScript: Strict mode enabled
  • Formatting: Prettier with Tailwind class sorting
  • Linting: ESLint with Next.js rules

📄 License

This project is licensed under the MIT License.

Made with ❤️ for the AI community

⭐ Star this repo if you find it useful!
