Experiment with open-source AI models
A modern AI playground that combines the development experience of Next.js with the performance of the Cloudflare Workers platform. Experiment with 50+ open-source AI models, including GPT-OSS, Leonardo, Llama, Qwen, Gemma, DeepSeek, and more. Features text-to-speech with multiple voices and real-time speech-to-text transcription.
OpenGPT leverages three core technologies to deliver an exceptional AI development experience:
| Technology | What it brings | Why it matters |
|---|---|---|
| OpenNext | Seamless Next.js → Cloudflare Workers deployment | Deploy Next.js apps globally on affordable edge compute |
| AI SDK v5 | Universal AI framework with streaming support | Connect to any AI provider with type-safe, streaming APIs |
| Cloudflare Workers AI | Global AI inference | Sub-100ms latency worldwide with 50+ open-source models |
- Chat Mode: Conversational AI with 30+ text generation models
- Image Mode: High-quality image generation with 5+ image models
- Text-to-Speech (TTS): Voice synthesis with multiple speaker options
- Speech-to-Text (STT): Real-time audio transcription with visual feedback
- Seamless Switching: Toggle between modes without losing context
- Thinking Process Visualization: See how AI models reason through problems
- Collapsible Reasoning: Clean UI that shows/hides reasoning on demand
- Universal Compatibility: Works with any AI model that supports reasoning tokens
- AI Elements UI: Professional, accessible components built with AI Elements
- Responsive Design: Mobile-first responsive interactions
- Real-time Streaming: See responses as they're generated
- Type Safety: Full TypeScript support
- One-Command Deploy: `pnpm deploy` to Cloudflare Workers globally
```bash
# Clone the repository
git clone https://github.com/devhims/opengpt.git
cd opengpt

# Install dependencies
pnpm install

# Start development server
pnpm dev
```

Visit http://localhost:3000 to see OpenGPT in action!
- Create `.dev.vars` for local development:

```bash
# .dev.vars (not committed to git)
NEXTJS_ENV=development
```

- For production secrets:

```bash
wrangler secret put NEXTJS_ENV
```
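Inside a route handler, these values are available through the OpenNext Cloudflare context. A minimal sketch, assuming a hypothetical health-check route (the path and response shape are illustrative, not part of the project):

```typescript
// Sketch: reading environment values inside a route handler.
import { getCloudflareContext } from '@opennextjs/cloudflare';

export async function GET() {
  const { env } = getCloudflareContext();
  // NEXTJS_ENV comes from .dev.vars locally or `wrangler secret put` in prod.
  return Response.json({ environment: env.NEXTJS_ENV ?? 'production' });
}
```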
| Command | Description |
|---|---|
| `pnpm dev` | Start the development server with Turbopack |
| `pnpm build` | Build the Next.js application |
| `pnpm preview` | Preview the Cloudflare Workers build locally |
| `pnpm deploy` | Build and deploy to Cloudflare Workers globally |
| `pnpm lint` | Run ESLint with TypeScript rules |
| `pnpm format` | Format code with Prettier |
| `pnpm cf-typegen` | Generate Cloudflare binding types |
- GPT-OSS: OpenAI-compatible 20B and 120B variants
- Meta Llama: 4 Scout 17B, 3.3 70B, 3.1 family (6 variants), 3.2 family (3 variants), 3.0 family (3 variants)
- Google Gemma: 3 12B IT, 7B IT, and LoRA variants (4 total)
- Mistral: Small 3.1 24B, 7B v0.1/v0.2 variants (5 total)
- Qwen: QWQ 32B, 2.5 Coder 32B, and 1.5 family variants (6 total)
- DeepSeek: R1 Distill Qwen 32B, Math 7B, Coder variants (4 total)
- Black Forest Labs: FLUX-1-Schnell (fast, high-quality text-to-image)
- Leonardo AI: Lucid Origin and Phoenix 1.0
- Stability AI: Stable Diffusion XL Base 1.0
- ByteDance: Stable Diffusion XL Lightning (ultra-fast generation)
- Text-to-Speech (TTS):
  - Deepgram Aura-1: 12+ expressive voices (Luna, Athena, Zeus, Angus, etc.)
  - MeloTTS: Multi-language support (EN, ES, FR, ZH, JP, KR) with regional accents
- Speech-to-Text (STT):
  - Deepgram Nova-3: High-accuracy real-time transcription with punctuation
OpenGPT showcases a modern, production-ready architecture with comprehensive request handling:
```mermaid
flowchart TD
    User[User] --> UI[Next.js Frontend]
    UI --> ModeToggle{Mode Selection}
    ModeToggle -->|Chat| ChatPath[Chat Request Path]
    ModeToggle -->|Image| ImagePath[Image Request Path]
    ModeToggle -->|Speech| SpeechPath[Speech Request Path]

    ChatPath --> ChatAPI["/api/chat"]
    ImagePath --> ImageAPI["/api/image"]
    SpeechPath --> SpeechAPI["/api/speech-to-text | /api/text-to-speech"]

    ChatAPI --> RateLimit1[Rate Limiter]
    ImageAPI --> RateLimit2[Rate Limiter]
    SpeechAPI --> RateLimit3[Rate Limiter]

    RateLimit1 --> RateCheck1{Rate OK?}
    RateLimit2 --> RateCheck2{Rate OK?}
    RateLimit3 --> RateCheck3{Rate OK?}

    RateCheck1 -->|No| RateError1[429 Error]
    RateCheck1 -->|Yes| ChatProcessing[Chat Processing]
    RateCheck2 -->|No| RateError2[429 Error]
    RateCheck2 -->|Yes| ImageProcessing[Image Processing]
    RateCheck3 -->|No| RateError3[429 Error]
    RateCheck3 -->|Yes| SpeechProcessing[Speech Processing]

    ChatProcessing --> ModelType{Model Type}
    ModelType -->|Standard| AISDKPath[AI SDK v5 + workers-ai-provider]
    ModelType -->|GPT-OSS| DirectPath[Direct env.AI.run]

    ImageProcessing --> ImageAI[Direct env.AI.run]
    SpeechProcessing --> SpeechAI[Direct env.AI.run]

    AISDKPath --> WorkersAI1[Cloudflare Workers AI]
    DirectPath --> WorkersAI2[Cloudflare Workers AI]
    ImageAI --> WorkersAI3[Cloudflare Workers AI]
    SpeechAI --> WorkersAI4[Cloudflare Workers AI]

    WorkersAI1 --> Streaming[Real-time Streaming]
    WorkersAI2 --> Batch[Batch Processing + Emulated Stream]
    WorkersAI3 --> ImageResponse[Generated Image]
    WorkersAI4 --> SpeechResponse[Audio/Text Response]

    Streaming --> ParseReasoning[Parse Reasoning]
    Batch --> ParseReasoning
    ParseReasoning --> ChatSuccess[Chat Response]
    ImageResponse --> ImageSuccess[Image Response]
    SpeechResponse --> SpeechSuccess[Speech Response]

    RateError1 --> ErrorUI[Error Display]
    RateError2 --> ErrorUI
    RateError3 --> ErrorUI
    ChatSuccess --> ResponseUI[Response Display]
    ImageSuccess --> ResponseUI
    SpeechSuccess --> ResponseUI
```
Chat Route Processing:

- Standard Models: AI SDK v5 with the `workers-ai-provider` wrapper for streaming (see the sketch below)
- GPT-OSS Models: Direct `env.AI.run` call with emulated streaming via `createUIMessageStream`
- All Models: Connect to the same Cloudflare Workers AI backend
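For orientation, a minimal sketch of the standard-model path, assuming AI SDK v5 and `workers-ai-provider` (the surrounding rate limiting and reasoning parsing are omitted, and the request body shape is an assumption):

```typescript
// src/app/api/chat/route.ts (simplified sketch, not the full handler)
import { createWorkersAI } from 'workers-ai-provider';
import { streamText, convertToModelMessages, type UIMessage } from 'ai';
import { getCloudflareContext } from '@opennextjs/cloudflare';

export async function POST(req: Request) {
  const { env } = getCloudflareContext();
  const { messages, model } = (await req.json()) as {
    messages: UIMessage[];
    model: string;
  };

  // Wrap the Workers AI binding so AI SDK v5 can stream from it.
  const workersai = createWorkersAI({ binding: env.AI });

  const result = streamText({
    model: workersai(model), // e.g. '@cf/meta/llama-3.1-8b-instruct'
    messages: convertToModelMessages(messages),
  });

  // Streams text (and reasoning) chunks back to the UI as they arrive.
  return result.toUIMessageStreamResponse();
}
```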
Image Route Processing:

- All Image Models: Direct `env.AI.run` call, no AI SDK wrapper needed (a sketch follows below)
- Response Handling: Supports both base64 and binary stream responses
- Format Conversion: Automatic conversion to both base64 and Uint8Array for frontend compatibility
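A hedged sketch of that dual handling, assuming the response contract described in the bullets above (the route body and casts are illustrative):

```typescript
// src/app/api/image/route.ts (simplified sketch)
import { getCloudflareContext } from '@opennextjs/cloudflare';

// Base64-encode raw bytes so the image can travel inside JSON.
function toBase64(bytes: Uint8Array): string {
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

export async function POST(req: Request) {
  const { env } = getCloudflareContext();
  const { prompt, model } = (await req.json()) as {
    prompt: string;
    model: string;
  };

  // Models differ: some return { image: <base64> }, others stream raw
  // PNG bytes. Cast because the model id is chosen at runtime.
  const result = await env.AI.run(model as any, { prompt });

  let base64: string;
  if (result instanceof ReadableStream) {
    const bytes = new Uint8Array(await new Response(result).arrayBuffer());
    base64 = toBase64(bytes);
  } else {
    base64 = (result as { image: string }).image;
  }

  return Response.json({ image: base64 });
}
```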
Speech Route Processing:

- Speech-to-Text: Direct `env.AI.run` call with the `@cf/deepgram/nova-3` model
- Text-to-Speech: Direct `env.AI.run` call with the `@cf/deepgram/aura-1` or `@cf/myshell-ai/melotts` models (see the sketch below)
- Audio Processing: WebM/MP4 audio file handling with automatic format detection
- Voice Options: 12+ Aura-1 speakers, multi-language MeloTTS with regional accents
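For flavor, a text-to-speech call might look like the sketch below; the `text`/`speaker` fields and the response content type are assumptions, so check the Workers AI model page for the exact schema:

```typescript
// src/app/api/text-to-speech/route.ts (simplified sketch)
import { getCloudflareContext } from '@opennextjs/cloudflare';

export async function POST(req: Request) {
  const { env } = getCloudflareContext();
  const { text, speaker } = (await req.json()) as {
    text: string;
    speaker?: string;
  };

  // Aura-1 synthesizes speech from text; MeloTTS would swap in here for
  // the multi-language voices.
  const audio = await env.AI.run('@cf/deepgram/aura-1' as any, {
    text,
    speaker: speaker ?? 'luna', // one of the 12+ Aura-1 voices
  });

  // The model returns raw audio; hand it to the browser's <audio> element.
  return new Response(audio as ReadableStream, {
    headers: { 'Content-Type': 'audio/mpeg' },
  });
}
```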
Rate Limiting:

- Shared Infrastructure: All three routes use the same `checkRateLimit` utility (a hypothetical version is sketched below)
- Per-endpoint Limits: Separate daily limits for chat (20), image (5), and speech (10) requests
- Storage: Hybrid Upstash Redis with Cloudflare KV fallback
- Frontend Validation: Client-side input validation and optional rate-limit pre-checking
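The utility itself isn't shown in this README; a hypothetical version with the described Redis-first, KV-fallback behavior could look like this (names, key format, and the `KV` binding are assumptions, not the project's actual code):

```typescript
// Hypothetical sketch of checkRateLimit; limits mirror the bullets above.
import { Redis } from '@upstash/redis/cloudflare';

interface Env {
  UPSTASH_REDIS_REST_URL?: string;
  UPSTASH_REDIS_REST_TOKEN?: string;
  KV: KVNamespace; // assumed fallback binding
}

const DAILY_LIMITS = { chat: 20, image: 5, speech: 10 } as const;

export async function checkRateLimit(
  env: Env,
  endpoint: keyof typeof DAILY_LIMITS,
  ip: string,
): Promise<{ allowed: boolean; remaining: number }> {
  const day = new Date().toISOString().slice(0, 10);
  const key = `rl:${endpoint}:${ip}:${day}`;
  const limit = DAILY_LIMITS[endpoint];

  if (env.UPSTASH_REDIS_REST_URL && env.UPSTASH_REDIS_REST_TOKEN) {
    // Preferred store: atomic counter in Upstash Redis, expiring after 24h.
    const redis = new Redis({
      url: env.UPSTASH_REDIS_REST_URL,
      token: env.UPSTASH_REDIS_REST_TOKEN,
    });
    const count = await redis.incr(key);
    if (count === 1) await redis.expire(key, 86_400);
    return { allowed: count <= limit, remaining: Math.max(0, limit - count) };
  }

  // Fallback: Cloudflare KV read-modify-write (coarse, but adequate for
  // daily IP limits when Redis isn't configured).
  const count = Number((await env.KV.get(key)) ?? '0') + 1;
  await env.KV.put(key, String(count), { expirationTtl: 86_400 });
  return { allowed: count <= limit, remaining: Math.max(0, limit - count) };
}
```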
- Rate Limiting: IP-based daily limits (20 chat, 5 image, 10 speech requests) with Redis/KV storage
- Model Routing: Smart routing between Standard Models (streaming) and GPT-OSS Models (batch)
- AI Processing: Direct Cloudflare Workers AI integration with optimized parameters
- Response Handling: Reasoning token parsing (illustrated below), format conversion, and UI display
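As a rough illustration of the reasoning-parsing step, a minimal extractor for `<think>`-style reasoning tokens might look like this; the tag name and helper are assumptions, and the real parser lives in the chat route:

```typescript
// Hypothetical sketch: many open models emit their chain of thought
// inside <think>...</think> tags before the visible answer.
function parseReasoning(raw: string): {
  reasoning: string | null;
  text: string;
} {
  const match = raw.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) return { reasoning: null, text: raw };
  return {
    reasoning: match[1].trim(),
    text: raw.replace(match[0], '').trim(),
  };
}

// The UI renders `reasoning` in the collapsible panel and `text` as the
// chat response.
const { reasoning, text } = parseReasoning(
  '<think>The user wants a haiku, so count syllables.</think>Sure! ...',
);
```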
```bash
# Build and deploy in one command
pnpm deploy

# Or step by step
pnpm build
npx wrangler deploy
```
| Variable | Description |
|---|---|
| `UPSTASH_REDIS_REST_URL` | Upstash Redis URL (optional) |
| `UPSTASH_REDIS_REST_TOKEN` | Upstash Redis token (optional) |
- Add the model to the constants:

```typescript
// src/constants/index.ts
export const CLOUDFLARE_AI_MODELS = {
  textGeneration: [
    // Add your new model here
    '@cf/vendor/new-model',
    // ... existing models
  ] as const,
  imageGeneration: [
    // For image models
  ] as const,
  speech: [
    // For speech-to-text models
  ] as const,
  textToSpeech: [
    // For text-to-speech models
  ] as const,
};
```
- Update the utility functions:

```typescript
// src/constants/index.ts
export function getTextGenerationModels(): readonly string[] {
  return CLOUDFLARE_AI_MODELS.textGeneration;
}

export function getSpeechModels(): readonly string[] {
  return CLOUDFLARE_AI_MODELS.speech;
}

export function getTextToSpeechModels(): readonly string[] {
  return CLOUDFLARE_AI_MODELS.textToSpeech;
}
```
- Test the integration:

```bash
pnpm dev
# Test the new model in the UI
```
We welcome contributions!
```bash
# Fork the repo and clone your fork
git clone https://github.com/devhims/opengpt.git

# Create a feature branch
git checkout -b feature/new-feature

# Make your changes and test
pnpm dev

# Run linting and formatting
pnpm lint
pnpm format

# Commit using conventional commits
git commit -m "feat: add new feature"

# Push and create a PR
git push origin feature/new-feature
```
- TypeScript: Strict mode enabled
- Formatting: Prettier with Tailwind class sorting
- Linting: ESLint with Next.js rules
This project is licensed under the MIT License.
Made with ❤️ for the AI community
⭐ Star this repo if you find it useful!