---
title: AI Lecture Forge
emoji: 📚
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.0.0
app_file: src/app.py
pinned: false
---
# AI Lecture Forge

🏆 Production-ready AI system for transforming conversational transcripts into structured teaching material.

AI Lecture Forge uses multiple AI models to turn raw transcripts into professional teaching materials. It supports both local inference and API-based models, with optional text-to-speech.
## Features

### Supported models

- Local models (no API key required):
  - Microsoft Phi-4 (default)
  - NovaSky Sky-T1-32B
  - DeepSeek V3
- OpenAI models (API key required):
  - Default: gpt-4o-mini
  - Supports all OpenAI models
- Google models (API key required):
  - Default: gemini-2.0-flash-exp
  - Supports all Gemini models

### Text-to-speech

- Kokoro-82M voice synthesis
- High-quality audio generation
- Multiple voice options

### Lecture generation

- PDF transcript processing
- Customizable lecture duration (30-60 minutes)
- Practical examples integration
- Structured output with sections
- Real-time processing
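The structured, sectioned output described above can be modeled roughly as follows. This is a minimal sketch: the `Lecture` and `LectureSection` names are hypothetical illustrations, not the project's actual classes.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LectureSection:
    """One section of a generated lecture (hypothetical shape)."""
    title: str
    content: str
    examples: List[str] = field(default_factory=list)  # integrated practical examples

@dataclass
class Lecture:
    """A full lecture assembled from a processed transcript."""
    title: str
    duration_minutes: int  # customizable, e.g. 30-60
    sections: List[LectureSection] = field(default_factory=list)

    def outline(self) -> List[str]:
        """Return the section titles in order."""
        return [s.title for s in self.sections]

lecture = Lecture(title="Intro to Transformers", duration_minutes=45)
lecture.sections.append(
    LectureSection("Attention", "Why attention matters...", ["worked toy example"])
)
print(lecture.outline())  # → ['Attention']
```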
## Tech Stack

- Python 3.8+
- PyTorch + Transformers: local inference with multiple models
  - Microsoft Phi-4 (default)
  - NovaSky Sky-T1-32B
  - DeepSeek V3
- Kokoro-82M: text-to-speech voice generation
- Gradio 4.0.0: web interface framework
- OpenAI API (optional):
  - Default: gpt-4o-mini
  - Supports all OpenAI models
- Google Gemini API (optional):
  - Default: gemini-2.0-flash-exp
  - Supports all Gemini models
- PyPDF2: PDF processing
- python-dotenv: environment management
- tiktoken: token counting for OpenAI models
- tqdm: progress bars
- numpy: numerical operations
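python-dotenv loads `KEY=value` pairs from a `.env` file into the environment. As a rough illustration of what that entails (a stdlib-only sketch, not the library's actual implementation), it behaves approximately like:

```python
import os

def load_env(path: str = ".env") -> dict:
    """Parse KEY=value lines from a .env-style file into os.environ.

    Skips blank lines and comments, strips surrounding quotes, and does not
    overwrite variables that are already set (mirroring python-dotenv's
    default non-override behavior).
    """
    loaded = {}
    if not os.path.exists(path):
        return loaded
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            value = value.strip().strip('"').strip("'")
            loaded[key] = value
            os.environ.setdefault(key, value)
    return loaded
```

In the real project you would simply call `dotenv.load_dotenv()`; the sketch is only meant to show what ends up in the environment.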
## Installation

```bash
# Clone repository
git clone [repository-url]
cd ai-lecture-forge

# Install dependencies
pip install -r requirements.txt

# Optional: create a .env file if using API models
touch .env
echo "OPENAI_API_KEY=your_key" >> .env
echo "GOOGLE_API_KEY=your_key" >> .env

# Run application
python src/app.py
```
## Deploying to Hugging Face Spaces

1. Create a new Space
2. Select SDK: Gradio
3. Hardware: T4 GPU (recommended)
4. Connect the repository
5. Optional: add API keys in Settings
   - OPENAI_API_KEY (if using OpenAI models)
   - GOOGLE_API_KEY (if using Gemini models)
## API Keys

There are two ways to provide API keys:

### 1. Through the interface

When an API model is selected, the interface shows secure input fields for:

- OpenAI API key (when using OpenAI models)
- Gemini API key (when using Gemini models)

This is the recommended method for:

- Public deployments
- Hugging Face Spaces
- Shared instances

### 2. Through a `.env` file

For local development or private deployments:

```
# Optional: create a .env file
OPENAI_API_KEY=your_openai_key
GOOGLE_API_KEY=your_gemini_key
```

Note: if both methods are used, keys provided through the interface take precedence.
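The precedence rule boils down to "interface key if present, otherwise environment." A tiny sketch of that logic (`resolve_api_key` is a hypothetical helper for illustration, not a function from this codebase):

```python
import os
from typing import Optional

def resolve_api_key(interface_key: Optional[str], env_var: str) -> Optional[str]:
    """Return the key typed into the interface if non-empty, else the env value.

    Hypothetical helper showing the documented precedence: UI input wins
    over whatever the .env file loaded into the environment.
    """
    if interface_key:  # non-empty string from the secure input field
        return interface_key
    return os.environ.get(env_var)  # fall back to .env / environment
```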
The `.env` file is optional and only required when using API-based models:

```
# Required for OpenAI models
OPENAI_API_KEY=your_openai_key

# Required for Gemini models
GOOGLE_API_KEY=your_gemini_key
```

Note: local models (Phi-4, Sky-T1-32B, DeepSeek V3) work without any API keys.
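That split between local and API-based models can be expressed as a tiny dispatch check. This is a hypothetical sketch; the identifiers and the project's real model registry may differ:

```python
# Local models run on-device and need no credentials (illustrative IDs).
LOCAL_MODELS = {"phi-4", "sky-t1-32b", "deepseek-v3"}

def needs_api_key(model_name: str) -> bool:
    """Return True if the chosen model requires an API key (i.e. not local)."""
    return model_name.lower() not in LOCAL_MODELS

print(needs_api_key("gpt-4o-mini"))  # → True
print(needs_api_key("Phi-4"))        # → False
```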
## Choosing a Mode

### Local models

- Best for: development, testing, offline use
- No API costs
- GPU recommended for better performance

### API models

- Best for: production, high-quality output
- Requires API keys
- Pay-per-use pricing

### Text-to-speech

- Optional feature
- Adds voice synthesis capability
- Uses the Kokoro-82M model

### Hardware

- GPU recommended for local models
- T4 GPU or better for optimal performance
- CPU-only mode available but slower
- API models are not affected by local hardware
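The hardware notes above reduce to a simple device-selection rule: API models run remotely regardless of local hardware, while local models use a GPU when available and otherwise fall back to CPU. Sketched as a pure function (hypothetical; in practice the availability flag would come from a check like `torch.cuda.is_available()`):

```python
def pick_device(model_is_local: bool, cuda_available: bool) -> str:
    """Choose where inference runs for a given model.

    Hypothetical helper: API-based models are hardware-independent, so
    they are marked "remote"; local models prefer CUDA, else CPU.
    """
    if not model_is_local:
        return "remote"  # API models: local hardware does not matter
    return "cuda" if cuda_available else "cpu"

print(pick_device(True, True))    # → cuda
print(pick_device(True, False))   # → cpu
print(pick_device(False, False))  # → remote
```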
## Contributing

See CONTRIBUTING.md for guidelines.

## License

MIT License - see LICENSE for details.
## Production Notes

This project is designed for production environments and competitive scenarios:

- Production-ready error handling
- Scalable architecture
- Multiple model fallbacks
- Comprehensive logging
- Performance optimization