Data Llama is a simple, autonomous agent system that researches a business concept online, synthesizes the findings from multiple sources, and produces a single, well-cited answer for the student.
It combines web research, AI summarization, and a chat-style UI to provide reliable, source-backed insights.
- Automated Research via Serper API + fallback search.
- LLM-based Synthesis with OpenRouter (Gemini, Llama, Grok, DeepSeek, etc.).
- Citations & Sources auto-formatted into superscripts.
- Interactive UI with chat history, model selection, markdown & math support.
- Robust Error Handling with retries, exponential backoff, and fallbacks.
flowchart TD
User[User Query] -->|1. Ask| Frontend[Frontend UI - HTML JS Chat]
Frontend -->|2. POST /ask| Backend[FastAPI Backend]
Backend -->|3. Research| Researcher[Researcher Agent - Serper API and Fallbacks]
Researcher -->|4. Extract Sources| Sources[(Relevant Sources)]
Backend -->|5. Synthesize| Synthesizer[Synthesizer Agent - OpenRouter LLMs]
Synthesizer -->|6. Format| Utils[Utils - Chunking and Citations]
Utils -->|7. Final Answer and Superscripts| Backend
Backend -->|8. Response| Frontend
Frontend -->|9. Display| User
Flow:
- User submits a query.
- Frontend sends a POST request to /ask.
- Researcher finds & extracts high-quality sources.
- Synthesizer calls chosen LLM via OpenRouter.
- Utils handle chunking, citations, and formatting.
- Answer is returned with sources → displayed in chat.
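The flow above can be read as three steps chained by the backend. The sketch below is illustrative only: `Source`, `answer_query`, and the three callables are hypothetical names, not the project's actual modules, and the real `/ask` handler may be organized differently.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Source:
    """One extracted source (hypothetical shape)."""
    url: str
    text: str


def answer_query(
    query: str,
    research: Callable[[str], List[Source]],
    synthesize: Callable[[str, List[Source]], str],
    format_answer: Callable[[str, List[Source]], str],
) -> dict:
    """Run research -> synthesize -> format for a single query."""
    sources = research(query)                # find and extract sources
    draft = synthesize(query, sources)       # LLM call via the chosen model
    answer = format_answer(draft, sources)   # citations and formatting
    # The response carries both the answer and the source URLs for the chat UI.
    return {"answer": answer, "sources": [s.url for s in sources]}
```

Passing the three steps in as callables keeps the sketch testable without network access; stubs can stand in for the Researcher, Synthesizer, and Utils.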
- Serper API → Primary search + content extraction.
- Fallback to OpenRouter → LLM generates reputable URLs if Serper fails.
- Content Extraction → Using Serper Extract API or Newspaper3k.
- Reliability Filters → Prefers accessible, non-paywalled, reputable sites.
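One way to picture the primary/fallback search with a reliability filter is shown below. This is a sketch under assumptions: the providers are plain callables standing in for the Serper search and the LLM-based fallback, and `PAYWALLED_DOMAINS` is an illustrative blocklist, not the project's real filter rules.

```python
from typing import Callable, List, Sequence
from urllib.parse import urlparse

# Illustrative blocklist only; the actual filter criteria are not listed in this README.
PAYWALLED_DOMAINS = {"wsj.com", "ft.com"}


def is_acceptable(url: str) -> bool:
    """Accept http(s) URLs whose host is not on the blocklist."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname.lower()
    if host.startswith("www."):
        host = host[4:]
    return not any(host == d or host.endswith("." + d) for d in PAYWALLED_DOMAINS)


def find_sources(
    query: str,
    providers: Sequence[Callable[[str], List[str]]],
    limit: int = 5,
) -> List[str]:
    """Try each provider in order; return the first non-empty filtered result."""
    for provider in providers:
        try:
            urls = [u for u in provider(query) if is_acceptable(u)]
        except Exception:
            continue  # provider failed, move on to the next one
        if urls:
            return urls[:limit]
    return []
```

Listing the primary search first and the fallback second reproduces the order described above: the fallback is consulted only when the primary raises or yields nothing usable.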
Supported through OpenRouter API:
- Google Gemini 2.0 Flash (default)
- Meta Llama 3.3 70B
- xAI Grok 4 Fast
- Microsoft MAI-DS R1
- DeepSeek Chat v3.0 / v3.1
- OpenAI GPT OSS 20B
- Mistral 7B & 24B
- Google Gemma 27B
Why multiple models?
- Flexibility: speed vs reasoning trade-offs.
- Fallbacks: avoids downtime from rate limits.
- Reliability: OpenRouter provides consistent API handling.
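The fallback idea can be sketched as a loop over models with retries and exponential backoff per model. Everything here is hypothetical: `ModelUnavailable`, `complete_with_fallback`, and the `call_model` callable are illustration names, and the model strings are placeholders rather than real OpenRouter identifiers.

```python
import time
from typing import Callable, Optional, Sequence


class ModelUnavailable(Exception):
    """Raised by call_model on rate limits or outages (hypothetical)."""


def complete_with_fallback(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each model in order, retrying with exponential backoff."""
    last_error: Optional[Exception] = None
    for model in models:
        for attempt in range(retries):
            try:
                return call_model(model, prompt)
            except ModelUnavailable as exc:
                last_error = exc
                # Back off before retrying the same model: 1s, 2s, 4s, ...
                # No sleep after the final attempt; move to the next model.
                if attempt < retries - 1:
                    sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all models failed") from last_error
```

The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting.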
- Chunk content into smaller sections.
- Construct prompt with context, question, and sources.
- Call OpenRouter API with retries + backoff.
- Generate citations and insert superscripts.
- Fallback → If synthesis fails, return sources with summaries.
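The chunking, prompt construction, and superscript steps might look roughly like the following. This is a sketch under assumptions: it splits by word count, assumes the LLM cites sources with bracketed markers such as `[1]`, and renders them as Unicode superscript digits. The project's actual chunk size, prompt wording, and citation markup are not specified here, and the function names are hypothetical.

```python
import re
from typing import List, Tuple

# Maps ASCII digits to their Unicode superscript forms.
_SUPERSCRIPTS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")


def chunk_text(text: str, max_words: int = 200) -> List[str]:
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_prompt(question: str, sources: List[Tuple[str, str]]) -> str:
    """Number each (url, text) source so the model can cite it as [n]."""
    lines = ["Answer the question using only the numbered sources.", ""]
    for i, (url, text) in enumerate(sources, start=1):
        lines.append(f"[{i}] {url}")
        lines.append(text)
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


def add_superscripts(answer: str) -> str:
    """Replace bracketed markers like [1] or [12] with superscript digits."""
    return re.sub(r"\[(\d+)\]", lambda m: m.group(1).translate(_SUPERSCRIPTS), answer)
```

For example, `add_superscripts("Margins rose [2].")` returns `"Margins rose ²."`.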
git clone https://github.com/anonymousknight07/Data-Lama-.git
cd Data-Lama-
Create a .env file:
OPENROUTER_API_KEY=your_openrouter_api_key
SERPER_API_KEY=your_serper_api_key
HOST=127.0.0.1
PORT=8000
Install dependencies:
pip install -r requirements.txt
Run the app:
bash run.sh
Or manually:
uvicorn app.main:app --reload
Go to:
http://127.0.0.1:8000
See requirements.txt
Includes:
fastapi, uvicorn, requests, newspaper3k, jinja2, beautifulsoup4, etc.
Test server status:
curl http://127.0.0.1:8000/health
MIT License – free to use and modify.
