Your Repository's Health Specialist
Multi-agent AI platform that analyzes GitHub repositories and provides actionable recommendations to improve documentation quality, metadata, and discoverability.
DrRepo uses 5 specialized AI agents powered by LangGraph to comprehensively analyze your GitHub repository and provide professional recommendations:
- Repository Analysis - Metadata, structure, and organization
- Metadata Optimization - Discoverability and SEO
- Content Enhancement - README quality and completeness
- Quality Assessment - Professional standards review
- Fact Checking - Claims verification with RAG
- Multi-Agent System: 5 specialized AI agents working together
- LangGraph Orchestration: Sophisticated workflow management
- RAG-Enhanced: FAISS vector search for fact-checking
- Free LLM: Uses Groq (llama-3.3-70b) - no OpenAI needed
- Quality Scoring: 0-100 score with detailed breakdown
- Priority Actions: Ranked recommendations by impact
- Web Interface: Beautiful Streamlit UI
- CLI Support: Command-line interface for automation
- JSON Export: Complete analysis reports
- Python 3.9 or higher
- pip package manager
- Git
```bash
# Clone the repository
git clone https://github.com/ak-rahul/DrRepo.git
cd DrRepo

# Create a virtual environment
python -m venv venv

# Activate it (Windows)
venv\Scripts\activate
# Activate it (Linux/macOS)
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```
Create a `.env` file in the project root:

```env
# Groq API (free - get from https://console.groq.com)
GROQ_API_KEY=your_groq_api_key_here

# GitHub token (get from https://github.com/settings/tokens)
GH_TOKEN=your_github_token_here

# Tavily Search API (get from https://app.tavily.com)
TAVILY_API_KEY=your_tavily_api_key_here
```
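Before launching the app, a quick sanity check can confirm the keys are actually visible to Python. A minimal standard-library sketch (it assumes the variables are exported into the environment, e.g. by your shell or a dotenv loader; the key names match the `.env` file above):

```python
import os

# The three keys DrRepo's configuration expects (per the .env example above)
REQUIRED_KEYS = ["GROQ_API_KEY", "GH_TOKEN", "TAVILY_API_KEY"]

def missing_keys(env=None):
    """Return the names of any required keys that are unset or empty."""
    if env is None:
        env = os.environ
    return [key for key in REQUIRED_KEYS if not env.get(key)]

if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
    print("All required API keys are set.")
```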
```bash
streamlit run app.py
```

Open your browser at http://localhost:8501
```bash
python -m src.main https://github.com/psf/requests "Python HTTP library"
```
- Start the Streamlit app: `streamlit run app.py`
- Enter a GitHub repository URL
- (Optional) Add a description for better context
- Click "Analyze Repository"
- View comprehensive results and download JSON report
```python
from src.main import PublicationAssistant

# Initialize
assistant = PublicationAssistant()

# Analyze a repository
result = assistant.analyze(
    repo_url="https://github.com/fastapi/fastapi",
    description="Modern Python web framework"
)

# Access results
print(f"Quality Score: {result['repository']['current_score']:.1f}/100")

# Print the top three recommendations
for item in result['action_items'][:3]:
    print(item)
```
```bash
# Basic usage
python -m src.main <repo_url>

# With a description
python -m src.main <repo_url> "<description>"

# Example
python -m src.main https://github.com/django/django "Python web framework"
```
DrRepo includes built-in health monitoring for production deployments.
The Streamlit interface includes a System Health panel in the sidebar that shows:
- LLM API status (Groq/OpenAI)
- GitHub API status with rate-limit info
- Tavily Search API status
- RAG retriever (FAISS) status
- Response latency for each component
Click the Refresh Health Status button to update.
For production monitoring, DrRepo provides a FastAPI health check endpoint:
Start the health API server:

```bash
python scripts/run_health_api.py
```
Health Check Endpoints:
| Endpoint | Purpose | Response Time |
|---|---|---|
| `GET /health` | Comprehensive health check with all component details | ~2-5s |
| `GET /health/simple` | Quick health status (no component checks) | <100ms |
| `GET /health/components` | Individual component status only | ~2-5s |
| `GET /health/ready` | Kubernetes readiness probe | <100ms |
| `GET /health/live` | Kubernetes liveness probe | <50ms |
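The readiness and liveness endpoints map directly onto Kubernetes probes. A sketch of the container-spec fragment (the port, timings, and field values here are assumptions to adapt to your deployment):

```yaml
# Hypothetical probe wiring for the health API (adjust port to your deployment)
livenessProbe:
  httpGet:
    path: /health/live
    port: 8000
  periodSeconds: 10
  timeoutSeconds: 2
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8000
  periodSeconds: 15
  timeoutSeconds: 3
```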
Example Response (`/health`):

```json
{
  "status": "healthy",
  "timestamp": "2025-12-19T08:00:00Z",
  "version": "1.0.0",
  "provider": "groq",
  "components": {
    "llm_groq": {
      "status": "up",
      "latency_ms": 120,
      "model": "llama-3.3-70b-versatile"
    },
    "github_api": {
      "status": "up",
      "latency_ms": 85,
      "rate_limit_remaining": 4500,
      "rate_limit_total": 5000
    },
    "tavily_api": {
      "status": "up",
      "latency_ms": 200
    },
    "rag_retriever": {
      "status": "up",
      "latency_ms": 45,
      "embeddings_model": "sentence-transformers/all-MiniLM-L6-v2"
    }
  }
}
```
HTTP Status Codes:
- `200 OK`: All systems healthy
- `503 Service Unavailable`: One or more components degraded/down
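A monitoring script can derive the overall verdict the same way the endpoint does: healthy only when every component reports up. A sketch against the response shape shown above (the aggregation rule is an assumption inferred from the 200/503 behavior, not DrRepo's actual implementation):

```python
def overall_status(health_response):
    """Collapse per-component statuses into one overall verdict.

    Returns "healthy" only if every component reports "up"; otherwise
    "degraded" (including the edge case of an empty components map).
    """
    components = health_response.get("components", {})
    if components and all(c.get("status") == "up" for c in components.values()):
        return "healthy"
    return "degraded"

def http_status_for(health_response):
    """Pick the HTTP status code matching the verdict (200 vs 503)."""
    return 200 if overall_status(health_response) == "healthy" else 503
```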
- LangGraph: Multi-agent workflow orchestration
- Groq: Fast, free LLM inference (llama-3.3-70b)
- FAISS: Vector search for fact-checking
- HuggingFace: Embeddings for RAG
- PyGithub: GitHub API integration
- Tavily: Web search for best practices
- Streamlit: Web interface
Each of the 5 AI agents has a distinct, non-overlapping role:
| Agent | Unique Responsibility | What Sets It Apart |
|---|---|---|
| RepoAnalyzer | Extract repository facts & metadata | Only agent with direct GitHub API access; provides data foundation for all others |
| MetadataRecommender | Optimize discoverability & SEO | Only agent that researches competitor repositories for benchmarking |
| ContentImprover | Enhance README structure & content | Only agent that retrieves external best-practices documentation |
| ReviewerCritic | Audit quality with structured scoring | Only agent that provides 4-dimension scoring (Completeness, Clarity, Professionalism, Discoverability) |
| FactChecker | Verify claims with evidence | Only agent using RAG/FAISS vector search to validate README statements |
Key Distinction: Each agent uses different tools and temperature settings optimized for its specific task, ensuring specialized expertise rather than redundant analysis.
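The orchestration itself lives in LangGraph; as a rough mental model, the flow reduces to agents passing a shared state forward. A plain-Python stand-in with stub agents (illustration only, not the real implementation):

```python
# Plain-Python stand-in for the agent pipeline: each "agent" reads the shared
# state and returns a partial update, as the LangGraph nodes conceptually do.
def repo_analyzer(state):
    # Stub: the real agent fetches metadata via the GitHub API
    return {"facts": f"metadata for {state['repo_url']}"}

def metadata_recommender(state):
    # Stub: the real agent benchmarks against competitor repositories
    return {"suggested_topics": ["python", "cli"]}

def reviewer_critic(state):
    # Stub: the real agent produces a structured 0-100 score
    return {"score": 72}

def run_pipeline(repo_url):
    state = {"repo_url": repo_url}
    for agent in (repo_analyzer, metadata_recommender, reviewer_critic):
        state.update(agent(state))  # each node merges its update into the state
    return state
```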
- Stars, forks, watchers
- Language and topics
- License information
- Last updated date
- Content completeness (0-100 score)
- Structure and organization
- Code examples count
- Visual elements (images, badges)
- Missing sections identification
- Test presence
- CI/CD configuration
- Contributing guidelines
- Clarity and readability
- Professional standards
- Best practices compliance
- Claim verification (with RAG)
| Score | Status | Meaning |
|---|---|---|
| 80-100 | Excellent | Professional, complete documentation |
| 60-79 | Good | Solid documentation, minor improvements needed |
| 40-59 | Needs Improvement | Significant gaps, needs work |
| 0-39 | Poor | Critical issues, major overhaul needed |
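The band boundaries above translate directly into a lookup (a trivial sketch mirroring the table):

```python
def score_status(score):
    """Map a 0-100 quality score to the status band from the table above."""
    if score >= 80:
        return "Excellent"
    if score >= 60:
        return "Good"
    if score >= 40:
        return "Needs Improvement"
    return "Poor"
```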
Scoring Factors:
- Word count (20 points)
- Section structure (20 points)
- Code examples (15 points)
- Visual elements (10 points)
- Extras (badges, TOC, links): +20 points
- Missing critical sections: -30 points
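How these factors combine is internal to DrRepo; as an illustration only, here is one way such a rubric could be totaled. All caps and per-factor targets below are hypothetical (note the listed factors alone cap at 85 before penalties):

```python
def readme_score(words, sections, code_examples, visuals, extras, missing_critical):
    """Hypothetical rubric: scale each factor up to its point budget, then clamp.

    extras counts badges/TOC/links (5 pts each, up to +20);
    missing_critical applies the flat -30 penalty from the list above.
    """
    score = 0.0
    score += min(words / 300.0, 1.0) * 20        # word count (assumed 300-word target)
    score += min(sections / 6.0, 1.0) * 20       # section structure (assumed 6 sections)
    score += min(code_examples / 3.0, 1.0) * 15  # code examples
    score += min(visuals / 2.0, 1.0) * 10        # visual elements (images, badges)
    score += min(extras, 4) * 5                  # extras: badges, TOC, links (+20 max)
    if missing_critical:
        score -= 30                              # missing critical sections
    return max(0.0, min(100.0, score))
```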
| Category | Technology |
|---|---|
| Orchestration | LangGraph 0.2.28+ |
| LLM | Groq (llama-3.3-70b-versatile) |
| Framework | LangChain 0.3.0+ |
| Vector DB | FAISS (CPU) |
| Embeddings | HuggingFace Sentence Transformers |
| APIs | PyGithub, Tavily Search |
| Frontend | Streamlit 1.31.0+ |
| Testing | pytest, pytest-cov |
| CI/CD | GitHub Actions |
```bash
# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ -v --cov=src --cov-report=html

# Run a specific test file
pytest tests/test_agents/test_repo_analyzer.py -v

# Run integration tests
pytest tests/ -v -m integration
```
```bash
# Build the image
docker build -t drrepo:latest .

# Run the container
docker run -p 8501:8501 --env-file .env drrepo:latest

# Or use docker-compose
docker-compose up
```
We welcome contributions! Please see our Contributing Guidelines for details.
Ways to Contribute:
- Report bugs
- Suggest features
- Improve documentation
- Submit pull requests
- Star this repository
This project is licensed under the MIT License - see the LICENSE file for details.
- LangChain & LangGraph for the amazing agent framework
- Groq for free, fast LLM inference
- HuggingFace for open-source embeddings
- GitHub for the comprehensive API
- Streamlit for the beautiful web framework
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Parallel agent execution for faster analysis
- Batch repository processing
- Persistent vector store
- API endpoint deployment
- Support for private repositories
- Custom agent configuration
- Multi-language README support
- Comparison mode for multiple repositories
