A single-page Streamlit app that helps recruiters evaluate and rank resumes against a Job Description (JD) using Google Gemini, LangChain, and ChromaDB.
- Upload Job Descriptions - Paste or type the JD for the position
- Upload Multiple Resumes - Support for PDF and DOCX formats (max 5MB each)
- AI-Powered Evaluation - Uses Google Gemini 2.0 Flash for intelligent scoring
- Hybrid Scoring System - Combines LLM analysis (60%) with embedding similarity (40%)
- Skills Analysis - Identifies matching and missing skills for each candidate
- Match Score & Summary - 0-100 score with detailed fit summary
- Export to CSV - Download evaluation results for further analysis
- Session History - Clickable sidebar with recent evaluations
- Persistent Storage - Optional Supabase integration for cross-session history
- View & Delete Past Evaluations - Manage your evaluation history
- Mobile Responsive - Works on desktop and mobile devices
- Reset Functionality - Clear form and start fresh
| Component | Technology | Purpose |
|---|---|---|
| Frontend | Streamlit | Interactive web UI |
| LLM | Google Gemini 2.0 Flash | Resume evaluation & scoring |
| LLM Framework | LangChain | Prompt orchestration & output parsing |
| Embeddings | sentence-transformers (all-MiniLM-L6-v2) | Text vectorization |
| Vector Store | ChromaDB | In-memory similarity search |
| Database | Supabase (PostgreSQL) | Optional persistent storage |
| Document Parsing | PyPDF2, python-docx | Resume text extraction |
```text
resume-screening-agent/
├── app.py                    # Streamlit entrypoint
├── requirements.txt          # Python dependencies
├── README.md                 # Documentation
├── LICENSE                   # MIT License
└── src/
    ├── __init__.py
    ├── config.py             # Environment loader & settings
    ├── ai/
    │   ├── __init__.py
    │   ├── prompts.py        # LangChain prompt templates & Pydantic models
    │   ├── evaluator.py      # Core evaluation logic (embeddings + LLM)
    │   └── embeddings.py     # ChromaDB manager & embedding functions
    ├── db/
    │   ├── __init__.py
    │   ├── supabase_client.py  # Supabase client wrapper (optional)
    │   └── models.py         # Database model helpers (CRUD operations)
    └── ui/
        ├── __init__.py
        └── components.py     # Streamlit UI components & app logic
```
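The `src/config.py` module above is described as the environment loader. The sketch below shows what such a loader might look like, assuming it reads Streamlit secrets first and falls back to environment variables; the function name and fallback order are illustrative assumptions, not the repository's actual code.

```python
# Hypothetical sketch of a settings loader (not the repo's actual config.py);
# names and fallback order (Streamlit secrets, then environment) are assumptions.
import os
from typing import Optional

import streamlit as st
from dotenv import load_dotenv  # assumes python-dotenv is available for .env files

load_dotenv()  # picks up SUPABASE_URL / SUPABASE_ANON_KEY from a local .env, if any


def get_setting(name: str, default: Optional[str] = None) -> Optional[str]:
    """Return a setting from Streamlit secrets if present, else the environment."""
    try:
        if name in st.secrets:
            return st.secrets[name]
    except Exception:
        # No .streamlit/secrets.toml available (e.g. a bare local run); ignore.
        pass
    return os.getenv(name, default)


GOOGLE_API_KEY = get_setting("GOOGLE_API_KEY")
SUPABASE_URL = get_setting("SUPABASE_URL")
SUPABASE_ANON_KEY = get_setting("SUPABASE_ANON_KEY")
MAX_FILE_SIZE_MB = 5  # matches the 5 MB per-resume limit mentioned above
```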
- Python 3.9+
- Google API Key (Get one from Google AI Studio)
- Clone the repository

```bash
git clone https://github.com/anuragparashar26/resume-screening-agent.git
cd resume-screening-agent
```

- Create virtual environment and install dependencies

```bash
python -m venv .venv
source .venv/bin/activate   # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
```
- Configure Google API Key

For local development:

```bash
# Create secrets file
cp .streamlit/secrets.toml.example .streamlit/secrets.toml
# Edit .streamlit/secrets.toml and add your API key
```
For Streamlit Cloud deployment:
- Go to your app's dashboard → Settings → Secrets
- Add:

```toml
GOOGLE_API_KEY = "your-google-api-key-here"
```
- Run the application

```bash
streamlit run app.py
```

If you want evaluation history to persist across sessions, set up your own Supabase:
- Create a `.env` file in the project root:

```env
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_ANON_KEY=your-supabase-anon-key
```

- Run this SQL in your Supabase SQL editor:
```sql
-- Create tables
CREATE TABLE IF NOT EXISTS evaluations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
job_title TEXT,
job_description TEXT NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE IF NOT EXISTS evaluation_results (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
evaluation_id UUID REFERENCES evaluations(id) ON DELETE CASCADE,
candidate_name TEXT,
score NUMERIC,
summary TEXT,
matching_skills JSONB,
missing_skills JSONB,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Enable Row Level Security
ALTER TABLE evaluations ENABLE ROW LEVEL SECURITY;
ALTER TABLE evaluation_results ENABLE ROW LEVEL SECURITY;
-- Create policies for anonymous access
CREATE POLICY "Allow anonymous insert on evaluations" ON evaluations
FOR INSERT TO anon WITH CHECK (true);
CREATE POLICY "Allow anonymous select on evaluations" ON evaluations
FOR SELECT TO anon USING (true);
CREATE POLICY "Allow anonymous delete on evaluations" ON evaluations
FOR DELETE TO anon USING (true);
CREATE POLICY "Allow anonymous insert on evaluation_results" ON evaluation_results
FOR INSERT TO anon WITH CHECK (true);
CREATE POLICY "Allow anonymous select on evaluation_results" ON evaluation_results
FOR SELECT TO anon USING (true);
CREATE POLICY "Allow anonymous delete on evaluation_results" ON evaluation_results
FOR DELETE TO anon USING (true);
```

- Ensure Google API Key is configured (via Streamlit Cloud secrets or local secrets.toml)
- Enter a Job Description and optional Job Title in the main form
- Upload resumes - Drag and drop or browse for PDF/DOCX files (max 5MB each)
- Click "Evaluate Resumes" to run AI-powered scoring
- View ranked results with scores, summaries, and skill analysis
- Download CSV for offline analysis or sharing (a minimal export sketch follows this list)
- Click history items in the sidebar to view past evaluations
- Use "Reset" to clear the form and start fresh
| Mode | Description | Data Persistence |
|---|---|---|
| Session (default) | History stored in Streamlit session_state | Cleared on page refresh |
| Supabase (optional) | User provides their own Supabase credentials | Persists across sessions |
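When Supabase credentials are supplied, persistence can be handled with the official `supabase` Python client against the schema created above. The sketch below is a minimal version; the wrapper function names are assumptions, not the repository's actual `supabase_client.py`.

```python
# Minimal sketch of persisting an evaluation with supabase-py; wrapper names are assumptions.
import os

from supabase import create_client

client = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_ANON_KEY"])


def save_evaluation(job_title, job_description, results):
    """Insert one evaluation row plus one result row per candidate."""
    evaluation = (
        client.table("evaluations")
        .insert({"job_title": job_title, "job_description": job_description})
        .execute()
        .data[0]
    )
    rows = [
        {
            "evaluation_id": evaluation["id"],
            "candidate_name": r["candidate_name"],
            "score": r["score"],
            "summary": r["summary"],
            "matching_skills": r["matching_skills"],  # list, stored as JSONB
            "missing_skills": r["missing_skills"],
        }
        for r in results
    ]
    client.table("evaluation_results").insert(rows).execute()
    return evaluation["id"]


def list_recent_evaluations(limit=10):
    """Fetch the most recent evaluations for the sidebar history."""
    return (
        client.table("evaluations")
        .select("id, job_title, created_at")
        .order("created_at", desc=True)
        .limit(limit)
        .execute()
        .data
    )
```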
Resume Upload → Text Extraction → Embedding Generation → Similarity + LLM Analysis → Final Score
- Document Parsing: PyPDF2 (PDF) and python-docx (DOCX) extract text from uploaded resumes
- Embedding Generation: sentence-transformers (all-MiniLM-L6-v2) converts text to 384-dimensional vectors
- Similarity Computation: ChromaDB calculates cosine similarity between the job description and each resume (these first three steps are sketched after this list)
- LLM Evaluation: Google Gemini 2.0 Flash analyzes each resume against the JD using LangChain (a structured-output sketch appears further below):
  - Generates a score (0-100)
  - Writes a fit summary (3-5 sentences)
  - Identifies matching skills
  - Identifies missing skills
- Score Calculation: Final score = (0.6 × LLM Score) + (0.4 × Similarity × 100). For example, an LLM score of 80 with a cosine similarity of 0.70 gives 0.6 × 80 + 0.4 × 70 = 76.
- Ranking: Candidates sorted by final score in descending order
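The first three steps (parsing, embedding, similarity) can be reproduced with the libraries from the tech stack table. The sketch below is a simplified, self-contained version; the helper names and collection settings are illustrative assumptions, not the repository's actual code.

```python
# Simplified sketch of parsing, embedding, and similarity search.
# Helper names and collection settings are illustrative assumptions.
import chromadb
from docx import Document
from PyPDF2 import PdfReader
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings


def extract_text(path: str) -> str:
    """Pull raw text out of a PDF or DOCX resume."""
    if path.lower().endswith(".pdf"):
        reader = PdfReader(path)
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    doc = Document(path)
    return "\n".join(p.text for p in doc.paragraphs)


def similarity_scores(jd_text: str, resume_paths: list) -> dict:
    """Return {path: cosine similarity to the JD} via an in-memory ChromaDB collection."""
    client = chromadb.Client()  # in-memory; nothing is persisted
    collection = client.create_collection(
        name="resumes", metadata={"hnsw:space": "cosine"}
    )
    texts = [extract_text(p) for p in resume_paths]
    collection.add(
        ids=resume_paths,
        documents=texts,
        embeddings=model.encode(texts).tolist(),
    )
    result = collection.query(
        query_embeddings=[model.encode(jd_text).tolist()],
        n_results=len(resume_paths),
    )
    # ChromaDB returns cosine *distance*; similarity = 1 - distance.
    return {
        doc_id: 1.0 - dist
        for doc_id, dist in zip(result["ids"][0], result["distances"][0])
    }
```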
- Embedding similarity captures semantic relevance at scale
- LLM analysis provides nuanced understanding of qualifications
- Combined approach balances speed with accuracy
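The LLM evaluation step and the final 60/40 combination might look roughly like the sketch below, which uses LangChain's Pydantic output parser with the Gemini chat model (the API key is read from the `GOOGLE_API_KEY` setting configured earlier). The prompt wording and field names are assumptions; the repository's actual prompts live in `src/ai/prompts.py`.

```python
# Hedged sketch of the LLM evaluation step and the hybrid score; prompt text
# and field names are assumptions, only the 0.6/0.4 weights come from the README.
from typing import List

from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI
from pydantic import BaseModel, Field


class ResumeEvaluation(BaseModel):
    score: int = Field(description="Fit score from 0 to 100")
    summary: str = Field(description="3-5 sentence fit summary")
    matching_skills: List[str] = Field(description="Skills present in both JD and resume")
    missing_skills: List[str] = Field(description="JD skills absent from the resume")


parser = PydanticOutputParser(pydantic_object=ResumeEvaluation)

prompt = ChatPromptTemplate.from_template(
    "You are a technical recruiter. Evaluate the resume against the job description.\n"
    "Job description:\n{jd}\n\nResume:\n{resume}\n\n{format_instructions}"
).partial(format_instructions=parser.get_format_instructions())

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0)
chain = prompt | llm | parser


def final_score(llm_score: float, cosine_similarity: float) -> float:
    """Hybrid score: 60% LLM judgment, 40% embedding similarity scaled to 0-100."""
    return 0.6 * llm_score + 0.4 * cosine_similarity * 100


evaluation = chain.invoke({"jd": "...", "resume": "..."})
print(final_score(evaluation.score, 0.70))
```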
- LLM responses may occasionally be inaccurate; human review recommended
- Session history clears on page refresh (use Supabase for persistence)
- Not production-hardened (no auth, rate limiting, or retry/backoff)
- Supabase RLS policies are permissive for development
- Maximum file size: 5MB per resume
- Add authentication and role-based access
- Improve scoring model and calibrate weights
- Add batch processing for large resume sets
- Add unit tests and CI/CD pipeline
- Support more document formats (TXT, RTF)
- Implement retry logic with exponential backoff
- Add candidate comparison view
- Enable custom evaluation criteria
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.