Resume Screening Agent

A single-page Streamlit app that helps recruiters evaluate and rank resumes against a Job Description (JD) using Google Gemini, LangChain, and ChromaDB.

Features

Job Description Input - Paste or type the JD for the position
Upload Multiple Resumes - Support for PDF and DOCX formats (max 5MB each)
AI-Powered Evaluation - Uses Google Gemini 2.0 Flash for intelligent scoring
Hybrid Scoring System - Combines LLM analysis (60%) with embedding similarity (40%)
Skills Analysis - Identifies matching and missing skills for each candidate
Match Score & Summary - 0-100 score with detailed fit summary
Export to CSV - Download evaluation results for further analysis
Session History - Clickable sidebar with recent evaluations
Persistent Storage - Optional Supabase integration for cross-session history
View & Delete Past Evaluations - Manage your evaluation history
Mobile Responsive - Works on desktop and mobile devices
Reset Functionality - Clear form and start fresh

Tech Stack

| Component | Technology | Purpose |
| --- | --- | --- |
| Frontend | Streamlit | Interactive web UI |
| LLM | Google Gemini 2.0 Flash | Resume evaluation & scoring |
| LLM Framework | LangChain | Prompt orchestration & output parsing |
| Embeddings | sentence-transformers (all-MiniLM-L6-v2) | Text vectorization |
| Vector Store | ChromaDB | In-memory similarity search |
| Database | Supabase (PostgreSQL) | Optional persistent storage |
| Document Parsing | PyPDF2, python-docx | Resume text extraction |

Project Structure

resume-screening-agent/
├── app.py                      # Streamlit entrypoint
├── requirements.txt            # Python dependencies
├── README.md                   # Documentation
├── LICENSE                     # MIT License
└── src/
    ├── __init__.py
    ├── config.py               # Environment loader & settings
    ├── ai/
    │   ├── __init__.py
    │   ├── prompts.py          # LangChain prompt templates & Pydantic models
    │   ├── evaluator.py        # Core evaluation logic (embeddings + LLM)
    │   └── embeddings.py       # ChromaDB manager & embedding functions
    ├── db/
    │   ├── __init__.py
    │   ├── supabase_client.py  # Supabase client wrapper (optional)
    │   └── models.py           # Database model helpers (CRUD operations)
    └── ui/
        ├── __init__.py
        └── components.py       # Streamlit UI components & app logic

Setup

Prerequisites

Python 3.9+
Google API Key (Get one from Google AI Studio)

Installation

Clone the repository

git clone https://github.com/anuragparashar26/resume-screening-agent.git
cd resume-screening-agent

Create virtual environment and install dependencies

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt

Configure Google API Key

For local development:

```
# Create secrets file
cp .streamlit/secrets.toml.example .streamlit/secrets.toml
# Edit .streamlit/secrets.toml and add your API key
```

For Streamlit Cloud deployment:
- Go to your app's dashboard → Settings → Secrets
- Add: GOOGLE_API_KEY = "your-google-api-key-here"

Run the application

streamlit run app.py

Optional: Persistent Storage with Supabase

If you want evaluation history to persist across sessions, set up your own Supabase project:

Create a .env file in the project root:

SUPABASE_URL=https://your-project.supabase.co
SUPABASE_ANON_KEY=your-supabase-anon-key

Run this SQL in your Supabase SQL editor:

-- Create tables
CREATE TABLE IF NOT EXISTS evaluations (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  job_title TEXT,
  job_description TEXT NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS evaluation_results (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  evaluation_id UUID REFERENCES evaluations(id) ON DELETE CASCADE,
  candidate_name TEXT,
  score NUMERIC,
  summary TEXT,
  matching_skills JSONB,
  missing_skills JSONB,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Enable Row Level Security
ALTER TABLE evaluations ENABLE ROW LEVEL SECURITY;
ALTER TABLE evaluation_results ENABLE ROW LEVEL SECURITY;

-- Create policies for anonymous access
CREATE POLICY "Allow anonymous insert on evaluations" ON evaluations
  FOR INSERT TO anon WITH CHECK (true);
CREATE POLICY "Allow anonymous select on evaluations" ON evaluations
  FOR SELECT TO anon USING (true);
CREATE POLICY "Allow anonymous delete on evaluations" ON evaluations
  FOR DELETE TO anon USING (true);

CREATE POLICY "Allow anonymous insert on evaluation_results" ON evaluation_results
  FOR INSERT TO anon WITH CHECK (true);
CREATE POLICY "Allow anonymous select on evaluation_results" ON evaluation_results
  FOR SELECT TO anon USING (true);
CREATE POLICY "Allow anonymous delete on evaluation_results" ON evaluation_results
  FOR DELETE TO anon USING (true);
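
For orientation, the rows these two tables expect can be sketched as plain Python dictionaries. This is only an illustration: the helper names `evaluation_row` and `result_row` are hypothetical and are not taken from `src/db/models.py`. The `id` and `created_at` columns are left out because the schema above fills them with defaults.

```python
from typing import Any, Dict, List, Optional


def evaluation_row(job_description: str, job_title: Optional[str] = None) -> Dict[str, Any]:
    """Build a payload for the `evaluations` table (hypothetical helper)."""
    # job_description is NOT NULL in the schema; this sketch also rejects
    # blank text early, which is an extra assumption.
    if not job_description.strip():
        raise ValueError("job_description must not be empty")
    return {"job_title": job_title, "job_description": job_description}


def result_row(
    evaluation_id: str,
    candidate_name: str,
    score: float,
    summary: str,
    matching_skills: List[str],
    missing_skills: List[str],
) -> Dict[str, Any]:
    """Build a payload for the `evaluation_results` table (hypothetical helper)."""
    return {
        "evaluation_id": evaluation_id,  # foreign key to evaluations.id
        "candidate_name": candidate_name,
        "score": score,
        "summary": summary,
        # Lists map naturally onto the JSONB columns.
        "matching_skills": list(matching_skills),
        "missing_skills": list(missing_skills),
    }
```

Because `evaluation_results.evaluation_id` is declared with `ON DELETE CASCADE`, deleting a row from `evaluations` also removes its result rows.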

Usage Guide

Ensure Google API Key is configured (via Streamlit Cloud secrets or local secrets.toml)
Enter a Job Description and optional Job Title in the main form
Upload resumes - Drag and drop or browse for PDF/DOCX files (max 5MB each)
Click "Evaluate Resumes" to run AI-powered scoring
View ranked results with scores, summaries, and skill analysis
Download CSV for offline analysis or sharing
Click history items in the sidebar to view past evaluations
Use "Reset" to clear the form and start fresh
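
The CSV download step can be pictured with the standard library alone. The sketch below is an assumption about the export shape, not the app's actual code: the function name `results_to_csv` and the column layout are hypothetical, chosen to mirror the fields listed in this README.

```python
import csv
import io
from typing import Any, Dict, List


def results_to_csv(results: List[Dict[str, Any]]) -> str:
    """Serialize ranked results to CSV text (hypothetical column layout)."""
    fieldnames = ["candidate_name", "score", "summary", "matching_skills", "missing_skills"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for row in results:
        writer.writerow({
            "candidate_name": row["candidate_name"],
            "score": row["score"],
            "summary": row["summary"],
            # Skill lists are flattened into one cell, separated by "; ".
            "matching_skills": "; ".join(row["matching_skills"]),
            "missing_skills": "; ".join(row["missing_skills"]),
        })
    return buffer.getvalue()
```

In a Streamlit app, a string like this could be handed to `st.download_button` as its `data` argument.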

Storage Options

| Mode | Description | Data Persistence |
| --- | --- | --- |
| Session (default) | History stored in Streamlit session_state | Cleared on page refresh |
| Supabase (optional) | User provides their own Supabase credentials | Persists across sessions |
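
One plausible way to choose between the two modes is to check whether both Supabase variables are set. The function name `storage_mode` is hypothetical, and the sketch assumes the variables are already in the process environment (a `.env` file has to be loaded separately).

```python
import os
from typing import Mapping, Optional


def storage_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "supabase" when both credentials are set, else "session"."""
    env = os.environ if env is None else env
    url = env.get("SUPABASE_URL", "").strip()
    key = env.get("SUPABASE_ANON_KEY", "").strip()
    # Fall back to session-only history unless both values are present.
    return "supabase" if url and key else "session"
```

Passing a plain dictionary instead of the real environment keeps the check easy to test.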

How It Works

Evaluation Pipeline

Resume Upload → Text Extraction → Embedding Generation → Similarity + LLM Analysis → Final Score

1. Document Parsing: PyPDF2 (PDF) and python-docx (DOCX) extract text from uploaded resumes
2. Embedding Generation: sentence-transformers (all-MiniLM-L6-v2) converts text to 384-dimensional vectors
3. Similarity Computation: ChromaDB calculates cosine similarity between job description and resumes
4. LLM Evaluation: Google Gemini 2.0 Flash analyzes each resume against the JD using LangChain:
   - Generates a score (0-100)
   - Writes a fit summary (3-5 sentences)
   - Identifies matching skills
   - Identifies missing skills
5. Score Calculation: Final score = (0.6 × LLM Score) + (0.4 × Similarity × 100)
6. Ranking: Candidates sorted by final score in descending order
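
The score calculation and ranking steps above can be sketched in a few lines. This is a minimal illustration, not the code in `src/ai/evaluator.py`: the names `final_score` and `rank_candidates` are hypothetical, and clamping the inputs to their expected ranges is an added assumption.

```python
from typing import Dict, List, Tuple

LLM_WEIGHT = 0.6         # weight of the LLM score, per the formula above
SIMILARITY_WEIGHT = 0.4  # weight of the embedding similarity


def final_score(llm_score: float, similarity: float) -> float:
    """Combine an LLM score (0-100) with a similarity (0-1) into a 0-100 score."""
    # Assumption: clamp inputs so the result always stays within 0-100.
    llm_score = min(max(llm_score, 0.0), 100.0)
    similarity = min(max(similarity, 0.0), 1.0)
    return round(LLM_WEIGHT * llm_score + SIMILARITY_WEIGHT * similarity * 100, 2)


def rank_candidates(candidates: List[Tuple[str, float, float]]) -> List[Dict[str, object]]:
    """Score (name, llm_score, similarity) tuples and sort by final score, highest first."""
    scored = [
        {"name": name, "final_score": final_score(llm, sim)}
        for name, llm, sim in candidates
    ]
    return sorted(scored, key=lambda row: row["final_score"], reverse=True)
```

For example, a candidate with an LLM score of 70 and similarity 0.9 ends at 78.0, ahead of one with an LLM score of 85 and similarity 0.4, who ends at 67.0.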

Why Hybrid Scoring?

Embedding similarity captures semantic relevance at scale
LLM analysis provides nuanced understanding of qualifications
Combined approach balances speed with accuracy
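
The embedding half of the hybrid score rests on cosine similarity between the JD vector and each resume vector. The app delegates this to ChromaDB; the pure-Python version below is only a sketch of the underlying measure, and returning 0.0 for an all-zero vector is an assumption.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat an all-zero vector as having no similarity.
        return 0.0
    return dot / (norm_a * norm_b)
```

Vectors pointing in the same direction score 1.0 and orthogonal vectors score 0.0, regardless of their lengths. A long resume and a short JD can therefore still match closely.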

Limitations

LLM responses may occasionally be inaccurate; human review recommended
Session history clears on page refresh (use Supabase for persistence)
Not production-hardened (no auth, rate limiting, or retry/backoff)
Supabase RLS policies are permissive (anonymous insert, select, and delete are allowed) and intended for development only
Maximum file size: 5MB per resume
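
The format and size limits can be enforced with a small pre-check before any parsing happens. The sketch below is hypothetical (`validate_upload` is not a function from this repository) and reads "5MB" as 5 × 1024 × 1024 bytes, which is an assumption.

```python
import os
from typing import Optional

MAX_BYTES = 5 * 1024 * 1024  # assumption: "5MB" interpreted as 5 MiB
ALLOWED_EXTENSIONS = {".pdf", ".docx"}


def validate_upload(filename: str, size_bytes: int) -> Optional[str]:
    """Return an error message, or None when the file is acceptable."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return f"{filename}: unsupported format {ext or '(none)'}"
    if size_bytes > MAX_BYTES:
        return f"{filename}: exceeds the 5MB limit"
    return None
```

Returning a message rather than raising lets the UI report every rejected file in one pass instead of stopping at the first.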

Potential Improvements

Add authentication and role-based access
Improve scoring model and calibrate weights
Add batch processing for large resume sets
Add unit tests and CI/CD pipeline
Support more document formats (TXT, RTF)
Implement retry logic with exponential backoff
Add candidate comparison view
Enable custom evaluation criteria
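
As a starting point for the retry item above, exponential backoff can be sketched with the standard library. Everything here is hypothetical: the helper name `retry_with_backoff`, the default of four attempts, and the 10% jitter are illustrative choices, and a real version would catch only the transient errors raised by the LLM client.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on failure with delays of 1x, 2x, 4x... base_delay."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            # Add up to 10% random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("unreachable")  # keeps type checkers satisfied
```

The `sleep` parameter is injectable so the delays can be observed in tests without actually waiting.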

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
