AI-powered job application tracker with natural language input, RAG chat, and rich analytics.
JobHound lets you track job applications by pasting in free-form text ("Applied to Acme Corp for Senior Engineer, remote, €80-100k, found on LinkedIn") and having the AI parse it into structured data. Chat with your application history using RAG, and visualize your job search with a comprehensive analytics dashboard.
```mermaid
graph TD
    subgraph Frontend["Frontend (Next.js 14)"]
        UI[React UI]
        Auth[next-auth]
        Charts[Recharts]
    end
    subgraph Backend["Backend (FastAPI)"]
        API[REST API]
        Parser[NL Parser]
        RAG[RAG Pipeline]
        Analytics[Analytics Service]
        LLM[LLM Adapters]
        Embed[Embedding Adapters]
    end
    subgraph DB["Database (PostgreSQL 16 + pgvector)"]
        Relational[(Relational Data)]
        Vectors[(Vector Embeddings)]
    end
    subgraph LLMProviders["LLM Providers"]
        Ollama[Ollama local]
        OpenAI[OpenAI]
        Anthropic[Anthropic]
        Nebius[Nebius]
    end
    UI --> API
    Auth --> API
    API --> Parser
    API --> RAG
    API --> Analytics
    Parser --> LLM
    RAG --> LLM
    RAG --> Embed
    LLM --> LLMProviders
    API --> Relational
    Embed --> Vectors
    RAG --> Vectors
```
| Layer | Technology | Reasoning |
|---|---|---|
| Frontend | Next.js 14 (App Router) | Server components, streaming, excellent DX |
| UI | shadcn/ui + Tailwind CSS | Accessible, unstyled primitives with rapid customization |
| Charts | Recharts | React-first, composable, good defaults |
| Auth | next-auth | Handles OAuth complexity, great Next.js integration |
| Backend | FastAPI | Async-first, auto-docs, Python type safety |
| ORM | SQLAlchemy 2.0 (async) | Mature, powerful, async support |
| Migrations | Alembic | Industry standard for SQLAlchemy |
| Database | PostgreSQL 16 + pgvector | Single DB for relational + vector, production-grade |
| LLM | Pluggable (Ollama/OpenAI/Anthropic/Nebius) | No vendor lock-in, local-first |
- Docker & Docker Compose
- Node.js 20+ (for local frontend dev)
- Python 3.12+ (for local backend dev)
- Ollama (optional, for local LLM)
```bash
# 1. Clone and configure
git clone https://github.com/sumdher/jobhound
cd jobhound

# 2. Set up environment files
cp backend/.env.example backend/.env
cp frontend/.env.example frontend/.env.local
# Edit both .env files with your values
# Minimum required: GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET, JWT_SECRET, NEXTAUTH_SECRET

# 3. Start everything
docker compose up

# Frontend: http://localhost:3000
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs
```

Use `docker-compose.prod.yml` for a production-oriented stack. It leaves the development setup in `docker-compose.yml` untouched while switching the frontend and backend builds to their production Docker targets, removing source bind mounts, and wiring in a Cloudflare Tunnel sidecar that authenticates with a tunnel token from the root env file.
```bash
# 1. Create runtime env files
cp backend/.env.example backend/.env
cp frontend/.env.example frontend/.env

# 2. Update production values
# - set strong secrets
# - set APP_URL to your public HTTPS URL
# - set NEXTAUTH_URL to your public HTTPS URL
# - set provider credentials/API keys

# 3. Configure the Cloudflare Tunnel
cp deploy/cloudflared/config.example.yml deploy/cloudflared/config.yml
# edit deploy/cloudflared/config.yml
# add CLOUDFLARE_TUNNEL_TOKEN=<your-tunnel-token> to the root .env
#
# If you keep the real config file outside the repo, set this in the root .env
# instead of copying it into deploy/cloudflared/:
# CLOUDFLARED_CONFIG_PATH=/absolute/path/to/config.yml

# 4. Start the production stack
docker compose -f docker-compose.prod.yml up -d --build
```

Production characteristics:

- frontend and backend use the existing Docker production targets
- no source-code bind mounts or dev-only Next.js cache mounts
- PostgreSQL data stays in a named volume
- backend runs database migrations before starting
- frontend, backend, and database all have container healthchecks
- Cloudflare Tunnel uses a pinned image version, local config file, and token-based authentication from the root `.env`
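Under those constraints, the tunnel sidecar in `docker-compose.prod.yml` has roughly the following shape. This is a hypothetical sketch only: the service name, mount path, and flags are assumptions, and the Compose file in the repository is authoritative.

```yaml
services:
  cloudflared:
    image: cloudflare/cloudflared:<pinned-version>  # pin an exact tag, never latest
    command: tunnel --no-autoupdate run --token ${CLOUDFLARE_TUNNEL_TOKEN}
    volumes:
      # Override CLOUDFLARED_CONFIG_PATH in the root .env to keep the real
      # config outside the repository tree.
      - ${CLOUDFLARED_CONFIG_PATH:-./deploy/cloudflared/config.yml}:/etc/cloudflared/config.yml:ro
    restart: unless-stopped
```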
The production database runs in the `db` service in `docker-compose.prod.yml`. This repository now includes a host-side backup script at `deploy/backup/jobhound-db-backup.sh` plus example systemd unit files at `deploy/backup/jobhound-db-backup.service` and `deploy/backup/jobhound-db-backup.timer`.

Recommended behavior:

- the host runs the script once per day via a systemd timer
- the script connects to the running PostgreSQL container with `pg_dump`
- backups are written to the host path `/var/backups/jobhound`
- each backup is stored as a compressed PostgreSQL custom-format archive (`*.dump.gz`)
- a matching SHA-256 checksum file is written next to each backup
- retention defaults to 14 days and old backups are pruned automatically
- the script refuses to run if the DB container is missing, stopped, unhealthy, or another backup job is already active
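The "refuses to run" guards boil down to two small checks: an exclusive `flock` on a lock file and a Docker health probe. The sketch below illustrates the pattern only; the lock path and container name are assumptions, not values from the real script.

```shell
#!/usr/bin/env bash
set -Eeuo pipefail

# Take a non-blocking exclusive lock on fd 9; fail if another run holds it.
acquire_lock() {
  exec 9>"$1"
  flock -n 9
}

# Report a container's health status, or "missing" if it does not exist.
db_health() {
  docker inspect -f '{{.State.Health.Status}}' "$1" 2>/dev/null || echo missing
}

# Hypothetical guard sequence:
# acquire_lock /run/lock/jobhound-db-backup.lock || exit 1
# [ "$(db_health jobhound_db_prod)" = healthy ] || exit 1
```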
- `deploy/backup/jobhound-db-backup.sh`: host-executed backup script
- `deploy/backup/jobhound-db-backup.service`: example oneshot systemd service
- `deploy/backup/jobhound-db-backup.timer`: example daily systemd timer
- `deploy/backup/install-jobhound-db-backup.sh`: one-shot host installer for the systemd setup
`deploy/backup/jobhound-db-backup.sh` is designed for unattended execution:

- uses `set -Eeuo pipefail` and `umask 077`
- uses `flock` to prevent overlapping runs
- checks that Docker and other required host tools exist before starting
- confirms the Compose DB service is running and not unhealthy before dumping
- runs `pg_dump` inside the DB container so it uses the same database instance and credentials already configured for production
- writes to a temporary file in the destination directory, validates the archive with `pg_restore --list` inside the DB container, writes a checksum, then atomically renames the files into place
- deletes only old backup files matching the JobHound backup naming pattern
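The temp-file-then-rename step can be sketched as below. This is a simplified stand-in, not the real script: it validates with `gzip -t` instead of `pg_restore --list`, records a bare hash rather than a `sha256sum -c`-compatible line, and all names are hypothetical. The point it illustrates is atomicity: readers only ever see complete files.

```shell
#!/usr/bin/env bash
set -Eeuo pipefail

# Stream stdin to a temp file beside DEST, validate it, write a checksum,
# then rename both into place so a partial backup is never visible.
atomic_store() {
  local dest=$1 dir base tmp
  dir=$(dirname "$dest")
  base=$(basename "$dest")
  tmp="$dir/.$base.tmp"
  cat > "$tmp"                                   # write to temp file in the destination dir
  gzip -t "$tmp"                                 # cheap validation stand-in for pg_restore --list
  sha256sum "$tmp" | awk '{print $1}' > "$tmp.sha256"
  mv "$tmp.sha256" "$dest.sha256"                # checksum first, then the archive
  mv "$tmp" "$dest"                              # rename is atomic on the same filesystem
}
```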
The script defaults to:

- backup directory: `/var/backups/jobhound`
- retention: `14` days
- Compose file: `docker-compose.prod.yml`
- Compose project directory: repository root
- database service name: `db`
These can be overridden with environment variables in the systemd service if your deployment path differs:

```bash
JOBHOUND_BACKUP_DIR=/var/backups/jobhound
JOBHOUND_BACKUP_RETENTION_DAYS=14
JOBHOUND_BACKUP_COMPOSE_FILE=/opt/jobhound/docker-compose.prod.yml
JOBHOUND_BACKUP_PROJECT_DIR=/opt/jobhound
JOBHOUND_BACKUP_SERVICE_NAME=db
```

Use the one-shot installer from the deployed repository checkout:

```bash
sudo ./deploy/backup/install-jobhound-db-backup.sh
```

`deploy/backup/install-jobhound-db-backup.sh` is intended to be run directly on the Linux host that already runs the production Docker Compose stack. The installer is fail-fast and safe to rerun. On each run it:

- determines the repository root from its own location
- ensures `deploy/backup/jobhound-db-backup.sh` is executable
- ensures `/var/backups/jobhound` exists with restricted permissions
- installs `deploy/backup/jobhound-db-backup.service` and `deploy/backup/jobhound-db-backup.timer` into `/etc/systemd/system/`
- writes a systemd drop-in at `/etc/systemd/system/jobhound-db-backup.service.d/override.conf` that points `WorkingDirectory` and `ExecStart` at the actual repository path on that host
- pins `JOBHOUND_BACKUP_COMPOSE_FILE` to the real host path for `docker-compose.prod.yml`
- reloads systemd, enables the timer, and starts or restarts it so the active schedule matches the installed files
- prints verification commands for timer status, installed unit contents, logs, and an immediate manual test run
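For a checkout at `/opt/jobhound` (a hypothetical path; the installer derives the real one from its own location), the generated drop-in would look roughly like this:

```ini
# /etc/systemd/system/jobhound-db-backup.service.d/override.conf (sketch)
[Service]
WorkingDirectory=/opt/jobhound
# Clear the shipped ExecStart, then point it at the actual checkout.
ExecStart=
ExecStart=/opt/jobhound/deploy/backup/jobhound-db-backup.sh
Environment=JOBHOUND_BACKUP_COMPOSE_FILE=/opt/jobhound/docker-compose.prod.yml
```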
The timer is configured to run daily at 03:15 local time with up to 15 minutes of randomized delay. `Persistent=true` means a missed run will be started after boot if the machine was off at the scheduled time.
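That schedule corresponds to timer settings along these lines (a sketch; the shipped `deploy/backup/jobhound-db-backup.timer` is authoritative):

```ini
[Unit]
Description=Daily JobHound database backup

[Timer]
OnCalendar=*-*-* 03:15:00
RandomizedDelaySec=15min
Persistent=true

[Install]
WantedBy=timers.target
```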
If the repository is moved later, rerun `deploy/backup/install-jobhound-db-backup.sh` from the new checkout location so the generated override points systemd at the new path.
- default retention is 14 days
- set `JOBHOUND_BACKUP_RETENTION_DAYS` in the systemd service to change it
- set it to `0` to disable automatic pruning
- pruning only removes files named like `jobhound-db-*.dump.gz` and `jobhound-db-*.dump.gz.sha256` inside `/var/backups/jobhound`
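The pruning rule above can be sketched with a single `find` invocation restricted to the JobHound naming pattern. This is an illustrative sketch under the documented defaults, not the script's exact code.

```shell
#!/usr/bin/env bash
set -Eeuo pipefail

# Delete only JobHound-named backup artifacts older than the retention window.
# A retention of 0 disables pruning entirely.
prune_backups() {
  local dir=$1 days=$2
  [ "$days" -eq 0 ] && return 0
  find "$dir" -maxdepth 1 -type f \
    \( -name 'jobhound-db-*.dump.gz' -o -name 'jobhound-db-*.dump.gz.sha256' \) \
    -mtime +"$days" -delete
}

# Example: prune_backups /var/backups/jobhound 14
```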
- Pick the backup file to restore, for example `/var/backups/jobhound/jobhound-db-host-jobhound-20260411T031500Z.dump.gz`
- Optionally verify the checksum:

  ```bash
  cd /var/backups/jobhound
  sha256sum -c jobhound-db-host-jobhound-20260411T031500Z.dump.gz.sha256
  ```

- Restore into the running DB container:

  ```bash
  gzip -dc /var/backups/jobhound/jobhound-db-host-jobhound-20260411T031500Z.dump.gz \
    | docker exec -i jobhound_db_prod sh -ceu 'export PGPASSWORD="$POSTGRES_PASSWORD"; pg_restore --clean --if-exists --no-owner --no-privileges --username "$POSTGRES_USER" --dbname "$POSTGRES_DB"'
  ```

Restore notes:

- `pg_restore --clean --if-exists` drops existing objects before recreating them
- restore into the correct production DB container for your deployment
- consider stopping the application stack or switching to maintenance mode first if you need a fully controlled restore window
- always test restores in a non-production environment before relying on the process operationally
- this is a logical PostgreSQL backup, not a filesystem-level volume snapshot
- large databases may make the backup window longer because the archive is streamed and compressed on the host
- backups are stored on the same host unless you separately replicate `/var/backups/jobhound` to off-host storage
- `deploy/backup/install-jobhound-db-backup.sh` writes a systemd override tied to the current repository path on the host, so rerun it if the checkout is moved
- the host must have Docker CLI access and the basic Unix tools used by the script (`bash`, `gzip`, `sha256sum`, `flock`, `find`, `install`)
For production, the tunnel now requires:

- a root `.env.example` value for `CLOUDFLARE_TUNNEL_TOKEN`
- a real local `deploy/cloudflared/config.yml` file, unless you point `docker-compose.prod.yml` at another config path with `CLOUDFLARED_CONFIG_PATH`

Unlike the previous setup, production no longer requires a live `credentials.json` bind mount inside the repository.

By default `docker-compose.prod.yml` looks for the config file at `deploy/cloudflared/config.yml`. If your local secrets workflow stores that file somewhere else, point Compose at it with `CLOUDFLARED_CONFIG_PATH` in the root `.env`.

The production tunnel container reads its authentication token from `CLOUDFLARE_TUNNEL_TOKEN` in the root `.env` and starts `cloudflared` with a token-based `tunnel run` command. Keep that token out of git and out of the repository tree.
For production, set at least these values before startup:

- root `.env.example`: `CLOUDFLARE_TUNNEL_TOKEN` and optional `CLOUDFLARED_CONFIG_PATH`
- `backend/.env.example`: `DATABASE_URL`, `JWT_SECRET`, `APP_URL`, OAuth credentials, any LLM/embedder provider secrets
- `frontend/.env.example`: `NEXTAUTH_SECRET`, `NEXTAUTH_URL`, OAuth credentials
| Variable | Required | Default | Description |
|---|---|---|---|
| `DATABASE_URL` | Yes | `postgresql+asyncpg://...` | PostgreSQL connection string |
| `GOOGLE_CLIENT_ID` | Yes | — | Google OAuth client ID |
| `GOOGLE_CLIENT_SECRET` | Yes | — | Google OAuth client secret |
| `JWT_SECRET` | Yes | `change-me` | Secret for signing JWTs |
| `LLM_PROVIDER` | No | `ollama` | LLM provider: `ollama`, `openai`, `anthropic`, `nebius` |
| `OLLAMA_URL` | No | `http://host.docker.internal:11434` | Ollama base URL |
| `OLLAMA_MODEL` | No | `gemma4:e4b` | Ollama model name |
| `OPENAI_API_KEY` | If using OpenAI | — | OpenAI API key |
| `ANTHROPIC_API_KEY` | If using Anthropic | — | Anthropic API key |
| `NEBIUS_API_KEY` | If using Nebius | — | Nebius API key |
| `EMBEDDING_PROVIDER` | No | `ollama` | Embedding provider: `ollama`, `openai` |
| `EMBEDDING_MODEL` | No | `nomic-embed-text` | Embedding model name |
| `EMBEDDING_DIMENSION` | No | `1536` | Embedding vector dimensions |
| Variable | Required | Default | Description |
|---|---|---|---|
| `NEXT_PUBLIC_API_URL` | Yes | `http://localhost:8000` | Backend API URL |
| `NEXTAUTH_URL` | Yes | `http://localhost:3000` | Frontend base URL |
| `NEXTAUTH_SECRET` | Yes | — | NextAuth secret |
| `GOOGLE_CLIENT_ID` | Yes | — | Google OAuth client ID |
| `GOOGLE_CLIENT_SECRET` | Yes | — | Google OAuth client secret |
Switching providers requires changing only a couple of environment variables:

```bash
# Use OpenAI
LLM_PROVIDER=openai
OPENAI_MODEL=gpt-4o-mini
OPENAI_API_KEY=sk-...

# Use Anthropic
LLM_PROVIDER=anthropic
ANTHROPIC_MODEL=claude-sonnet-4-20250514
ANTHROPIC_API_KEY=sk-ant-...

# Use local Ollama
LLM_PROVIDER=ollama
OLLAMA_MODEL=gemma4:e4b
```

Interactive API docs are available at http://localhost:8000/docs (Swagger UI) and http://localhost:8000/redoc (ReDoc).
```bash
# Backend only
cd backend
pip install -e ".[dev]"
uvicorn app.main:app --reload

# Frontend only
cd frontend
npm install
npm run dev

# Run backend tests
cd backend
pytest

# Run linting
cd backend && ruff check . && mypy .
cd frontend && npm run lint
```

- Cloud deployment (AWS ECS / Railway)
- Email notifications for application status changes
- Calendar integration (interview scheduling)
- Browser extension for one-click capture from LinkedIn/Indeed
- Export to CSV/Excel
- Resume/CV storage and matching
- Recruiter contact tracking
- Salary benchmarking via market data APIs