Cache HTTPS traffic for web scraping. Cut bandwidth costs by 90%.
If you're running Puppeteer, Playwright, or any web scraper, you're downloading the same JavaScript bundles, fonts, and images over and over again. Squache intercepts HTTPS traffic and caches everything aggressively, cutting bandwidth by 90%+ on repeated crawls.
Use cases:
- Running hundreds of Puppeteer instances
- Scraping the same domains repeatedly
- Minimizing expensive residential proxy usage
- Building reproducible scrapers (same cached assets)
```bash
git clone https://github.com/devrupt-io/squache.git
cd squache
cp example.env .env
docker compose up -d
```

That's it. SSL certificates are generated automatically.
- Dashboard: http://localhost:3011
- Proxy: http://localhost:3128
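Once the containers are up, you can sanity-check the services from a script. A minimal sketch in Python, assuming the default ports above (`is_up` is a hypothetical helper, not part of Squache):

```python
import urllib.request

def is_up(url, timeout=5.0):
    """Return True if the URL answers with HTTP 200."""
    try:
        return urllib.request.urlopen(url, timeout=timeout).status == 200
    except OSError:
        # Connection refused, timeout, or non-2xx/3xx response
        return False

# Backend health endpoint (no auth required)
print(is_up("http://localhost:3010/health"))
```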
First-time login: check the logs for your auto-generated admin password:

```bash
docker compose logs backend | grep -A5 "SQUACHE ADMIN"
```

Or set your own password in `.env`:

```bash
ADMIN_PASS=your-password
```

Squache uses SSL bumping to cache HTTPS traffic. Download and install the CA certificate:
```bash
# Download the CA certificate
curl -o squache-ca.crt http://localhost:3010/api/config/ssl/certificate

# Linux
sudo cp squache-ca.crt /usr/local/share/ca-certificates/
sudo update-ca-certificates

# macOS
sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain squache-ca.crt

# Node.js (or add to your shell profile)
export NODE_EXTRA_CA_CERTS=$(pwd)/squache-ca.crt
```

```js
import puppeteer from 'puppeteer';

const browser = await puppeteer.launch({
  args: ['--proxy-server=http://localhost:3128'],
});

const page = await browser.newPage();
await page.goto('https://example.com');
```

```bash
# curl (after installing CA cert)
curl -x http://localhost:3128 https://example.com

# wget
https_proxy=http://localhost:3128 wget https://example.com

# Node.js / Python
export HTTPS_PROXY=http://localhost:3128
```

```mermaid
flowchart LR
    subgraph Scraper["Your Scraper"]
        A1["Puppeteer"]
        A2["curl"]
        A3["Python"]
    end

    subgraph Squache
        Proxy["Squid Proxy<br/>(SSL Bump + Cache)"]
        Dashboard["Dashboard<br/>(Next.js)"]
        API["API<br/>(Express.js)"]
        DB[(PostgreSQL)]

        Dashboard --> API --> DB
        API -.-> Proxy
    end

    Scraper --> Proxy --> Internet((Internet))
```
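For clients that honor proxy environment variables, the settings above can be generated in one place. A small helper sketch (the function name and the CA-bundle handling are illustrative, not part of Squache):

```python
def squache_env(host="localhost", port=3128, ca_cert=None):
    """Environment variables that route HTTP(S) clients through Squache."""
    proxy = f"http://{host}:{port}"
    env = {
        "HTTP_PROXY": proxy, "http_proxy": proxy,
        "HTTPS_PROXY": proxy, "https_proxy": proxy,
    }
    if ca_cert:
        # Trust the Squache CA for SSL-bumped traffic:
        # Python requests reads REQUESTS_CA_BUNDLE, Node.js reads NODE_EXTRA_CA_CERTS.
        env["REQUESTS_CA_BUNDLE"] = ca_cert
        env["NODE_EXTRA_CA_CERTS"] = ca_cert
    return env

print(squache_env(ca_cert="./squache-ca.crt")["HTTPS_PROXY"])
```

Merge the returned dict into `os.environ` (or a subprocess's `env=`) so every child process picks up the proxy.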
Key features:
- SSL Bumping - Decrypts HTTPS to cache responses (generates certs on-the-fly)
- Aggressive Caching - Forces caching of JS, CSS, images, fonts, and media
- Web Dashboard - Real-time bandwidth metrics and cache hit rates
- Zero Config - SSL certificates auto-generated on first run
The dashboard shows real-time statistics:
- Bandwidth saved vs. bandwidth used
- Cache hit/miss ratios
- Per-domain statistics
- Request logs with search
Access at http://localhost:3011 after starting the services.
All configuration is done through .env. Copy the example file and customize:
```bash
cp example.env .env
```

Key settings you may want to change:
```bash
# Admin credentials (password auto-generated if empty)
ADMIN_EMAIL=admin@example.com
ADMIN_PASS=your-secure-password  # Leave empty to auto-generate on each start

# Security (generate with: openssl rand -hex 32)
JWT_SECRET=your-random-secret

# If exposing to the internet (update all of these to your domain)
BACKEND_URL=https://api.squache.yourdomain.com
FRONTEND_URL=https://squache.yourdomain.com
CORS_ORIGIN=https://squache.yourdomain.com
NEXT_PUBLIC_API_URL=https://api.squache.yourdomain.com
NEXT_PUBLIC_SITE_URL=https://squache.yourdomain.com
```

Most endpoints require authentication via a Bearer token. Public endpoints are marked below.
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/auth/login` | Login, returns JWT |
| GET | `/api/auth/me` | Current user info |

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/stats` | Overall statistics |
| GET | `/api/stats/bandwidth?range=1h` | Bandwidth over time |
| GET | `/api/stats/domains` | Per-domain breakdown |

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/cache` | Cache size and object count |
| DELETE | `/api/cache` | Purge all cache (admin) |

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/logs` | Recent access logs |
| GET | `/api/logs/search?url=...` | Search logs |

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/config/ssl/certificate` | Download CA cert (no auth) |

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/upstreams` | List upstream proxies |
| POST | `/api/upstreams` | Add upstream proxy |
| GET | `/api/upstreams/providers` | List known providers |

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/domains` | Domain statistics by primary domain |
| GET | `/api/domains/search?q=...` | Search domains |

| Method | Endpoint | Description |
|---|---|---|
| GET | `/health` | Service health (no auth) |
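As a sketch of calling these endpoints programmatically (only the paths and the Bearer scheme come from the tables above; the login body fields and the helper itself are assumptions):

```python
import json
import urllib.request

API = "http://localhost:3010"  # backend port from this README

def api_request(path, token=None, method="GET", body=None):
    """Build an (optionally authenticated) request for the Squache API."""
    req = urllib.request.Request(API + path, method=method)
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    if body is not None:
        req.add_header("Content-Type", "application/json")
        req.data = json.dumps(body).encode()
    return req

# With the stack running, send these with urllib.request.urlopen():
login = api_request("/api/auth/login", method="POST",
                    body={"email": "admin@example.com", "password": "..."})
stats = api_request("/api/stats", token="<jwt-from-login>")
print(stats.get_header("Authorization"))
```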
| Service | Port | Description |
|---|---|---|
| Proxy | 3128 | Point scrapers here |
| Backend | 3010 | REST API |
| Frontend | 3011 | Web dashboard |
Typical savings for web scraping workloads:
| Content Type | Cache Hit Rate |
|---|---|
| JavaScript/CSS | 95%+ |
| Images/Fonts | 90%+ |
| Video/Media | 85%+ |
| Repeated crawls | 90%+ overall |
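To see what those per-type hit rates mean for overall bandwidth, here is a back-of-the-envelope estimate. The traffic mix is an assumption for illustration; plug in your own numbers:

```python
# (share of total bytes, cache hit rate) per content type - illustrative mix
mix = {
    "js_css":       (0.40, 0.95),
    "images_fonts": (0.35, 0.90),
    "media":        (0.15, 0.85),
    "html":         (0.10, 0.00),  # dynamic HTML: assume never cached
}

saved = sum(share * hit_rate for share, hit_rate in mix.values())
print(f"Estimated bandwidth saved on repeated crawls: {saved:.0%}")
```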
```bash
# Clone
git clone https://github.com/devrupt-io/squache.git
cd squache

# Start in development mode
docker compose up -d

# Watch logs
docker compose logs -f

# Rebuild after changes
docker compose up -d --build
```

Route requests through VPN or residential proxies based on HTTP headers:
```js
await page.setExtraHTTPHeaders({
  'X-Squache-Upstream': 'residential', // vpn | residential | direct
  'X-Squache-Country': 'US',           // ISO country code
});
```

- Wire up Squid ACLs for header-based routing
- Provider integrations (Webshare, Bright Data, Oxylabs)
- Load balancing across multiple upstream proxies
- Automatic failover
Note: The infrastructure exists (database models, API endpoints, config generation in backend/src/routes/upstreams.ts), but Squid routing rules are not yet fully wired up.
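For non-browser clients, the same routing hints would travel as plain request headers. A sketch in Python (advisory until the Squid ACLs above are wired up; the `requests` call is shown commented out):

```python
# Routing hints as plain HTTP headers
routing_headers = {
    "X-Squache-Upstream": "residential",  # vpn | residential | direct
    "X-Squache-Country": "US",            # ISO 3166-1 alpha-2 country code
}

# e.g. with the requests library, through the proxy:
# requests.get("https://example.com", headers=routing_headers,
#              proxies={"https": "http://localhost:3128"})
print(routing_headers["X-Squache-Upstream"])
```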
```
squache/
├── proxy/                 # Squid configuration
│   ├── squid.conf         # Main config
│   ├── entrypoint.sh      # Auto-init SSL db
│   └── conf.d/            # Dynamic config (managed by backend)
├── backend/               # Express.js API
│   └── src/
│       ├── routes/        # REST endpoints
│       ├── services/      # Squid config generator
│       └── models/        # Sequelize models
├── frontend/              # Next.js dashboard
└── docker-compose.yml     # Self-contained setup
```
Domains tab and improved analytics.
- Fixed a bug when no upstreams were declared
- Simplified deployment using `.env` for configuration
- Improved layout consistency for the UI and header
- Domains tab - New dashboard tab showing domain-level statistics
  - Aggregates requests by primary domain (e.g., `cdn.example.com` → `example.com`)
  - Shows request count, bandwidth usage, cache hit rate, errors, and response times
  - Expandable subdomain details for each primary domain
  - Sortable columns and time range filters (1h, 6h, 24h, 7d)
  - Search functionality to find specific domains
- Domains API - New REST endpoints for domain analytics
  - `GET /api/domains` - Domain statistics aggregated by primary domain
  - `GET /api/domains/search` - Search domains by name
- Improved handling of CONNECT-style requests (e.g., `example.com:443`)
- Better multi-part TLD support (e.g., `.co.uk`, `.com.au`)
Initial release.
- SSL bumping with auto-generated CA certificates
- Aggressive caching for static assets (JS, CSS, images, fonts, video)
- Web dashboard with real-time metrics
- PostgreSQL for statistics and configuration
- Docker Compose for easy self-hosting
- REST API for programmatic access
Pull requests welcome. For major changes, please open an issue first.
Squache is built on the shoulders of giants:
- Squid Cache - The battle-tested caching proxy that powers Squache. 25+ years of development, used by ISPs and enterprises worldwide. All the SSL bumping, caching algorithms, and proxy magic comes from Squid.
- Next.js - React framework powering the dashboard
- Express.js - Fast, unopinionated API server
- Sequelize - TypeScript ORM for PostgreSQL
- PostgreSQL - Rock-solid database
- Docker - Containerization for easy deployment
MIT License - see LICENSE for details.
Built by devrupt.io - We build practical tools for everyone.