Squid Deployment for Vulcan HPC Cluster

Maintained by: Karim Ali (kali2@ualberta.ca)

🧰 Description

DaemonSet-based Squid proxy deployment for the Vulcan HPC cluster. Each Kubernetes node runs a Squid instance with host networking, providing stable IP addresses and peer cache coordination for CVMFS (CernVM File System).

This setup deploys Squid across all 5 Kubernetes nodes (172.26.92.2-6), where each node peers with the others to share cached content. CVMFS clients can use any of these node IPs for maximum availability and cache efficiency.

🚀 Quick Start

1. Set up host directories

The cache directory is typically mounted on its own partition for performance. Run the setup script on each Kubernetes node:

sudo ./host-setup.sh

This creates the Squid user (UID/GID 3128:3128) and cache directory structure at /var/lib/squid.
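
For readers who want to preview what this step involves, here is a minimal sketch of a setup routine consistent with the description above. It is not the contents of host-setup.sh. The `run` and `setup_host` helpers, the `squid` account name, and the nologin shell path are assumptions made for illustration; only UID/GID 3128 and /var/lib/squid come from this README. By default the sketch prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real host-setup.sh may do more or differ in detail.
SQUID_UID=3128
SQUID_GID=3128
CACHE_DIR=/var/lib/squid

# Print each command instead of executing it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

setup_host() {
  # Account name "squid" and the nologin path are assumptions.
  run groupadd --gid "$SQUID_GID" squid
  run useradd --uid "$SQUID_UID" --gid "$SQUID_GID" --no-create-home --shell /usr/sbin/nologin squid
  # Create the cache root and hand it to the Squid user.
  run mkdir -p "$CACHE_DIR"
  run chown -R "$SQUID_UID:$SQUID_GID" "$CACHE_DIR"
}

setup_host
```

Running the sketch as-is only lists the four commands; setting DRY_RUN=0 (as root) would execute them.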

2. Deploy to Kubernetes

kubectl apply -f squid-namespace.yaml
kubectl apply -f squid_entrypoint.yaml
kubectl apply -f squid_cache-config.yaml
kubectl apply -f squid-whitelist.yaml
kubectl apply -f squid.yaml

3. Verify deployment

kubectl get pods -n squid-standard
kubectl logs -n squid-standard -l app=squid

📦 What's Inside

Squid 5.x at the time of writing (from ubuntu/squid:latest; the tag is unpinned, so the version may change)
100GB disk cache per node on a host-mounted volume (aufs store, 32 first-level and 512 second-level directories)
24GB RAM cache for hot objects and metadata
4-worker process pool for high concurrency
Peer clustering across all 5 Kubernetes nodes (172.26.92.2-6) with automatic cache sharing as "siblings"
CVMFS-optimized refresh patterns for long-term caching (1440-minute, i.e. 24-hour, TTL with a 90% freshness factor)
Domain whitelisting for research networks (CERN, universities, scientific repos, PyPI, Hugging Face, etc.)
Security hardening with a read-only rootfs, a non-root user (UID 3128), and minimal capabilities (NET_BIND_SERVICE, SYS_RESOURCE)

🏗️ Components

squid.yaml - DaemonSet deployment configuration
squid_entrypoint.yaml - Main Squid config and peer discovery
squid_cache-config.yaml - Cache optimization settings
squid-whitelist.yaml - Domain whitelist for research networks
squid-namespace.yaml - Kubernetes namespace
host-setup.sh - Host directory setup script

🏗️ Architecture

This deployment runs one Squid pod per Kubernetes node with host networking enabled. This provides:

Stable IP addresses: Each node has a consistent IP (e.g., 172.26.92.2-6)
Peer cache clustering: Squid instances communicate as "siblings" to share cached content
High availability: CVMFS clients can connect to any of the 5 nodes
Host-mounted cache: Cache data persists on the host filesystem (typically a dedicated partition)

When a CVMFS client requests data:

If the local node has the object cached, it is served immediately
If not, the node queries its sibling peers for the object
If a peer has it, the node fetches it from that peer
Otherwise, the object is fetched from the upstream stratum server and cached locally
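
One way to see which of these paths a request took is to read Squid's access log. The sketch below assumes Squid's native log format, in which field 4 holds the result code (for example TCP_MISS/200) and field 9 the hierarchy code (for example SIBLING_HIT/172.26.92.3). `classify_request` is a hypothetical helper, the sample line is synthetic (written for this example, not captured from this deployment), and the exact codes can vary between Squid versions and custom log formats.

```shell
#!/usr/bin/env bash
# Classify one access.log line (native Squid format assumed) as
# "local", "peer", "upstream", or "denied".
classify_request() {
  printf '%s\n' "$1" | awk '{
    split($4, result, "/")
    split($9, hier, "/")
    if (result[1] ~ /DENIED/) {
      print "denied"
    } else if (result[1] ~ /HIT/) {
      print "local"
    } else if (hier[1] ~ /SIBLING_HIT/) {
      print "peer"
    } else {
      print "upstream"
    }
  }'
}

# Synthetic example line: a local miss that was answered by a sibling.
classify_request '1700000001.000 15 10.0.0.5 TCP_MISS/200 2048 GET http://cvmfs-stratum-one.cern.ch/cvmfs/ - SIBLING_HIT/172.26.92.3 text/html'
```

Piping a stretch of the real log through the helper and counting the labels gives a rough picture of how often peers, rather than the stratum server, satisfy local misses.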

⚙️ Configuration

Peer Discovery

Peering is configured statically for the Vulcan cluster nodes (172.26.92.2-6). At startup, each Squid instance reads the peer list and connects to the peers on the other nodes.

To modify the peer list, edit the PEERS variable in squid_entrypoint.yaml:

PEERS="172.26.92.2 172.26.92.3 172.26.92.4 172.26.92.5 172.26.92.6"
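
To illustrate how a space-separated peer list like this can become Squid configuration, here is a minimal sketch. The entrypoint's actual logic is not shown in this README, so everything here is an assumption: the `generate_peer_lines` helper is hypothetical, and the choice of sibling peers on HTTP port 3128 with ICP on port 3130 and the `proxy-only` option is only one plausible setup. The output uses Squid's `cache_peer hostname type http-port icp-port [options]` syntax.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit one cache_peer line per peer,
# skipping the node's own IP so it does not peer with itself.
generate_peer_lines() {
  local self_ip="$1" peer
  shift
  for peer in "$@"; do
    [ "$peer" = "$self_ip" ] && continue
    # Ports and the proxy-only option are assumptions, not taken from the repo.
    printf 'cache_peer %s sibling 3128 3130 proxy-only\n' "$peer"
  done
}

PEERS="172.26.92.2 172.26.92.3 172.26.92.4 172.26.92.5 172.26.92.6"
# Word splitting of $PEERS is intentional here.
generate_peer_lines "172.26.92.2" $PEERS
```

On node 172.26.92.2 this prints four lines, one for each of the other nodes.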

Resource Limits

Default configuration:

Memory: 8GB request, 32GB limit per pod
CPU: 1 core request, 4 cores limit per pod
Cache: 100GB disk cache (host-mounted), 24GB RAM cache

📖 Usage

Configure CVMFS Clients

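Provide all 5 node IPs to CVMFS for redundancy and load balancing. Note that the stock CVMFS client reads its proxy list from the CVMFS_HTTP_PROXY parameter, so CVMFS_SQUID_LIST as shown here only takes effect if your site configuration maps it to that parameter: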

export CVMFS_SQUID_LIST="http://172.26.92.2:3128|http://172.26.92.3:3128|http://172.26.92.4:3128|http://172.26.92.5:3128|http://172.26.92.6:3128"

Or configure in /etc/cvmfs/default.local:

CVMFS_SQUID_LIST="http://172.26.92.2:3128|http://172.26.92.3:3128|http://172.26.92.4:3128|http://172.26.92.5:3128|http://172.26.92.6:3128"
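
If the node list changes, the proxy string can be generated rather than edited by hand. The sketch below is one way to do that; `build_squid_list` is a hypothetical helper, not part of this repository. In CVMFS proxy-list syntax, `|` joins proxies into a single load-balanced group, while `;` would separate failover groups.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join node IPs into a single load-balanced proxy group.
build_squid_list() {
  local port="$1" list="" ip
  shift
  for ip in "$@"; do
    # Prepend "|" only when the list already has an entry.
    list="${list:+${list}|}http://${ip}:${port}"
  done
  printf '%s\n' "$list"
}

CVMFS_SQUID_LIST="$(build_squid_list 3128 172.26.92.2 172.26.92.3 172.26.92.4 172.26.92.5 172.26.92.6)"
printf 'CVMFS_SQUID_LIST="%s"\n' "$CVMFS_SQUID_LIST"
```

The printed line matches the value shown above and can be pasted into the client configuration.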

Testing

Test proxy connectivity:

curl -v --proxy http://172.26.92.2:3128 http://www.google.com

Test CVMFS stratum access:

curl --proxy http://172.26.92.2:3128 http://cvmfs-stratum-one.cern.ch/cvmfs/
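
The commands above test a single node. To check every node in one pass, something like the following sketch could be used. `check_proxy`, `check_all`, and the `TEST_URL` variable are hypothetical names introduced for this example; the default URL is the stratum URL from the test above. The sketch reports the HTTP status each proxy returns and marks a node as unreachable only when curl itself fails (for example, on a refused connection or a timeout).

```shell
#!/usr/bin/env bash
# URL fetched through each proxy; defaults to the stratum URL used above.
TEST_URL="${TEST_URL:-http://cvmfs-stratum-one.cern.ch/cvmfs/}"

# Print "<ip> HTTP <code>" if the proxy answered, "<ip> unreachable" otherwise.
check_proxy() {
  local ip="$1" port="${2:-3128}" code
  if code=$(curl --silent --output /dev/null --max-time 5 \
                 --write-out '%{http_code}' \
                 --proxy "http://${ip}:${port}" "$TEST_URL"); then
    printf '%s HTTP %s\n' "$ip" "$code"
  else
    printf '%s unreachable\n' "$ip"
    return 1
  fi
}

# Check every IP given; return non-zero if any node was unreachable.
check_all() {
  local failed=0 ip
  for ip in "$@"; do
    check_proxy "$ip" || failed=1
  done
  return "$failed"
}
```

After sourcing the block, `check_all 172.26.92.2 172.26.92.3 172.26.92.4 172.26.92.5 172.26.92.6` prints one line per node. A 403 status here would suggest that the proxy is reachable but the target domain is not on the whitelist.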

🤝 Support

Many Bothans died to bring us this information. This project is provided as-is, but reasonable questions may be answered based on my coffee intake or mood. ;)

Feel free to open an issue or email kali2@ualberta.ca for U of A related deployments.

📜 License

This project is released under the MIT License - one of the most permissive open-source licenses available.

What this means:

✅ Use it for anything (personal, commercial, whatever)
✅ Modify it however you want
✅ Distribute it freely
✅ Include it in proprietary software

The only requirement: Keep the copyright and permission notice in all copies or substantial portions of the software.

That's it! No other strings attached. The MIT License is trusted by major projects worldwide and removes virtually all legal barriers to using this code.

Full license text: see the LICENSE file in this repository.

🧠 About University of Alberta Research Computing

The Research Computing Group supports high-performance computing, data-intensive research, and advanced infrastructure for researchers at the University of Alberta and across Canada.

We help design and operate compute environments that power innovation — from AI training clusters to national research infrastructure.
