NVIDIA/srt-slurm

srtctl

Command-line tool for running distributed LLM inference benchmarks on SLURM clusters with TensorRT-LLM, SGLang, and vLLM. It replaces complex shell scripts and 50+ CLI flags with declarative YAML configuration.
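To give a feel for the declarative approach, here is a hypothetical sketch of what a benchmark config might look like. The key names below are illustrative only, not the actual srtctl schema; consult the documentation linked below for the real fields.

```yaml
# Hypothetical config sketch -- key names are illustrative, not the
# actual srtctl schema; see the srtctl docs for the real fields.
name: llama-baseline            # job name (illustrative key)
backend: trtllm                 # e.g. trtllm, sglang, or vllm (illustrative)
model: meta-llama/Llama-3.1-8B-Instruct
nodes: 2                        # SLURM nodes to request (illustrative)
benchmark:
  concurrency: [1, 8, 32]      # load sweep (illustrative)
  input_len: 1024
  output_len: 128
```

A config like this would be validated with `srtctl dry-run -f config.yaml` and submitted with `srtctl apply -f config.yaml` (see Commands below).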

Quick Start

# Clone and install
git clone https://github.com/your-org/srtctl.git
cd srtctl
pip install -e .

# One-time setup (downloads NATS/ETCD, creates srtslurm.yaml)
make setup ARCH=aarch64  # or ARCH=x86_64

Documentation

Full documentation: https://srtctl.gitbook.io/srtctl-docs/

Commands

# Submit job(s)
srtctl apply -f config.yaml

# Submit with custom setup script
srtctl apply -f config.yaml --setup-script custom-setup.sh

# Submit with tags for filtering
srtctl apply -f config.yaml --tags experiment,baseline

# Dry-run (validate without submitting)
srtctl dry-run -f config.yaml

# Launch analysis dashboard
uv run streamlit run analysis/dashboard/app.py

About

NVIDIA Inference Benchmarks provide ready-to-use recipe templates for evaluating platform performance. Validate your platform for specific AI use cases across hardware and software combinations.
