dlcalc is a collection of command-line calculators and tools for deep learning practitioners, covering:
- 🧮 Performance Modeling - Estimate training throughput, memory usage, and MFU
- 🌐 Topology Analysis - Analyze and optimize network topology for distributed training
- 📊 Metrics Conversion - Convert between different performance metrics
- 🔍 Checkpoint Analysis - Inspect and summarize model checkpoints
pip install dlcalc

or

git clone https://github.com/jfc4050/dlcalc
cd dlcalc
pip install -e .

After this, the command-line tools described below should be available. You may
need to add --user to the pip install command so that the tools are installed
somewhere on your $PATH.
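
As a quick way to confirm the installation, the following minimal sketch checks which of the commands named in this README are discoverable on your $PATH. The tool list is copied from this document; the helper name `find_tools` is made up for illustration and is not part of dlcalc.

```python
import shutil

# Command names as they appear in this README.
TOOLS = [
    "3dtrn",
    "topoviz",
    "topoeval",
    "topoassign",
    "sps2mfu",
    "sps2tpd",
    "ckpt-summarize",
]


def find_tools(names=TOOLS):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name:15s} {path or 'not found on PATH'}")
```

If a tool is reported as not found, the directory pip installed the scripts into is probably missing from $PATH.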
Calculator for estimating performance characteristics of ND parallel transformer training:
3dtrn examples/llama3_70b.yaml

We recommend using this alongside profilers such as NVIDIA Nsight Systems or PyTorch Profiler to give your performance profiling a theoretical grounding.
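
To make the idea of theoretical grounding concrete, here is a rough sketch of how MFU relates to measured throughput. It assumes the common approximation of 6 FLOPs per parameter per token for a combined forward and backward pass, which ignores attention FLOPs; it is not necessarily the formula dlcalc uses, and the function name and input values are illustrative only.

```python
def model_flops_utilization(
    samples_per_sec: float,
    seqlen: int,
    n_params: float,
    n_accelerators: int,
    peak_tflops_per_accelerator: float,
) -> float:
    """Estimate MFU as achieved model FLOP/s divided by aggregate peak FLOP/s."""
    tokens_per_sec = samples_per_sec * seqlen
    # Assumption: ~6 FLOPs per parameter per token (forward + backward).
    achieved_flops_per_sec = 6 * n_params * tokens_per_sec
    peak_flops_per_sec = n_accelerators * peak_tflops_per_accelerator * 1e12
    return achieved_flops_per_sec / peak_flops_per_sec


# Illustrative inputs: a 70B-parameter model on 512 accelerators at 312 peak TFLOPS each.
mfu = model_flops_utilization(
    samples_per_sec=100,
    seqlen=2048,
    n_params=70e9,
    n_accelerators=512,
    peak_tflops_per_accelerator=312,
)
print(f"MFU: {mfu:.1%}")  # prints: MFU: 53.8%
```

Comparing an estimate like this against what a profiler reports helps show whether a gap comes from compute, communication, or idle time.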
| Tool | Command | Purpose |
|---|---|---|
| Visualizer | topoviz | Generate network topology graphs from Kubernetes clusters |
| Evaluator | topoeval | Analyze topology optimality for DP rings |
| Scheduler | topoassign | Compute topology-aware rank assignments |
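
To illustrate why rank placement matters for DP rings, the toy sketch below counts how many edges of a ring cross between switches, then reorders the nodes so that nodes under the same switch sit next to each other. This is a simplified illustration under assumed inputs (a flat node-to-switch mapping with made-up node and switch names), not a description of how topoeval or topoassign work internally.

```python
from collections import defaultdict


def cross_switch_hops(ring, switch_of):
    """Count ring edges whose two endpoints sit under different switches."""
    n = len(ring)
    return sum(
        switch_of[ring[i]] != switch_of[ring[(i + 1) % n]]
        for i in range(n)
    )


def group_by_switch(nodes, switch_of):
    """Order nodes so that nodes sharing a switch are adjacent in the ring."""
    groups = defaultdict(list)
    for node in nodes:
        groups[switch_of[node]].append(node)
    return [node for switch in sorted(groups) for node in groups[switch]]


# Hypothetical cluster: four nodes spread across two switches.
switch_of = {"n0": "s0", "n1": "s1", "n2": "s0", "n3": "s1"}
naive_ring = ["n0", "n1", "n2", "n3"]
grouped_ring = group_by_switch(naive_ring, switch_of)

print(cross_switch_hops(naive_ring, switch_of))    # prints: 4
print(cross_switch_hops(grouped_ring, switch_of))  # prints: 2
```

Real clusters have multi-level topologies, so an actual evaluation would need to weigh hops at each level rather than a single switch boundary.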
# Visualize cluster topology
topoviz -h
# Evaluate training job topology
topoeval -h
# Generate optimal rank assignments
topoassign -h

Convert training throughput to Model FLOPs Utilization (MFU):

sps2mfu --samples-per-sec 100 --seqlen 2048 --model-size 70b \
--n-accelerators 512 --tflops-per-accelerator 312

Calculate daily token throughput:

sps2tpd --samples-per-sec 100 --seqlen 2048

Analyze PyTorch checkpoint contents:

ckpt-summarize model.pt

# Install with development dependencies
pip install -e ".[dev]"
# Install pre-commit hooks
pre-commit install

# Run all checks (formatting, linting, type checking, tests)
bash checks

# Run full test suite
pytest tests/ -v
# Run with coverage
pytest tests/ --cov=dlcalc --cov-report=term-missing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.