Thanks for your interest in contributing! TurboQuant is the first open-source implementation of the TurboQuant paper (ICLR 2026).
git clone https://github.com/OnlyTerp/turboquant.git
cd turboquant
pip install -e ".[dev]"
pytest src/test_turboquant.py -v

- Benchmarking on more models: We've validated on Mistral-7B and Nemotron-Nano-4B. Results on additional models are welcome.
- Triton kernel correctness: `kernels.py` is experimental and uses Rademacher S matrices (see IMPLEMENTATION_NOTES.md). It needs validation.
- vLLM integration: The plugin scaffold in `vllm_plugin/` needs real-world testing.
- Performance optimization: The pure PyTorch path is correct but slow. GPU acceleration is welcome.
- More bit-width configurations: The paper shows results at 2.5-bit and 3.5-bit. We support both but need more testing.
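
If you pick up one of the validation items, a parity check between a trusted path and the path under test is a reasonable starting point. The sketch below is only an illustration: `check_parity`, `max_abs_diff`, `reference_path`, and `fast_path` are hypothetical names, not functions from this repository, and it uses plain Python lists so it runs without a GPU. For real tensors you would more likely compare outputs with something like `torch.testing.assert_close`.

```python
import random
from typing import Callable, List, Sequence

Vector = List[float]


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the largest element-wise absolute difference between a and b."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def check_parity(
    reference_fn: Callable[[Vector], Vector],
    candidate_fn: Callable[[Vector], Vector],
    dim: int = 64,
    trials: int = 20,
    atol: float = 1e-6,
    seed: int = 0,
) -> float:
    """Run both functions on the same seeded random inputs.

    Returns the worst absolute difference observed and raises
    AssertionError if it exceeds atol.
    """
    rng = random.Random(seed)  # seeded so failures are reproducible
    worst = 0.0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        worst = max(worst, max_abs_diff(reference_fn(x), candidate_fn(x)))
    assert worst <= atol, f"paths disagree: max abs diff {worst:.3e} > atol {atol:.1e}"
    return worst


# Hypothetical stand-ins: replace with the slow reference path and the
# accelerated path you are validating.
def reference_path(x: Vector) -> Vector:
    return [2.0 * v for v in x]


def fast_path(x: Vector) -> Vector:
    return [v + v for v in x]


if __name__ == "__main__":
    print("worst abs diff:", check_parity(reference_path, fast_path))
```

Seeding the inputs keeps a failing case reproducible, which makes it easier to attach to a PR or an issue. Quantized outputs are lossy by design, so choose `atol` to match the comparison you are actually making.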
- Python 3.10+
- Type hints on public API functions
- Docstrings on all public classes and functions
- `pytest` for tests: run `pytest src/test_turboquant.py -v` before submitting
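
As a rough illustration of the style points above, here is what a small public function with type hints and a docstring might look like. `average_bit_width` is a hypothetical helper invented for this example. It is not part of the TurboQuant API and does not describe how the repository or the paper arrives at fractional bit widths.

```python
from typing import Sequence


def average_bit_width(bits_per_channel: Sequence[int]) -> float:
    """Return the mean number of bits used per channel.

    Hypothetical example function, shown only to illustrate the
    expected docstring and type-hint style.

    Args:
        bits_per_channel: Bit width assigned to each channel. Every
            entry must be a positive integer.

    Returns:
        The arithmetic mean of the per-channel bit widths.

    Raises:
        ValueError: If the sequence is empty or contains a
            non-positive entry.
    """
    if len(bits_per_channel) == 0:
        raise ValueError("bits_per_channel must not be empty")
    if any(b <= 0 for b in bits_per_channel):
        raise ValueError("all bit widths must be positive")
    return sum(bits_per_channel) / len(bits_per_channel)


def test_average_bit_width_mixed() -> None:
    # pytest-style test: half the channels at 2 bits, half at 3 bits.
    assert average_bit_width([2] * 64 + [3] * 64) == 2.5
```

Placing a matching `test_...` function next to new functionality lets `pytest` pick it up automatically.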
- Fork the repo and create a feature branch
- Add tests for new functionality
- Ensure all tests pass: `pytest src/test_turboquant.py -v`
- Update documentation if needed
- Open a PR with a clear description of what changed and why
When reporting a bug, please include:
- Python and PyTorch versions
- GPU model (if applicable)
- Minimal reproduction code
- Full error traceback
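
To gather the version and GPU details listed above, a short script along these lines may help. It is a minimal sketch: `collect_environment` is a hypothetical helper, not something shipped with this repository, and it falls back gracefully when PyTorch or CUDA is unavailable.

```python
import platform
from typing import Dict


def collect_environment() -> Dict[str, str]:
    """Collect Python, PyTorch, and GPU details for a bug report."""
    info: Dict[str, str] = {"python": platform.python_version()}
    try:
        import torch  # imported lazily so the script also runs without PyTorch
    except ImportError:
        info["torch"] = "not installed"
        return info
    info["torch"] = str(torch.__version__)
    if torch.cuda.is_available():
        # Reports the first visible CUDA device only.
        info["gpu"] = torch.cuda.get_device_name(0)
    else:
        info["gpu"] = "none (CUDA not available)"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Paste the printed lines into the report alongside your reproduction code and traceback.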
By contributing, you agree that your contributions will be licensed under the MIT License.