Research: Reproducible benchmarks for batch-invariant LLM inference across models & GPUs (A10, A100, H100)
research cuda pytorch triton benchmarks gpu-kernels vllm llm-inference batch-invariance deterministic-inference
Updated Sep 28, 2025 - Python