Iman Tabrizian (Tabrizian)

NVIDIA/TensorRT-LLM Public

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++, 10.4k stars, 1.4k forks

triton-inference-server/server Public

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python, 9.2k stars, 1.6k forks

triton-inference-server/python_backend Public

Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python.

C++, 606 stars, 166 forks

triton-inference-server/model_analyzer Public

Triton Model Analyzer is a CLI tool that helps you better understand the compute and memory requirements of Triton Inference Server models.

Python, 474 stars, 78 forks

learning-to-quantize Public

Code for "Adaptive Gradient Quantization for Data-Parallel SGD", published at NeurIPS 2020.

Jupyter Notebook, 30 stars, 5 forks