Welcome to TensorRT-LLM's documentation!
========================================

.. toctree::
   :maxdepth: 1
   :caption: Contents:

   architecture.md
   gpt_runtime.md
   batch_manager.md
   inference_request.md
   gpt_attention.md
   precision.md
   build_from_source.md
   performance.md
   2023-05-19-how-to-debug.md
   2023-05-17-how-to-add-a-new-model.md
   graph-rewriting.md
   memory.md
   workflow.md
   checkpoint.md
   lora.md
   perf_best_practices.md
   performance_analysis.md


Python API
----------

.. toctree::
   :maxdepth: 2
   :caption: Python API
   :hidden:

   python-api/tensorrt_llm.layers
   python-api/tensorrt_llm.functional
   python-api/tensorrt_llm.models
   python-api/tensorrt_llm.plugin
   python-api/tensorrt_llm.quantization
   python-api/tensorrt_llm.runtime


C++ API
-------

.. toctree::
   :maxdepth: 2
   :caption: C++ API
   :hidden:

   _cpp_gen/runtime


Indices and tables
------------------

Blogs
-----

.. toctree::
   :maxdepth: 2
   :caption: Blogs
   :hidden:

   blogs/H100vsA100.md
   blogs/H200launch.md
   blogs/Falcon180B-H200.md
   blogs/quantization-in-TRT-LLM.md