Jinghan Yao (YJHMITWEB)

Flover (Public)

Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference

C++, 9 stars

NVIDIA/TensorRT-LLM (Public)

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++, 9k stars, 1k forks

NVIDIA/nccl (Public)

Optimized primitives for collective multi-GPU communication

C++, 3.3k stars, 838 forks

fudan-zvg/SOFT (Public)

[NeurIPS 2021 Spotlight] & [IJCV 2024] SOFT: Softmax-free Transformer with Linear Complexity

Python, 308 stars, 25 forks

microsoft/DeepSpeed (Public)

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python, 36k stars, 4.2k forks