NeuraLiying/awesome-speculative-decoding

Awesome Speculative Decoding

A curated list of speculative decoding papers, updated continuously.


Training-based Methods

Better & Faster Large Language Models via Multi-Token Prediction [ICML 2024]

  • Authors: Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Rozière, David Lopez-Paz, Gabriel Synnaeve
  • Year: 2024
  • arXiv: arxiv.org/abs/2404.19737
  • GitHub: N/A

MEDUSA: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads [ICML 2024]

  • Authors: Tianle Cai, Yuhong Li, Zhengyang Geng, Hongwu Peng, Jason D. Lee, Deming Chen, Tri Dao
  • Year: 2024
  • arXiv: arxiv.org/abs/2401.10774
  • GitHub: github.com/FasterDecoding/Medusa

EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty [ICML 2024]

EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees [EMNLP 2024]

EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test


Training-Free Methods

Accelerating Auto-Regressive Text-to-Image Generation with Training-Free Speculative Jacobi Decoding [ICLR 2025]

Break the Sequential Dependency of LLM Inference Using LOOKAHEAD DECODING [ICML 2024]


Hybrid & Compositional Methods

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding [ACL 2024]

About

A curated list of speculative decoding papers, code, and resources for efficient LLM/MLLM inference.
