Pull requests: NVIDIA/TensorRT-Model-Optimizer
Pull requests list
EAGLE parallel draft with auto regression; kv cache in EAGLE training
#391
opened Sep 29, 2025 by
yeyu-nvidia
[5545101]: AutoCast: Add options to force include node/op in F16
#386
opened Sep 28, 2025 by
galagam
Support kv cache quantization for mcore using bmm_quantizers
#375
opened Sep 25, 2025 by
kaix-nv
[New feature] Support fakequant serve with latest vllm and generalize calib logic
#369
opened Sep 25, 2025 by
RalphMao
Sync amax & AWQ-Lite act_scale in context parallel/data parallel [OMNIML-2813]
#359
opened Sep 24, 2025 by
jenchen13
1 task
Refactor set_multi_step_attn_mask for arbitrary step
#348
opened Sep 19, 2025 by
yeyu-nvidia
[Autocast] Fix edge case casting input directly to output
#305
opened Sep 9, 2025 by
aboubezari
Fix speculative decoding example
Label: stale (not updated in a long time)
#214
opened Jun 13, 2025 by
Framartin
Bump the pip group across 3 directories with 1 update
#205
opened Jun 5, 2025 by
dependabot (bot)