Pull requests: NVIDIA/Model-Optimizer
#1111: Add bypass distillation (blockwise local KD) to puzzletron pipeline (opened Mar 24, 2026 by Separius)
#1109: Add MBridge pruning distillation CICD test + min transformers bumped to 4.56 (opened Mar 24, 2026 by kevalmorabia97)
#1105: [OMNIML-3776]: add clear docs restrict the model types (opened Mar 23, 2026 by shengliangxu)
#1104: fix: EAGLE mix_hidden_states in-place op crash (#1088) (opened Mar 23, 2026 by javierdejesusda; 6 tasks done)
#1096: Added general graph surgery run function for easier scalability with Olive. (opened Mar 23, 2026 by hthadicherla)
#1094: [OMNIML-3689] PTQ quant_cfg semantic correction. Design in doc _quant_cfg.rst (opened Mar 22, 2026 by shengliangxu)
#1089: add: HF PTQ support and modelopt_recipes mount in launcher (opened Mar 20, 2026 by ChenhanYu)
#1083: Exclude small-channel Conv nodes from FP8 quantization (opened Mar 20, 2026 by nv-samcheng)
#1081: [3/n] Add skip-softmax to Triton flash attention kernel (opened Mar 20, 2026 by kaix-nv)
#1080: fix: [modelopt 0.43.0][GB200][llm_ptq / sglang] Llama-3.1-8B-Inst (#5997673) (opened Mar 20, 2026 by ChenhanYu)
#1078: [2/n] Add sparse softmax to the Triton flash attention kernel (opened Mar 19, 2026 by kaix-nv)
#1077: fix: [ModelOpt-Windows][modelopt 0.43.0] [genai_llm][README]: Sho (#5997787) (opened Mar 19, 2026 by ChenhanYu)
#1074: fix: Feature: Add validation for loaded modelopt state files (#1041) (opened Mar 19, 2026 by ChenhanYu)
#1061: [minor] Refactor TE fused-norm handling in GPTModelExporter (opened Mar 17, 2026 by yueshen2016)
#1060: Add LoRA co-training support for HF EAGLE speculative decoding (opened Mar 17, 2026 by yeyu-nvidia)