Pull requests: SemiAnalysisAI/InferenceX
Pull requests list
[NV]Update Kimi K2.5 NVFP4 GB200 disaggregated TRT-LLM benchmarks via Dynamo
full-sweep-enabled
#1797
opened Jun 16, 2026 by
xinli-sw
Collaborator
[NV]Add Kimi K2.5 NVFP4 GB300 disaggregated TRT-LLM benchmarks via Dynamo
#1796
opened Jun 16, 2026 by
xinli-sw
Collaborator
[AMD] perf-changelog: duplicate dsv4-fp4-mi355x-sglang TP4 fixed-seq-len entry
full-sweep-enabled
#1795
opened Jun 16, 2026 by
Oseltamivir
Collaborator
1 task
chore(runners): add TensorWave MI300X docker runners (mi300x-tw)
#1793
opened Jun 16, 2026 by
cquil11
Collaborator
[WIP][NV]dsr1-fp4-b200-sglang: add DPA PDL lane
full-sweep-enabled
#1792
opened Jun 15, 2026 by
hshrivastava-droid
Collaborator
[DO NOT MERGE] Run-only: gb200 dsr1 measured power+temp (canonical NVIDIA)
sweep-enabled
#1791
opened Jun 15, 2026 by
arygupt
Collaborator
[NV] Add MiniMax-M3 FP8 B300 Dynamo vLLM recipes
full-sweep-enabled
#1788
opened Jun 15, 2026 by
Oseltamivir
Collaborator
[WIP][NV] Add MiniMax M3 B300 Dynamo vLLM recipes
#1787
opened Jun 15, 2026 by
jasonlizhengjian
Collaborator
Draft
[AMD] perf: enable FlyDSL w4a16 MoE for Kimi INT4
full-sweep-fail-fast
#1785
opened Jun 15, 2026 by
amd-asalykov
Collaborator
[NV] Update MiniMax M3 B200/B300 MTP settings
full-sweep-enabled
#1784
opened Jun 15, 2026 by
jasonlizhengjian
Collaborator
perf(vllm): fuse MiniMax M3 BF16 EP experts on MI300X
#1782
opened Jun 15, 2026 by
Oseltamivir
Collaborator
Draft
[NV] Update MiniMax M3 B300 vLLM serving settings
non-canary-full-sweep-enabled
Run the full sweep without the canary gate (full search space, no trim)
#1781
opened Jun 15, 2026 by
jasonlizhengjian
Collaborator
[WIP][NV] add glm5-fp4-gb200-dynamo-sglang
full-sweep-enabled
#1780
opened Jun 15, 2026 by
hshrivastava-droid
Collaborator
[codex] perf: fuse MiniMax M3 allreduce and Gemma RMSNorm on MI300X
full-sweep-enabled
#1778
opened Jun 15, 2026 by
Oseltamivir
Collaborator
[AMD] refactor: engine-neutral aiperf plotter + fill sglang panels
#1774
opened Jun 15, 2026 by
AMD-yanfeiwang
2 of 3 tasks
[NVIDIA][GB300] update DSR1 FP8 GB300 TRTLLM image to latest
full-sweep-enabled
#1767
opened Jun 15, 2026 by
xinli-sw
Collaborator
[Klaud Cold][Experimental][DNM] minimaxm3-fp8-mi355x-vllm-disagg: day-zero MoRI-IO disagg smoke test (1P TP8 + 1D TP8, conc 1)
non-canary-full-sweep-enabled
#1762
opened Jun 14, 2026 by
functionstackx
Collaborator
[Experimental][DNM till upstream PR merges][AMD] perf: hybrid MXFP8 MoE for MiniMax M3 on MI300X
full-sweep-enabled
#1753
opened Jun 14, 2026 by
Oseltamivir
Collaborator
Minimax m3 gb200 agg lowconc
full-sweep-enabled
#1752
opened Jun 14, 2026 by
Oseltamivir
Collaborator
MiniMax-M3 MXFP8 full sweep config for GB300
full-sweep-enabled
#1735
opened Jun 13, 2026 by
Oseltamivir
Collaborator
2 of 5 tasks
MiniMax-M3 MXFP8 full sweep config for GB200
full-sweep-enabled
#1734
opened Jun 13, 2026 by
Oseltamivir
Collaborator
1 of 2 tasks
Add b300-cw (CoreWeave B300) runner launch script and pool
#1730
opened Jun 12, 2026 by
JordanNanos
Collaborator
feat(ci): add priority label to preempt runners for high-priority sweeps
#1726
opened Jun 12, 2026 by
cquil11
Collaborator