#sm121 — Blog — ai-muninn

❯ grep -r "#sm121" ~/blog

18 matches

date read title
2026-06-11 16m
[Benchmark] Qwen3.5-122B on DGX Spark — 2× faster
#qwen3.5 #dgx-spark #gb10 #gdn
2026-06-01 10m
[Benchmark] NVFP4 W4A4 beats FP8 on a DGX Spark MoE: 67 vs 52 tok/s once CUDA graphs fire
#nvfp4 #w4a4 #fp8 #dgx-spark
2026-05-30 9m
NVFP4 is 1.5× FP8 on a DGX Spark — but it's compression, not the FP4 cores
#nvfp4 #fp8 #dgx-spark #gb10
2026-05-06 15m
Liftoff: Gemma 4 hits 670 tok/s aggregate on DGX Spark (108 tok/s single-stream)
#gemma-4 #mtp #speculative-decoding #vllm
2026-05-04 12m
[Field Guide] Z-Image Turbo — choosing the right config (1.37× faster, 44% less RAM)
#z-image #comfyui #nvfp4 #fp8
2026-05-01 14m
[vLLM] Nemotron 3 Nano on DGX Spark: 74.75 tok/s NVFP4 — 11.5% Past the Public Baseline
#nemotron-3 #nvfp4 #vllm #dgx-spark
2026-04-28 14m
[llm-compressor] Self-Quantizing a 35B Abliterated MoE to FP8 on DGX Spark: 4 OOMs, 3 Prefix Bugs, and Why the First Success Wasn't Actually FP8
#dgx-spark #gb10 #sm121 #llm-compressor
2026-04-22 14m
[Hands-On] Making NVFP4 17% Faster on GB10 with a Triton FP8 Bypass
#nvfp4 #fp8 #triton #dgx-spark
2026-04-21 8m
[Benchmark] NVFP4 Is a Trap on GB10: FP8 Wins by 32% (vLLM + SGLang Tested)
#nvfp4 #fp8 #dgx-spark #gb10
2026-04-13 8m
[DGX Spark] From Unboxing to Running: Complete Deployment Guide
#dgx-spark #gb10 #gx10 #vllm
2026-04-05 6m
[Benchmark] Gemma 4 31B Dense on DGX Spark: 7 tok/s and the Bandwidth Wall
#gemma-4 #nvfp4 #vllm #dgx-spark
2026-04-05 9m
Gemma 4 26B-A4B on DGX Spark: 52 tok/s with NVFP4, skip the 31B
#gemma-4 #nvfp4 #vllm #dgx-spark
2026-03-30 8m
[Benchmark] TurboQuant on GX10: Is 3-bit KV Cache Compression Actually Lossless?
#turboquant #kv-cache #quantization #vllm
2026-03-21 6m
[vLLM] FP8 KV Cache on GB10: Why Outputs Collapse into Repetition Loops
#vllm #fp8 #kv-cache #gb10
2026-03-19 12m
[vLLM] Running a 120B Model on DGX Spark at 60 tok/s — Zero API Cost, Six Bugs
#dgx-spark #sm121 #vllm #gpt-oss
2026-03-19 7m
[vLLM] Qwen3.5-122B Runs. But at 14 tok/s.
#dgx-spark #sm121 #qwen3.5-122b #vllm
2026-03-17 11m
[vLLM] Why Your DGX Spark Only Says "!!!!!": Debugging NVFP4 on SM121
#dgx-spark #sm121 #vllm #nvfp4
2026-03-13 10m
[vLLM] Nemotron-3-Super-120B on a Single GB10: Full Day Debug Log
#dgx-spark #gb10 #sm121 #nemotron
