~/ai-muninn
grep -r "#sm121" ~/blog
8 matches
date | read | title | tags
2026-04-05 | 8m | [vLLM] Gemma 4 26B-A4B NVFP4 on DGX Spark: 52 tok/s with 16 GB of Weights | #gemma-4 #nvfp4 #vllm #dgx-spark
2026-04-05 | 6m | [Benchmark] Gemma 4 31B Dense on DGX Spark: 7 tok/s and the Bandwidth Wall | #gemma-4 #nvfp4 #vllm #dgx-spark
2026-03-30 | 8m | [Benchmark] TurboQuant on GX10: Is 3-bit KV Cache Compression Actually Lossless? | #turboquant #kv-cache #quantization #vllm
2026-03-21 | 6m | [vLLM] FP8 KV Cache on GB10: Why Outputs Collapse into Repetition Loops | #vllm #fp8 #kv-cache #gb10
2026-03-19 | 11m | [vLLM] Running a 120B Model on DGX Spark at 60 tok/s — Zero API Cost, Six Bugs | #dgx-spark #sm121 #vllm #gpt-oss
2026-03-19 | 6m | [vLLM] Qwen3.5-122B Runs. But at 14 tok/s. | #dgx-spark #sm121 #qwen3.5-122b #vllm
2026-03-17 | 10m | [vLLM] Why Your DGX Spark Only Says "!!!!!": Debugging NVFP4 on SM121 | #dgx-spark #sm121 #vllm #nvfp4
2026-03-13 | 9m | [vLLM] Nemotron-3-Super-120B on a Single GB10: Full Day Debug Log | #dgx-spark #gb10 #sm121 #nemotron