~/ai-muninn
❯ grep -r "#nvfp4" ~/blog
7 matches
date · read · title

2026-04-07 · 10m · [Benchmark] From 19 to 50 tok/s: We Quantized Gemma 4 E4B to NVFP4 Before Anyone Else
#gemma-4 #e4b #nvfp4 #fp8

2026-04-05 · 8m · [vLLM] Gemma 4 26B-A4B NVFP4 on DGX Spark: 52 tok/s with 16 GB of Weights
#gemma-4 #nvfp4 #vllm #dgx-spark

2026-04-05 · 6m · [Benchmark] Gemma 4 31B Dense on DGX Spark: 7 tok/s and the Bandwidth Wall
#gemma-4 #nvfp4 #vllm #dgx-spark

2026-04-05 · 7m · [Benchmark] vLLM vs Ollama on the Same Model: Why 30% Faster on GB10
#vllm #ollama #benchmark #dgx-spark

2026-03-19 · 6m · [vLLM] Qwen3.5-122B Runs. But at 14 tok/s.
#dgx-spark #sm121 #qwen3.5-122b #vllm

2026-03-17 · 10m · [vLLM] Why Your DGX Spark Only Says "!!!!!": Debugging NVFP4 on SM121
#dgx-spark #sm121 #vllm #nvfp4

2026-03-13 · 9m · [vLLM] Nemotron-3-Super-120B on a Single GB10: Full Day Debug Log
#dgx-spark #gb10 #sm121 #nemotron