❯ grep -r "#llm-compressor" ~/blog
2 matches
date        read  title
2026-04-28  13m   [llm-compressor] Self-Quantizing a 35B Abliterated MoE to FP8 on DGX Spark: 4 OOMs, 3 Prefix Bugs, and Why the First Success Wasn't Actually FP8
                  #dgx-spark #gb10 #sm121 #llm-compressor
2026-04-07  10m   [Benchmark] From 19 to 50 tok/s: We Quantized Gemma 4 E4B to NVFP4 Before Anyone Else
                  #gemma-4 #e4b #nvfp4 #fp8