❯ grep -r "#quantization" ~/blog
3 matches
date        read  title
2026-04-07  10m   [Benchmark] From 19 to 50 tok/s: We Quantized Gemma 4 E4B to NVFP4 Before Anyone Else  #gemma-4 #e4b #nvfp4 #fp8
2026-03-30  8m    [Benchmark] TurboQuant on GX10: Is 3-bit KV Cache Compression Actually Lossless?  #turboquant #kv-cache #quantization #vllm
2026-03-21  6m    [vLLM] FP8 KV Cache on GB10: Why Outputs Collapse into Repetition Loops  #vllm #fp8 #kv-cache #gb10