~/ai-muninn
❯ grep -r "#mtp" ~/blog
3 matches
date        read  title
2026-05-09  13m   Want MTP speedup on abliterated Gemma 4? Vanilla draft can't track the modified body
                  #gemma-4 #abliteration #mtp #speculative-decoding
2026-05-06  13m   Liftoff: Gemma 4 hits 670 tok/s aggregate on DGX Spark (108 tok/s single-stream)
                  #gemma-4 #mtp #speculative-decoding #vllm
2026-04-28  13m   [llm-compressor] Self-Quantizing a 35B Abliterated MoE to FP8 on DGX Spark: 4 OOMs, 3 Prefix Bugs, and Why the First Success Wasn't Actually FP8
                  #dgx-spark #gb10 #sm121 #llm-compressor