~/ai-muninn
tag / flash-attention
❯ grep -r "#flash-attention" ~/blog
1 match
date        read  title
2026-06-14  10m   [Just for Fun] On a GTX 970, Flash Attention nearly doubles long-context decode (24.3 → 42.5 tok/s)
#gemma-4 #gtx-970 #flash-attention #kv-cache