~ /home/coolthor
ai-muninn
Research notes on AI infrastructure, LLM serving, and autonomous agents. Things that took too long to figure out, written down so you don't have to.
❯ whoami
hardware enthusiast running 120B models at home on DGX Spark
building options trading infrastructure with AI agents
occasionally ships iOS apps
❯ cat ~/blog/concepts
Concepts & Methods
For those who want to understand how AI works
- 2026-04-14 [Ask AI Right] The Art of Follow-Up Questions — What to Do When the First Answer Is Too Shallow
The first answer AI gives you is a rough draft, not the final answer. Learn 5 follow-up techniques — adding constraints, asking for comparisons, and letting AI ask YOU questions — to get dramatically better results.
- 2026-04-14 [LLM 101] Context Window — How Much Can AI Read at Once?
AI forgets what you said 20 messages ago. It's not broken — its desk is full. This guide explains context windows, why conversations go stale, and how to work around the limit.
- 2026-04-13 [Ask AI Right] Before You Build It, Ask: Does This Already Exist?
Your first question to AI shouldn't be 'help me do X.' It should be 'is there something that already does X?' This article teaches you how to use AI as a research assistant — finding tools, comparing alternatives, and verifying they're still alive.
- 2026-04-11 [Ask AI Right] Why AI Feels Useless to You — Answer Machine vs Collaboration Tool
Same AI, same question, different results. The people who find ChatGPT life-changing and the people who think it's useless are doing completely different things — and the difference is a single mindset shift.
- 2026-04-10 [Ask AI Right] You Don't Know What You Need — Let AI Find It
Most people don't struggle with using AI — they struggle with knowing what to use it for. This article teaches you a simple method to let AI identify the repetitive parts of your workday you've stopped noticing.
❯ cat ~/blog/field-notes
Field Notes
For those who run models and debug the hard way
- 2026-04-15 [AI Agent] Gemma 4 26B Cleared a SWE-bench Lite Instance — After 28 Tries Across Two Days
Two days running mini-swe-agent + vLLM on a GB10. From wrong doc conclusions to Gemma 4 self-submitting a clean patch in 38 steps — what actually unlocked it.
- 2026-04-15 [LLM Deep Dive] What Quantization Algorithms Actually Do: From Q4_K_M to TurboQuant
How does Q4_K_M squeeze a 14B model down to roughly 4 bits per weight without ruining it? Not by 'cutting off 75%', but through three layers: K-quant super-blocks, TurboQuant random rotation, and a 1-bit JL sign sketch. A mechanism walkthrough without the equations.
- 2026-04-13 [Claude Code] Build a Self-Auditing Skill That Keeps Your Config Lean
Your CLAUDE.md and MEMORY.md grow silently until they eat 10K+ tokens per turn. I built a /slim skill that lets Claude diagnose and fix its own bloat — here's how.
- 2026-04-13 Claude Code Burning Through Tokens? 8 Fixes to Make Sessions Last 10x Longer
You just started using Claude Code and the context window keeps filling up. Here's where the tokens actually go, what you can do about it, and how to make Claude remember things without re-reading everything.
- 2026-04-13 [DGX Spark] From Unboxing to Running: Complete Deployment Guide
Everything you need to go from a sealed DGX Spark box to serving your first local LLM. Hardware check, Ollama quickstart, vLLM production setup, model selection, and the 5 gotchas that cost hours.