Hans Christian Thjømøe

Hans Christian Thjømøe - Blog
Notes on software architecture, AI tooling, agentic workflows, and self-hosted local AI.
https://www.neoteric.no/
Language: en-us
Author of all posts: post@neoteric.no (Hans Christian Thjømøe)

Claude Code's /simplify Stopped Fixing Code Yesterday
https://www.neoteric.no/blog/claude-code-s-simplify-stopped-fixing-code-yesterday/
Claude Code 2.1.147 renamed /simplify to /code-review and dropped the auto-fix behavior. The new command reports bugs at chosen effort levels but no longer changes code.
Published: Fri, 22 May 2026 00:00:00 GMT
Tags: ai, claude, agentic-workflows, tooling

Your Private MCP Server Is Now Claude-Reachable
https://www.neoteric.no/blog/your-private-mcp-server-is-now-claude-reachable/
Anthropic shipped MCP tunnels on May 19. Claude agents can call internal databases, ticketing systems, and on-prem APIs through one outbound connection; no inbound firewall rules required.
Published: Thu, 21 May 2026 00:00:00 GMT
Tags: ai, claude, mcp, agentic-workflows

Your vLLM Thinking Budget Was Doing Nothing With MTP On
https://www.neoteric.no/blog/your-vllm-thinking-budget-was-doing-nothing-with-mtp-on/
vLLM 0.21.0 shipped Friday with a quiet fix: thinking_token_budget was being silently ignored when MTP speculative decoding was enabled. If you serve reasoning models with spec decode, you have been paying for it.
Published: Mon, 18 May 2026 00:00:00 GMT
Tags: ai, local-ai, vllm, speculative-decoding, benchmarks

Claude Code v2.1.100+ Burns ~20K Phantom Tokens Per Request
https://www.neoteric.no/blog/claude-code-v2-1-100-burns-20k-phantom-tokens-per-request/
A server-side bug in Claude Code v2.1.100+ inflates every request by roughly 20K cache_creation tokens, about 40% overhead. Pin v2.1.98 until fixed.
Published: Sun, 17 May 2026 00:00:00 GMT
Tags: ai, claude, agentic-workflows, benchmarks, industry-signal

Your Local Qwen3.6 Throughput Probably Just Halved (and How to Fix It)
https://www.neoteric.no/blog/llama-cpp-mtp-flag-rename/
llama.cpp renamed the MTP flag on May 13. The old --spec-type mtp is silently ignored. If your tok/s dropped from 140 to 70 you are likely running without speculative decoding.
Published: Sat, 16 May 2026 00:00:00 GMT
Tags: ai, llama.cpp, qwen, local-ai, speculative-decoding

MCP Server Roundup: Which Are Actually Worth Adding to Your Setup in May 2026
https://www.neoteric.no/blog/mcp-server-roundup-may-2026/
Eighteen months after Anthropic released MCP, the ecosystem is wide enough that picking the wrong servers slows your agent down. Here is the practical short list: what to install, what to skip, and the trap most people fall into.
Published: Sat, 16 May 2026 00:00:00 GMT
Tags: ai, mcp, claude, tooling, agentic-workflows

Speculative Decoding Explained: Why Your Local Model Got 2× Faster in 2026
https://www.neoteric.no/blog/speculative-decoding-why-local-ai-got-fast/
The same Qwen3.6-27B that ran at 70 tokens/sec on a 4090 in January was running at 140 tokens/sec by April. Nothing changed about the model. Speculative decoding moved from research curiosity to default. Here is what it actually does.
Published: Sat, 16 May 2026 00:00:00 GMT
Tags: ai, local-ai, llama-cpp, performance, speculative-decoding

Third-Party Claude Agents Lose the Subscription Subsidy June 15
https://www.neoteric.no/blog/third-party-claude-agents-lose-the-subscription-subsidy-june-15/
Anthropic is splitting Claude billing on June 15: Agent SDK and ACP usage moves to a capped credit pool ($20/$100/$200) at full API rates.
Published: Sat, 16 May 2026 00:00:00 GMT
Tags: ai, claude, agentic-workflows, industry-signal

The Local AI Inflection Point: May 2026
https://www.neoteric.no/blog/local-ai-inflection-point-may-2026/
Three model releases in three weeks moved local AI from 'good enough for hobbies' to 'good enough for production'. Here's what changed and why it matters.
Published: Fri, 15 May 2026 00:00:00 GMT
Tags: ai, local-ai, qwen, gemma, self-hosted

Running Qwen3.6-27B Locally: Hardware, Quantization, and What Actually Works
https://www.neoteric.no/blog/running-qwen-3-6-27b-locally/
A practical guide to running Qwen3.6-27B on consumer hardware in 2026: memory requirements per quant level, recommended runners, and the MTP trick that doubles your tokens per second.
Published: Mon, 11 May 2026 00:00:00 GMT
Tags: ai, qwen, local-ai, llama.cpp, homelab

A 27B Model on a Single GPU Is 10 Points Off Claude Opus 4.7
https://www.neoteric.no/blog/qwen-3-6-27b-vs-claude-opus-4-7-benchmarks/
Qwen3.6-27B running locally now scores within 10 points of frontier closed models on SWE-bench Verified. The benchmark table, lined up side by side.
Published: Fri, 08 May 2026 00:00:00 GMT
Tags: ai, qwen, claude, local-ai, benchmarks

Claude Opus 4.7: 87.6% on SWE-bench and 1M Context at Standard Pricing
https://www.neoteric.no/blog/claude-opus-4-7-coding-leap/
Anthropic shipped Opus 4.7 on April 16, 2026, with a seven-point SWE-bench jump, the 1M context window now generally available with no premium, and a new task budget primitive for agent loops.
Published: Fri, 17 Apr 2026 00:00:00 GMT
Tags: ai, claude, anthropic, coding, agents

Gemma 4: Google's Open Model Family Goes Multimodal
https://www.neoteric.no/blog/google-gemma-4-open-models/
Google released Gemma 4 on April 2, 2026: four variants from 2B to 31B, with 256K context, native vision and audio, and Apache 2.0 licensing. Here's what it's for, where it fits, and how to run it.
Published: Sun, 05 Apr 2026 00:00:00 GMT
Tags: ai, google, gemma, local-ai, multimodal

What's New in Optimizely CMS 13: The Big Picture
https://www.neoteric.no/blog/optimizely-cms-13-whats-new/
Optimizely CMS 13 went GA on April 1, 2026. Visual Builder is now the default editor, Content Manager replaces tree-first navigation, Optimizely Graph and Opti ID are mandatory, and the platform jumps to .NET 10. Here's what actually changed, where it's worth caring, and what the upgrade is going to cost you.
Published: Wed, 01 Apr 2026 00:00:00 GMT
Tags: optimizely, cms, dotnet, headless

Claude Opus 4.6: A Million-Token Context and a New Agent Team Model
https://www.neoteric.no/blog/claude-opus-4-6-million-token-context-and-agent-teams/
Anthropic released Opus 4.6 on February 5, 2026, with a 1M token context beta, agent teams, adaptive thinking, and developer effort controls, all at the same price as 4.5.
Published: Fri, 06 Feb 2026 00:00:00 GMT
Tags: ai, claude, anthropic, agents

Introducing Azure DevOps Workflow: Manage Work Items Without Leaving VS Code
https://www.neoteric.no/blog/azure-devops-workflow-vscode-extension/
A VS Code extension that brings Azure DevOps sprint boards, work item management, and AI-powered assistance directly into your editor.
Published: Tue, 25 Nov 2025 00:00:00 GMT
Tags: vscode, azure-devops, productivity, open-source

Claude Opus 4.5: Anthropic's New Flagship Model Sets the Bar for AI Coding
https://www.neoteric.no/blog/claude-opus-4-5-anthropics-new-flagship/
Anthropic's latest model achieves state-of-the-art results in agentic coding and brings meaningful improvements across reasoning, mathematics, and everyday tasks.
Published: Tue, 25 Nov 2025 00:00:00 GMT
Tags: ai, claude, anthropic, coding

Google Gemini 3 Pro: The New Leader in Multimodal AI
https://www.neoteric.no/blog/google-gemini-3-pro-multimodal-reasoning/
Google's Gemini 3 Pro brings generative interfaces, 1M token context, and state-of-the-art multimodal reasoning to developers and consumers alike.
Published: Tue, 25 Nov 2025 00:00:00 GMT
Tags: ai, google, gemini, multimodal