Debugging a Microkernel with AI Agents

In less than a week, we used Claude Code, Anthropic’s agentic command-line tool, to find and fix seven bugs in GNU Mach’s x86_64 SMP support, bringing the kernel test suite from 1/11 tests passing to 14/14 with two CPUs.

Along the way, we built a task runner system to orchestrate multiple AI agents as background processes and tracked their work in SQLite. All of this ran on a $100/month Claude subscription.
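To make the idea of a task runner backed by SQLite concrete, here is a minimal sketch in Python. It is not the system described in the paper: the table layout, the status values, and the function names (init_db, add_task, run_all) are all assumptions made for illustration, and each "agent" is stood in for by an arbitrary child process.

```python
import json
import sqlite3
import subprocess
import sys


def init_db(conn):
    """Create a hypothetical 'tasks' table (schema is an assumption)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        " id INTEGER PRIMARY KEY,"
        " argv TEXT NOT NULL,"                    # command line, stored as JSON
        " status TEXT NOT NULL DEFAULT 'queued',"
        " exit_code INTEGER)"
    )
    conn.commit()


def add_task(conn, argv):
    """Queue one command and return its task id."""
    cur = conn.execute("INSERT INTO tasks (argv) VALUES (?)", (json.dumps(argv),))
    conn.commit()
    return cur.lastrowid


def run_all(conn):
    """Start every queued task as a background process, then record results."""
    queued = conn.execute(
        "SELECT id, argv FROM tasks WHERE status = 'queued'"
    ).fetchall()

    procs = {}
    for task_id, argv_json in queued:
        # Popen returns immediately, so all tasks run concurrently.
        procs[task_id] = subprocess.Popen(json.loads(argv_json))
        conn.execute("UPDATE tasks SET status = 'running' WHERE id = ?", (task_id,))
    conn.commit()

    for task_id, proc in procs.items():
        code = proc.wait()
        status = "done" if code == 0 else "failed"
        conn.execute(
            "UPDATE tasks SET status = ?, exit_code = ? WHERE id = ?",
            (status, code, task_id),
        )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    # Placeholder commands; a real runner would launch agent sessions here.
    add_task(conn, [sys.executable, "-c", "print('agent 1 finished')"])
    add_task(conn, [sys.executable, "-c", "print('agent 2 finished')"])
    run_all(conn)
    for row in conn.execute("SELECT id, status, exit_code FROM tasks"):
        print(row)
```

Keeping the state in a database rather than in the runner's memory means a separate session can query which tasks are queued, running, or failed without talking to the runner itself.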

The paper covers what AI agents are, the GNU Mach project context, how human-AI interactive sessions worked, the task runner architecture, a detailed walkthrough of all seven kernel bugs found and fixed, cost analysis, and reflections on what worked and what didn’t.

Download the full paper (PDF)
