Baby steps into semi-automatic coding

So I did a thing. I spent time this week building an actual project using an AI coding agent. I ended up with 11,000 lines of code that actually work. To be clear: it wasn't great code—lots of boilerplate, plenty of "I would have written this eventually anyway" stuff—but it did what I intended it to do. More importantly, it got done without me having to fight my ADHD through every tedious implementation detail.

Finding a workflow

Inspired by Harper Reed's LLM codegen workflow, I stumbled toward something that worked. The main thing? Don't just throw wishes at the LLM and expect anything useful to happen. Instead, I developed what I'll generously call a "process":

The Three Seashells (er, I mean files)

Each coding session became its own contained exploration:

  1. Start fresh: New directory of docs in a new git branch (see the sketch after this list)
  2. Write a spec.md: A couple hundred words of "here's what I actually want"
  3. Get a plan.md: Ask the AI to turn my vague intentions into concrete steps
  4. Iterate on the plan: Edit it, have the AI critique it, answer questions, repeat until it's actually coherent
  5. Let the agent loose: "Okay, enact the plan"—or sometimes "Make it so"
  6. Babysit appropriately: Review the code as it emerges, redirect when it goes off the rails
  7. Wrap up with notes.md: Have the AI reflect on what we built and what we learned
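
The setup in steps 1 and 2 is mechanical enough to script. Here's a minimal sketch in Python, assuming a docs/ directory at the repo root; the branch naming scheme and the start_session helper are my own invention, not part of any particular tool:

    # Hypothetical scaffold for one coding session: new branch, new docs
    # directory, empty spec/plan/notes files. Run from the repo root.
    import subprocess
    from pathlib import Path

    def start_session(name: str) -> Path:
        subprocess.run(["git", "checkout", "-b", f"session/{name}"], check=True)
        session_dir = Path("docs") / name
        session_dir.mkdir(parents=True, exist_ok=True)
        for doc in ("spec.md", "plan.md", "notes.md"):
            (session_dir / doc).touch()
        return session_dir

    # e.g. start_session("feed-parser")  # made-up session name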

Interesting things happened in that final notes file. I asked the LLM specifically to capture all the unexpected detours from the plan and to derive "lessons learned" from them. That prompting yielded notes that were genuinely useful.
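
For the record, a prompt along these lines captures the idea (an illustration, not my exact wording):

    Review plan.md and the code we wrote this session. List every place we
    deviated from the plan, explain why it happened, and note what lesson
    each deviation suggests for next time. Write the result to notes.md.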

Model musical chairs

I bounced between Claude 3.7 Sonnet, GPT-4.1, and SWE-1. Each had its own personality.

Sometimes I'd switch models mid-session just to see what would happen. It was like watching different programmers take over the same project—each had their own quirks and texture.

What actually worked

Accidental waterfall (but slower)

Looking back, I stumbled into what I now see Harper Reed calls "waterfall in 15 minutes": the structured planning-then-execution cycle that AI coding seems to encourage naturally. Except I was stretching those 15 minutes into longer, more deliberate sessions. Maybe that's inefficient, but the planning phase felt too important to rush through.

Scope management is everything

Don't try to boil the ocean. A few times I asked the AI to bite off more than it could chew, probably blowing out context windows in the process. When that happened, I'd stop, back up, and break the problem into smaller pieces.

One trick that worked well: get the plan dialed in, then start a fresh chat with just the plan and existing code as context. Less noise, more focus.
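
If you wanted to make that restart repeatable, the context assembly is simple to script. A sketch, assuming the plan lives in plan.md and you hand-pick the relevant source files (both names are placeholders):

    # Build a fresh-chat prompt from just the plan plus the files it
    # touches, leaving the earlier planning back-and-forth behind.
    from pathlib import Path

    def build_context(plan_path: str, relevant_files: list[str]) -> str:
        parts = [Path(plan_path).read_text()]
        for path in relevant_files:
            parts.append(f"\n--- {path} ---\n{Path(path).read_text()}")
        return "\n".join(parts)

    # e.g. build_context("docs/feed-parser/plan.md", ["src/feed.py"])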

Git commits as save states

I got comfortable making lots of small commits—treating them like save points in a game. Whenever things were in decent shape, commit and clean up later. I forgot this advice a few times and lost progress when sessions went sideways. But, even then, I could usually scroll back through the chat history and recreate the good bits.
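
If I were doing it again, I might automate the habit. A tiny checkpoint helper, assuming you're happy sweeping everything into a work-in-progress commit (the message format is just a placeholder):

    # Save point: stage everything and commit, so a derailed session can
    # be rolled back with plain git instead of chat-history archaeology.
    import subprocess

    def checkpoint(message: str = "wip: checkpoint") -> None:
        subprocess.run(["git", "add", "-A"], check=True)
        subprocess.run(["git", "commit", "-m", message], check=True)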

Occasional vibey riffing

The structured approach worked great for focused implementation, but sometimes I'd wander into chatty iteration mode for small tweaks and fussy updates. Both approaches have their place.

I'm seeing that agentic coding doesn't foster flow state immersion. And, honestly? That might be perfect for my ADHD brain. Instead of needing those long, uninterrupted blocks of deep focus, I could context-switch between planning, watching the AI work, and reviewing results. Turns out the "zone" might be overrated when you have a bot for a coding partner.

The surprising parts

What caught me off guard was how much this approach actually accomplished. I'm getting less surprised by where and how it fails—the failures feel manageable, and the results are serviceable.

It's not that the LLM is writing better code than I would (it's not), but it's handling all the tedious implementation work that my brain likes to avoid. The code looks a lot like what I would have written if I'd had the executive function to work at it as doggedly as a machine does.

Those notes.md files turned out to be genuinely useful—having the AI reflect on our entire collaboration captured insights I wouldn't have noticed otherwise. I keep getting distracted by trying to understand how the partnership works, not just making it more efficient. Probably inefficient compared to racing toward full automation, but it feels like there might be something worth learning in staying curious about the process itself.

Future experiments

I had a meta-conversation with Claude about improving this whole process. One interesting idea: use big cloud models for the planning phase, switch to a local model for implementation (save those tokens!), then switch back to a big model for the final summary.
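
In code, that idea is just a routing table keyed by phase. A sketch with placeholder model names, not a configuration I've actually run:

    # Hypothetical per-phase model routing: heavyweight models where the
    # reasoning matters, a cheaper local model for the bulk token burn.
    PHASE_MODELS = {
        "planning": "big-cloud-model",    # spec.md -> plan.md iteration
        "implementation": "local-model",  # enact the plan, cheaply
        "summary": "big-cloud-model",     # reflect into notes.md
    }

    def model_for(phase: str) -> str:
        return PHASE_MODELS[phase]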

I could also see bundling this whole workflow into its own agentic system. Because apparently what I need in my life is more automation of my automation. (Yo dawg.)

More immediately, I should probably start capturing some of the lessons from all these notes.md files into proper configuration files: .windsurfrules, .cursorrules, a CLAUDE.md file. Right now I'm learning the same lessons over and over across different sessions. Maybe it's time to start accumulating that meta-advice instead of rediscovering it each time.
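
For a flavor of what might accumulate there, here are a few rules pulled from this week's lessons (hypothetical contents; the exact file name and format depend on the tool):

    - Keep changes small; commit whenever the code is in decent shape.
    - Don't bite off more than one plan step at a time.
    - Start implementation in a fresh chat with only plan.md and the
      relevant files as context.
    - End every session by updating notes.md with detours and lessons learned.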

Summing up

I think I'm getting comfortable with this stuff, though clearly I have a lot more to learn. It's not magic, and you shouldn't trust it to write your next startup on its own. But as a way to get through the implementation grind while your brain does the interesting thinking? It's pretty solid.

Now excuse me while I go tinker further with 11,000 lines of moderately adequate code.
