6 practices to add to your build stack

Adding rigor and discipline to your coding harness and build process

May 13, 2026

Two hours into building Baohua’s story-recording flow, after a long stretch of auto-accepts, Claude generated a PR. I opened the preview and the design was completely wrong. The agent had conflated the story-recording and curation flows, and made poor decisions about the app’s information hierarchy.

I closed the PR, started over, and burned another half day pulling the flow apart. Fifteen minutes of back-and-forth on an HTML prototype would have prevented every one of those mistakes. The second time around, with the prototype in hand, the feature got built correctly.

That moment is the entire thesis of this post.

My first post in this series, “The Solo Production Stack,” was about what tools to use to ship a solo app in 2026. This one is about how to operate them.

The default mode of AI-assisted building is yes. Yes to the next PR. Yes to the agent’s interpretation of what you wanted. Yes to the abstraction it invented this morning. Yes to the framework’s defaults. Saying yes is how you ship an average product.

Building something meaningful needs the discipline of knowing when to stop your AI agent, redirect it, and say no to its recommendations.

Below are six practices I picked up shipping Baohua, a bilingual story-sharing app I built with Claude Code. Each one helped me add rigor, discipline, and my own product taste to the process, and lift the product above the mediocre version Claude would’ve built on its own. They’re what separates building something average from building something meaningful.

None of these practices are revolutionary. They’re fundamentals of building good software. But AI coding tools pull you toward what the models were trained on, which is an average result. The defaults nudge you into mindlessly saying yes, and yes again, until you’ve shipped something the agent decided on for you.

Breaking out of that cycle takes a deliberate effort to keep applying critical thinking and rigor at every step. Keeping these principles in the back of my mind has been the thing that keeps the quality bar where I want it.

1. Prototype before build.

After the story-recording PR, I changed the rule: no new feature gets touched by an agent until I’ve approved the UX in an HTML prototype first. This step forces me to make the taste decisions (what goes where, what hierarchy matters, how the user moves) before the agent gets a chance to make them for me. Doing this helps me save hours of wasted coding and thousands of tokens.

My harness, gstack, ships with /plan-ceo, /plan-design, and /design-shotgun. They’re powerful, but if you just tell Claude Code to work on the next feature, it’ll skip past them and start making assumptions in code. I added routing into my gstack to flag features that affect the UI or user experience. Now it generates prototypes for me to review before touching the codebase.
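A minimal sketch of what that kind of routing could look like. This is not gstack's actual mechanism: the keyword list, the function names, and the two route labels are all hypothetical, and a real harness would likely use richer signals than keyword matching.

```python
import re

# Hypothetical keyword list: words that suggest a task touches the UI or UX.
UI_KEYWORDS = ("screen", "layout", "button", "flow", "page", "modal", "onboarding")


def needs_prototype(task_description: str) -> bool:
    """Return True when the task description mentions any UI-related keyword."""
    words = set(re.findall(r"[a-z]+", task_description.lower()))
    return any(keyword in words for keyword in UI_KEYWORDS)


def route(task_description: str) -> str:
    """Pick a (hypothetical) route label: prototype first, or go straight to code."""
    if needs_prototype(task_description):
        return "prototype-first"
    return "straight-to-code"
```

With this sketch, a task like "Add a share story button to the story screen" is routed to a prototype review, while "Fix memory leak in audio recorder" goes straight to code.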

When I know what I want, I iterate on the prototype to lock the scope before any real code gets written. When the feature is still just an idea, I run /design-shotgun first to widen the option space before narrowing it.

This is the same discipline as working with a designer before you build. Get clear on what you’re building and why, translate that into a clean user experience, then implement. It’s tempting to skip the step because the agent will let you. The result is sloppy UX every time.

2. Three is the cap.

I had six Claude sessions running in parallel one afternoon. I caught myself cycling through windows, clicking “yes,” approving work I hadn’t actually read. By the time I noticed, two of the sessions had drifted onto overlapping work and my critical thinking had gone offline somewhere around session four.

I felt productive and produced nothing. Three concurrent sessions is the most I can run while keeping a real model of what the system is doing. It’s tempting to spin up as many agents as possible to maximize parallelism, but agents don’t have the best judgment (see practice 1).

When I run three sessions, one handles the meaty work, like a backend flow where I need to follow the logic, or a feature with real architectural choices in it. The other two handle small surface changes: tightening email copy, adjusting a screen’s layout, fixing a typo.

The principles behind the cap: your mind can only deeply focus on one thing at a time, and that focus is the most valuable resource you bring to the build. Allocate it accordingly. Keep parallel tasks far enough apart that the agents don’t step on each other’s commits. And lean on Linear instead of spinning up another session every time a new idea hits. Slowing down to define a ticket, track it, and execute against it is the discipline that compounds. Slow is smooth, and smooth is fast.
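One way to make "far enough apart" concrete is to compare the files each session is expected to touch before starting them. The sketch below assumes you can write down a rough set of paths per session; the session names and paths are hypothetical.

```python
from itertools import combinations


def overlapping_sessions(planned_paths: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return (session_a, session_b, shared_paths) for every pair that shares a file."""
    conflicts = []
    # Sort the session names so the output order is deterministic.
    for a, b in combinations(sorted(planned_paths), 2):
        shared = planned_paths[a] & planned_paths[b]
        if shared:
            conflicts.append((a, b, shared))
    return conflicts


# Hypothetical plan: one meaty backend session and two small surface changes.
plan = {
    "backend": {"api/interviews.py", "db/schema.sql"},
    "copy": {"emails/invite.html"},
    "layout": {"screens/home.tsx", "db/schema.sql"},
}
conflicts = overlapping_sessions(plan)
```

Here the "backend" and "layout" sessions both plan to touch `db/schema.sql`, which is a signal to sequence them instead of running them in parallel.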

3. Scope is a daily fight.

AI’s default mode is maximalism. Ask for a “share story” button and the agent will quietly draft a notifications framework, a sharing-permissions DSL, and analytics events for everything.

Subtraction is taste. Only add complexity if it's truly necessary. The best products are simple, elegant, and delightfully easy. This is something AI does not understand.

This compounds with practices 1 and 2. When you prototype, you catch scope creep before code gets written. When you run too many parallel sessions, you miss it entirely.

I learned this the hard way. While building the interview scheduling system, Claude sneaked in the idea of an expired interview. The system would mark an open question prompt to a parent as expired if no activity happened after 48 hours, then close the interview entirely. Parents often take more than two days to answer a question you send them. There’s no way I’d have added that on purpose.

But I was running too many sessions and didn’t catch it until it caused a bug for a real user, whose mom tried to answer a question four days after she received it. I spent a whole day picking through the slop of the question-prompting system to undo a feature that should never have been there.

4. Two flavors of taste. Both matter.

Engineering taste catches the unnecessary abstraction. Product taste catches the unmet user need. The agent will miss both.

Engineering taste first. Codex caught some of the engineering misses for me. A memory leak in the audio recorder. A Supabase RLS policy that let users read other families’ stories. Gstack’s /review skill is solid for the obvious stuff: race conditions, idempotency, RLS policies.

Claude’s engineering taste falls short on higher-level design choices.

Take localization. Baohua translates most of the product experience into Simplified and Traditional Chinese. Claude’s first instinct was to scatter one-off localizations across the product copy. I had to walk it through centralizing the language system: one source of truth for English and Chinese today, with room for Korean and Japanese tomorrow.
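As a rough illustration of the "one source of truth" idea, here is a minimal sketch of a centralized message catalog with an English fallback. It is not Baohua's actual implementation: the keys, the locale codes, and the strings are placeholders chosen for the example.

```python
# Single catalog for all product copy. Adding Korean or Japanese later means
# adding one more entry here, not hunting through the codebase.
MESSAGES = {
    "en": {"record_story": "Record a story", "share_story": "Share story"},
    "zh-Hans": {"record_story": "录制故事", "share_story": "分享故事"},
    "zh-Hant": {"record_story": "錄製故事"},  # share_story not yet translated
}
DEFAULT_LOCALE = "en"


def t(key: str, locale: str) -> str:
    """Look up a string for the locale, falling back to the default locale."""
    catalog = MESSAGES.get(locale, {})
    if key in catalog:
        return catalog[key]
    # Raises KeyError if the key does not exist in the default catalog either.
    return MESSAGES[DEFAULT_LOCALE][key]


def missing_keys(locale: str) -> list[str]:
    """List keys present in the default locale but absent from this one."""
    return sorted(set(MESSAGES[DEFAULT_LOCALE]) - set(MESSAGES.get(locale, {})))
```

Because every string lives in one place, a check like `missing_keys("zh-Hant")` can surface untranslated copy before it ships, which scattered one-off localizations cannot do.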

A decent engineer would have caught this on day one. Claude needed to be told. That’s the gap between a vibe-coded prototype and a codebase that won’t bleed regressions six months in.

Product taste second. Claude Code knew how to implement magic links for auth, but did a poor job of integrating them into the user experience. The first pass shipped magic links to storytellers in Baohua with one-hour expirations and no other way to log in.

Imagine your mom getting an email she can only act on within the hour, after which she has to call you for a new one. That’s the kind of product taste your AI agent has.

I spent the better part of three days walking through every auth permutation for every user type. Magic links, Google auth, email and password. The goal was a consistent flow with as little friction as possible. There’s no AI shortcut for this. You have to walk every flow as the person who’s actually going to use it.
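To see why the first pass failed, it helps to walk the expiry logic as a user would. The sketch below assumes a simple time-to-live check and a hypothetical list of fallback options; the seven-day value and the option names are illustrative, not the settings Baohua ended up with.

```python
from datetime import datetime, timedelta, timezone


def link_is_valid(issued_at: datetime, now: datetime, ttl: timedelta) -> bool:
    """A magic link is valid from the moment it is issued until the TTL elapses."""
    return timedelta(0) <= now - issued_at <= ttl


def login_options(link_valid: bool) -> list[str]:
    """Hypothetical fallback: an expired link should never be a dead end."""
    if link_valid:
        return ["magic_link"]
    return ["request_new_link", "google", "email_password"]


# A storyteller opens the email three hours after it was sent.
issued = datetime(2026, 5, 1, 9, 0, tzinfo=timezone.utc)
opened = issued + timedelta(hours=3)

one_hour_ok = link_is_valid(issued, opened, timedelta(hours=1))
seven_day_ok = link_is_valid(issued, opened, timedelta(days=7))
```

With a one-hour TTL the link is already dead when she opens it, and without fallback options she has no way in. A longer TTL plus alternative login paths removes that dead end.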

5. Mobile coding is pure magic.

One of Baohua’s first users messaged me on Instagram about a recording bug. I was at dinner with friends. I opened Claude Code Remote on my phone, connected to a session running on my Mac mini at home, and shipped the patch before the entrees arrived. A year ago this would have sounded absurd. Today it’s how I work.

A few things worth knowing about mobile coding. There are real tools to make this work: tmux, Tailscale, Claude Code Remote, and a growing number of others. They’re worth the afternoon it takes to wire them up. The simplest setup is to leave your laptop open with a terminal and a remote-control session so you can drive it from your phone.

What I’ve landed on is a terminal session permanently open on my Mac mini at home, which is always on. It means I can investigate issues anytime.

I’ve shipped fixes from the gym, from the airport, and from my couch while watching Crash Landing on You. This gives me confidence to support my codebase without an on-call team.

6. Nothing beats talking to users.

The oldest piece of advice in the book is still the most important one.

I shipped Baohua on April 30. Then I talked to my friend, who told me “my mom won’t do this without LINE.” I spent the next three days doing hairy restructuring to add a second messaging channel.

Then I talked to my mom, who told me “no Chinese speaker is going to understand what Baohua is if you don’t clearly lay it out for them. Your emails are too short.” Counterintuitive advice for someone like me, who skims emails and clicks the CTA. So I rewrote the email design.

There’s no version of these practices that gets you to the right product without this one. Everything above is how you build well. Talking to users is how you discover what users love.

The next frontier isn’t code.

One parting thought, beyond the principles of building well. There's a harder puzzle waiting on the other side. Now that everyone has an army of engineers in their pocket, learning how to find and connect with your customers is the new edge.

These six practices get you to a solid product. They don’t get you to a business.

The frontier isn’t how sexy your Claude Code setup is. Three months from now, everyone will have their own custom coding rig. The setup will be table stakes.

Building something average will cost nothing.

Building something meaningful will cost discipline and rigor.

Building something people love will cost you finding those people, talking to them, and reshaping your product around what they tell you.

The machinery to respond now costs $200 a month instead of $50,000 a month in engineering payroll. So what kind of business are you going to build?

The AI Craft
