Can Ceylan · Vienna-based, globally curious.

I Hired Two AI Developers. One Is a Rocket. The Other One Checks the Wiring.

Managing Claude and Codex as a solo founder felt uncomfortably familiar — turns out scaling AI agents has the same team dynamics as scaling a real operations team.

3.5.2026 · 5 min read

The hardest part of working with AI isn't getting it to do things. It's knowing when to slow it down.

The first time I had to sit down and write operating principles for two AI agents working on the same codebase, I had a moment of genuine déjà vu.

It felt exactly like the early Foodora days. Too much speed, too little structure, and someone on the team absolutely certain they knew the fastest route even when the road wasn't built yet.

Except this time the team is Claude and Codex. And I'm working from a laptop in my apartment.


The setup, for context

About a month ago I started this website. Since then I've been quietly adding to my tech stack, learning as I go. A few weeks in, I added OpenAI's Codex to the mix alongside Visual Studio Code and Claude. What started as curiosity became something that needed actual governance. Not because I'm precious about tools, but because two agents operating in the same environment without clear separation is a great way to end up with a codebase that looks like it was designed by a committee during a fire drill.

So I did what I used to do in operations: wrote principles. Defined ownership. Drew lanes.

What I didn't expect was how quickly those agents developed what I can only describe as personalities.

Claude is the intern you can't stop rooting for

Claude is extraordinary. Genuinely. It has this relentless creative energy, generates solutions fast, and regularly surprises me with approaches I wouldn't have thought of. There's a reason I keep coming back to it for anything that requires thinking in layers or drafting something that has to actually sound like something.

But, and this is the thing nobody tells you, that capability comes with a shadow. You have to watch it. Not because it's unreliable, exactly, but because it's so confident that it's easy to miss when it has papered over a gap. I caught it doing this a few times. Technically complete on the surface. Slightly hollow underneath.

This isn't a criticism. It's a known dynamic. The most talented people on any team are also the ones who need the most architectural guardrails, not because they can't be trusted, but because their output moves faster than review can follow.

A few developers I've spoken to in communities and threads online describe the same thing. One put it simply: "Claude will build you a beautiful house and quietly skip the foundation inspection."

I nodded at that one.

Codex is the senior dev who's seen this before

Codex feels different. Slower, more deliberate, more likely to push back or flag something before diving in. Where Claude charges forward, Codex pauses and asks whether the path is actually clear. It isn't flashy, but I've come to really appreciate that quality.

If Claude is the person who says "yes, I can get this done by Friday," Codex is the one who asks "what does done actually mean here?"

In a solo founder context, that's not a personality conflict. That's a feature.

Some of this maps to what others building with multiple agents are reporting. A developer writing about agentic workflows on a technical forum described Codex as "the one that reminds you the code has to live somewhere after you ship it." There's a whole quiet community of solo builders right now navigating exactly this — how do you manage cognitive load when your team doesn't need sleep but also doesn't ask for clarity?

What managing two AI agents taught me about managing humans

Here's the thing that keeps surprising me: the dynamics aren't metaphorically similar to running a small team. They're actually similar. The same problems show up.

Speed misalignment. One agent wants to move fast, one wants to be precise. You have to decide which one is right for the current task, not in general.

Communication overhead. If I'm not clear about context and scope upfront, I get output that technically answers the question I asked and not the problem I have. Sound familiar?

Architectural debt. When I let Claude run too freely across the codebase without checkpoints, I created complexity that took longer to untangle than it took to build. That's not an AI problem. That's a growth problem. I wrote about the broader startup tension this creates in a separate piece on building Fleamio.

The strangest part of all of this is that it started to feel emotional. Not in a concerning way. But there's something genuinely odd about debugging a decision made by an agent you've been working with for weeks, where you've built up a kind of working grammar together, and realising you trusted it a little too much in one corner. It's not betrayal. But it rhymes with it.

The hardest part of working with AI isn't getting it to do things. It's knowing when to slow it down.

What I'd actually do

  • Give each agent a lane before you start. Claude for generation, drafting, ideation, flexible problem-solving. Codex for structured execution, anything that touches architecture or needs to be stable across sessions. Overlap intentionally, not by default.
  • Write your operating principles down, even if it's just for you. Not for the agents, for yourself. Knowing what each one is responsible for stops you from re-litigating the same decisions every session.
  • Review Claude's output one level deeper than feels necessary. It will be good. It might also be slightly incomplete in a way that's easy to miss if you're moving fast. Check the foundation, not just the facade.
  • Let Codex slow you down on the things that matter. The friction is the feature. If it's flagging something, it's worth five minutes of your time.
  • Treat this like team management, because it is. If you're a solo founder building with multiple AI agents and you haven't thought about alignment, ownership, and communication protocols, you're not using AI tools. You're managing a fast-scaling team without an org chart.
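The lane idea from the first bullet can be made concrete. Here's a minimal Python sketch of a routing table; the lane names and the `route` helper are purely illustrative, not part of any real agent API:

```python
# Hypothetical lane assignment: each agent owns a set of task kinds.
LANES = {
    "claude": {"drafting", "ideation", "generation", "exploration"},
    "codex": {"architecture", "refactor", "migration", "review"},
}

def route(task_kind: str, default: str = "codex") -> str:
    """Pick the agent whose lane covers this kind of task.

    When ownership is unclear, fall back to the more deliberate
    agent rather than the fastest one.
    """
    for agent, lane in LANES.items():
        if task_kind in lane:
            return agent
    return default
```

The point isn't the code; it's that writing the table forces you to decide ownership once, instead of re-litigating it every session.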

Wild times at the desk. But I wouldn't trade it.


If you read this far, the next one is for you.

The thinking behind the decisions — on Tech & AI and everything adjacent to it. No schedule, no filler. You get it before anyone else does.

More on Tech & AI

You Don't Need to Code Like a Developer — You Need to Think Like a Product Owner

I didn't become a developer. I became the one-man product owner, scrum master, and stakeholder who finally had an AI team that could actually build what I was describing.

AI and the Logistics Layer Nobody Talks About

Every AI demo shows the glamorous output. Nobody shows the warehouse chaos underneath it.

The Intern Who Outworked the Veteran: Why Codex Fixed in 10 Minutes What Claude Couldn't in 2 Hours

I gave up on a bug, handed it to OpenAI's Codex almost as a joke — and watched the new kid solve it in 10 minutes with evidence, logic, and a slightly annoying smirk.

Learn the underlying concept


Default-first fallback orchestration for AI generation pipelines

AI generation routes that call a single provider are brittle. Default-first fallback orchestration makes them resilient: try the configured primary, fall back automatically on failure, record what actually ran, and let users override for one run without changing the default.


Use a working-memory file as the handoff layer between AI coding sessions

AI coding agents forget everything between sessions. A working-memory.md file kept in the repo solves this — it's the shared brain that survives model switches, overnight gaps, and multi-agent collaboration.
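A handoff entry in that file can be as simple as an appended, dated note. A minimal sketch (the helper name and entry format are mine, purely illustrative):

```python
from datetime import date
from pathlib import Path

def append_session_note(repo: str, summary: str, next_steps: str) -> str:
    """Append a dated handoff entry to working-memory.md in the repo.

    The next session, human or agent, starts by reading this file,
    so it survives model switches and overnight gaps.
    """
    memory = Path(repo) / "working-memory.md"
    if not memory.exists():
        memory.write_text("# Working memory\n")
    entry = (
        f"\n## Session {date.today().isoformat()}\n\n"
        f"**Done:** {summary}\n\n"
        f"**Next:** {next_steps}\n"
    )
    with memory.open("a") as f:
        f.write(entry)
    return memory.read_text()
```

Because the file lives in the repo, it rides along with commits and branches, which is exactly what an agent's chat history doesn't do.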

If this resonated, I'd be happy to talk about it.
