AI Writes 100% of My Code and 0% of My Code
Something has changed.
Almost every senior engineer I know is writing very little code. Instead, models like Claude 4.5 Sonnet, GPT-5, GPT-5-Codex, and Cheetah are writing that code.
This is new. Very new. It wasn’t true six months ago.
It changed with the release of Claude Sonnet 4 in May. It went from “this is kinda shit” to “this is pretty good.” And then it really shifted with Sonnet 4.5 and GPT-5-Codex over the last month, landing on “this is pretty freaking amazing.”
These are not “vibe coders” shouting “this is wild” and “game changer” on every other X post about how you can write an “app” in ten minutes with a single prompt. These are serious engineers, people with 10 to 20 years of experience. And they’re shipping production code while writing pretty much none of it.
Back in the dark ages of six months ago AI couldn’t code its way out of a paper bag without you wanting to throw your laptop across the room in a rage. It’s only very recently that the world has shifted. But it has shifted.
Any engineer who insists that AI can’t code, or only produces slop, or that they have to write 100% of their own code by hand, is going off an old mental pattern match that used to be true but is now dead wrong. That doesn’t mean amateur coders aren’t still producing absolute garbage through their ignorance and polluting repos with trash PRs that don’t work. But experienced coders can now produce real, strong, working code with these little machines.
In March 2025, Dario Amodei, CEO of Anthropic, said that in “3 to 6 months, AI will be writing 90% of the code.” I pushed back on this. A lot of other folks did too.
Turns out he was right.
And he was totally wrong at the same time.
He was right because AI is writing 70%-100% of the code for me and many other engineers out there. But he was wrong because the implication was that it would be doing it without a human-in-the-loop.
In other words, we would just tell our little magic miracle machine what we want and then go do something else and then come back a few hours or days later and — bam — you got a working app! It will figure it all out on its own, make wise choices between challenging alternatives, set it up, test it and you can do more interesting things with your time.
That kind of automation remains a complete and total fantasy.
Despite classic Musk hype of Grok 5 having a 10% chance of being “AGI,” there is no model on the planet that is even close to this right now, for a thousand reasons. Take today’s model architecture and training techniques and algorithms and today’s data and hurl 200K GPUs at it, or 1M, or 10M, and the results will be the same: !== AGI.
The biggest reason is that even if the models get a lot smarter and more capable, they are not, and cannot ever be, magic. The machine can’t read your mind or know what a good app actually is, unless it’s just a shitty clone of something that already exists. Only you can know that.
Without YOU, these coding bots are utterly and totally and completely worthless. Without you and me these models would just go off the rails and fail miserably. And yet, given a strong and well thought-out plan, that I have carefully designed and workshopped with the models themselves and with my team of real live people, these bots can and do work for a half hour and produce stellar code.
That is the paradox of today’s AI. The models are both freaking amazing and also completely idiotic at the same time.
I’ve watched GPT-5-Codex code for two hours straight from a plan I laid out, correct lint errors, run tests, fix problems, and then follow all of that great work up with a ridiculous mistake that a first-day coder would not make.
My favorite was when it decided that the entire purpose of the Typescript deep research clone I was building (based on LangChain’s Python-based Open Deep Research) was to research my test prompt of “find an under desk exercise cycle, in Europe. Pick the top five and cite evidence for or against.” After happily working for hours, it decided to hardcode that prompt into the search parameters for Tavily web search for every possible research question.
I screamed at it in all caps and rolled back the commit.
Look, two things can be true at the same time. Andrej Karpathy made waves recently on the Dwarkesh podcast when he said today’s agents are trash and that real agents that can work like people are a decade away. I agree. True agents that can think, remember, plan, learn on the fly, adapt, socialize, gather context from external sources, track shifting requirements and the big picture, create new ideas, and work like people are a decade away, if not much longer. We need a different architecture and a different way to train these machines.
But again, so what? We still have little magic genies that are writing 70%-100% of the code! Today’s models are still spectacular and useful, even if they’re also maddening and idiotic at times too.
So I should be scared for my job then? I mean, it’s doing most of my job, right?
Except it’s not.
Because being a software engineer is 90% thinking and 10% writing code. So while from one angle it’s writing 100% of the code (actually more like 70%), from another angle it’s writing 0% of my code because without my skilled interventions and explanations it would have no idea what to build or why.
You also have to watch these machines like a hawk or they will fuck up and fuck up big time.
Sometimes I feel like a safety driver in a self-driving car, half asleep at the wheel, lulled into thinking this thing is super human and can do no wrong, and then it suddenly just veers off the road because it saw a shadow and I have to wake up out of a total stupor and grab the wheel back.
But at the same time, they’re good. Very, very good, and getting better all the time. That’s why we’ve seen a shift where very senior engineers suddenly move into vibe engineering. They make extensive plans and then let the AI rip and test and refine while drinking coffee or making lunch or watching a podcast or learning French.
If this terrifies you, it shouldn’t.
You should be very, very excited, because you and your team are about to be able to do a lot more with a lot less. Think smaller teams shipping bigger projects, faster.
Many Paths to Victory
I will lay my cards on the table and say I love the new world. Absolutely love it.
Talking to a computer and having it do things has been a dream of mine ever since I started programming (badly) in C/C++ 25 years ago. Honestly, I’ve always hated programming. I’ve tried to get better at it a number of times over my life, but it always felt too rigid, too inexpressive, too structured, versus the fluid, fantastic, and ever-adaptable world of human language.
Things have come full circle. Everyone is programming in English now and I’m here to meet the future where I always wanted to meet it. I’ve been ready for the “Hello, Computer” era since I was a kid.
Just talk to the agent!
And you know what? My sense of coding has changed. I absolutely love it. I even love writing code by hand weirdly enough now.
So how does this all work? What I’m finding as I talk to more and more engineers is that everyone’s process is different. They find something that works for them and works well and they do great work.
My senior backend engineer breaks everything down into atomic tasks and makes sure only one thing is done in the space of a single context window loop. After that she compacts deliberately.
Ex-Vercel engineer and Typescript instructor Matt Pocock does the same. He ruthlessly clears out context. In a reply where I questioned why anyone wants a clean context, he wrote back: “The reason is all the research that shows LLMs perform better with fewer input tokens.”
Maybe the research does say that. But in practice, I rarely see the performance of the agent degrade because of ‘context poisoning.’ In fact, I see it get worse right after compacting when the model shows up like Mr Meeseeks with no freaking memory. I have to explain everything all over again. No thank you.
As for too much context? Sure. Sometimes, but not as often as people think.
I’ve developed a Spidey-Sense for when a Cursor or Claude Code thread has gotten too long and I flip to a new one but in general ‘context poisoning’ feels like straight nonsense, much like ‘model collapse’ last year. It’s something folks glom onto because it sounds good but in general it doesn’t play out in reality. It’s the kind of magic people believe because a few papers pointed it out.
And that’s the key for anyone coding furiously with AI today:
Don’t trust. And verify.
There is no one path to victory. There are many.
Try lots of things. Be an explorer. Try every tool that comes out. But try it with critical thinking.
Don’t believe the hype, the research, someone on Twitter, or some article that says this is the only way to do it. These machines may not be AGI, but they are general-purpose-ish and that means they can do a lot of good work if you figure out how to work with them and their bizarre little quirks. Put your critical thinking cap on! Always be asking the questions:
Does this match my experience?
Does this really work this way?
Why?
Does it work at all?
Is there a better way?
How?
Don’t accept anything at face value. Prove it to yourself. Prove it in your own workflow. Prove it in the code.
My Workflow
So what is my workflow? It basically boils down to a few things:
Plan 80% of the time in pseudo-code/text.
Have the agent write code only 10% of the time, and then test, test, test and bug fix the other 10%.
I also break out any commands I use over and over into my Flashbacker system for Claude Code.
Let’s break it down.
Workshopping With AI
A lot of people think writing prompts is hard. As a writer for 30 years, I can’t relate in the least. Writing is my super power and describing what I want clearly and effectively is what I do.
I call the first phase workshopping with AI.
I tend to write out a long, rambling prompt of what I want to do, with links to anything that helps explain it. I then tell Claude to:
“Output your understanding of what I said. Output a plan to address it. Include a TLDR. Make no changes. Just plan.”
If I do have a magic prompt, that’s it.
I should probably have this as a slash command as I type it often enough. But I just type it as a force of habit.
It serves several key purposes:
First, it makes it immediately clear when the model does not understand what you wanted. If it doesn’t understand, you don’t need to read the rest of what it wrote. You can just prompt it again with a different/better explanation.
It’s very easy to get sucked into thinking these things are human and that they always understand you.
They do not.
And the faster you realize that and prepare for it, the better off you are.
Second, it includes a TLDR. Agents often write a tremendous amount of text very fast and reading it all is a great way to get exhausted in short order. So ask for a TLDR. Again, if it doesn’t make sense, you can skip reading the rest and re-prompt it.
Third, if all that makes sense you can then read the plan. You’ll spot other things. A file in the wrong place. Another misunderstanding. A duplicate function. Something it missed. Code that doesn’t need to exist because there is a better way to do it or a different way to do it, etc.
Once I have this I may prompt to refine it and tell it to “output the updated plan.”
I will then copy and paste that plan into Cursor/GPT-5-Codex and say something like:
“This is what Claude thinks about what we are doing. Look at it with critical thinking and a critical eye. Tell me if you agree or disagree and cite evidence for or against in the code or documentation.”
I will then bounce back and forth a few times, doing my own docs reading and code reading, and sharing that context with the agents, until I have it refined. I also tell Claude to strip out any code it hallucinated along the way and replace it with references in comments, so that the actual implementation agent does not follow it blindly.
Basically, it all boils down to workshopping and refining a plan.
This is something you do in your mind’s internal dialogue or with notes or pen and paper without realizing it. Now you are externalizing it to give the agent proper understanding.
Capturing the Plan and Turning It into Tasks
After the plan is ready, I tell Claude to capture it into an issue using my /fb:create-issue command. It’s really just a prompt telling it where to put that issue (docs/issues) and what should be in it. You could mod it to live in GitHub issues or whatever you like. I prefer a local file I can easily read. I then give it a quick read to see if it captures what I want and then run the /fb:create-tasks command, another prompt that basically says: decompose this into atomic, checkbox-able tasks. Those live in docs/tasks/{NAME_OF_ISSUE}/.
Now that I have those tasks, I review them. If I see something missing or something that needs to change, I just change it myself if it’s a small edit; if it’s a lot, I tell Claude to update it with the /fb:update-tasks command and then double-check the result.
They look like this:
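Something like this (a made-up example for illustration, not a real issue from my repo):

docs/tasks/add-citation-tracking/
- [ ] Add a citations field to the research state
- [ ] Thread citations through the Tavily search node
- [ ] Render citations in the final report
- [ ] Add a test that proves citations actually show up in the output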
Now that I have the tasks I can basically @ them in Cursor for GPT-5-Codex and say:
“Review all these and then output your understanding of what we are working on and output a plan to begin.”
I review its plan to make sure it matches up and then I just type:
“Do it.”
I can watch it code for hours this way.
Sometimes I use the new Cheetah stealth model in Cursor and it is blindingly fast. It feels like a diffusion model, and it can edit and read multiple documents at once. It is also stupid. So only give it very targeted, scoped tasks like import updates or doc updates. It can do those at the speed of light and will save you a lot of time. But let it write your new code and it is going to screw up.
But otherwise, pick a workhorse model like Codex and make that your main implementer agent.
Along the way I may need to stop it to tell it to do something a different way, or because it came across something I didn’t expect, like a different library I need for the job, or maybe I’ve thought about it more and want it to return a different data structure or something else.
If that change starts to be major or makes us drift too far from the plan, I stop, loop back to my planning phase, refine it, and then run /fb:update-tasks to capture it into the plan as a new phase.
Or I might just stop it and write it differently myself to get it done faster if I know exactly what I want.
Because the plan has checkboxes, the agents know to check them off after various steps, or I just tell Claude “look at the last commits and update progress.”
I commit all the time with WIP messages. The more commits the better. Forget clean commit histories. The history is there as a backup so you can roll back or unfuck what an agent did. For major commits, you can push a better message that makes it clear it is for a human to read.
Commit too slowly and you are asking for trouble, because agents can code faster than you. They may successfully fix files that you didn’t check in and then screw something else up later, leaving you to untangle it. Worst case, you have to roll back and you undo good work too.
So commit early and often. Like after every change.
Don’t Trust. And Verify
I also have a philosophy on testing that works very well. My basic rule: if I cannot see it happening in a console log, or in a log file, or in a database, or in a GUI interaction, then it did not happen.
I cannot tell you how many times one of my agents wrote a test that did not surface some background message, so the output looked good but the thing was totally and completely FUBAR.
As I was writing my Deep Research agent I noticed a message like “I can’t see what I am researching so I will ask the user what they want.” It was losing context on what it had done because of a bad reducer that clipped the previous messages and retained only the last one. But the test did not show anything wrong, because the LangGraph run was finishing and the output showed successful tool calls. Meanwhile the agent was just reading the snippets from Tavily search and declaring itself done, never doing any further research or web page reading on subsequent loops, which always ran to the end.
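The bug boiled down to a state reducer that threw away history. Reconstructed as a sketch in TypeScript (illustrative names, not my actual code):

```typescript
// Reconstructed sketch of the bug. Illustrative names, not my actual code.
type Message = { role: "user" | "assistant" | "tool"; content: string };

// The bad reducer: drops everything and keeps only the last new message,
// so on the next loop the agent has no memory of what it already researched.
const badMessageReducer = (_prev: Message[], next: Message[]): Message[] =>
  next.slice(-1);

// The fix: append new messages so the agent keeps its working context.
const messageReducer = (prev: Message[], next: Message[]): Message[] => [
  ...prev,
  ...next,
];
```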
I repeat, if you did not see it and you cannot confirm it, it did not happen.
Prove it. Prove it to the AI and to yourself. Otherwise expect problems because the LLM will happily tell you something is “working” because of a bullshit/incorrect test it wrote.
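Concretely, the tests I want assert on evidence I can see: a log line, a database row, a count of real tool calls, not just “the run finished without throwing.” A minimal sketch with made-up names:

```typescript
// Minimal sketch with made-up names. The point is asserting on evidence,
// not on "the run completed without throwing."
import assert from "node:assert/strict";

type ResearchRun = {
  report: string;
  searchQueries: string[]; // every query actually sent to the search tool
  pagesRead: string[]; // every URL whose full content was fetched
};

// Assume the agent under test exposes a run result like this.
declare function runDeepResearch(question: string): Promise<ResearchRun>;

export async function testResearchActuallyResearches(): Promise<void> {
  const run = await runDeepResearch(
    "Compare under-desk exercise cycles available in Europe.",
  );

  // Evidence it searched for what I asked, not some hardcoded prompt.
  assert.ok(run.searchQueries.some((q) => q.toLowerCase().includes("under-desk")));

  // Evidence it went past the search snippets and read real pages.
  assert.ok(run.pagesRead.length >= 3, "expected full page reads, got snippets only");

  // Evidence the report actually cites the pages it read.
  assert.ok(run.pagesRead.some((url) => run.report.includes(url)));
}
```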
Agents.md? Subagents? Magical Prompt Templates? Agents 2.0 and Other Bullshit?
I’ve just started to do parallel agents a little bit but mostly I feel the same as Karpathy here:
“For example, I don’t want an Agent that goes off for 20 minutes and comes back with 1,000 lines of code. I certainly don’t feel ready to supervise a team of 10 of them. I’d like to go in chunks that I can keep in my head, where an LLM explains the code that it is writing. I’d like it to prove to me that what it did is correct, I want it to pull the API docs and show me that it used things correctly.”
And forget subagents. You can’t see what they are doing. Claude backgrounds them. They are good only for research, so don’t let them make changes. That said, threads/tabs in Cursor or multiple Claude Code agents, each in its own worktree? That works.
I use agents in parallel when I have a big, mature, tested codebase. But it is entirely bottlenecked by my ability/fatigue level to monitor them.
Monitoring 10? No way. That’s X influencer nonsense.
Trying to run multiple agents at the beginning when you’re working out what the code needs to do is a nightmare and will only cause you horrible, horrible problems. But once you have a standard and a clear refactor or a documentation update to do, agents are good at doing that kind of boring work in parallel without you paying super close attention.
They even make refactoring fun! And refactoring has never been fun.
Agents.md or Claude.md file? I have them. Couldn’t care less. Barely update them. People think these files are some kind of black magic. They’re not. They work fine (kinda, except when the agent just ignores them anyway) and if you like them, keep using them. But you’ll find that most of the things people hold sacred with AI are akin to superstition and folk beliefs. Mostly it’s not real.
How about MCP servers?
I use exactly three.
sequential-thinking
context7
chrome-devtools
All three are fantastic. Context7 and chrome-devtools are indispensable.
Context7 lets the model look up docs, and chrome-devtools exposes browser automation, screenshotting, and Chrome’s developer streams to the coding agent. If you are doing any work at all, you need Context7 or the models will just make up types and methods that do not exist in a library.
If you do full stack or frontend work then Chrome Dev Tools makes Playwright look pathetic. It gives the agent the right context it needs to do strong work.
But I find most MCP servers to be utterly useless overkill. Frankly, I don’t like the protocol all that much and see it dying eventually. In 99% of the cases I don’t want my tool to run as a server. I can see why you would want it as a server. It decouples the code from the agent. You could have one MCP server in Go. Another in Rust. Another in Typescript.
Honestly, who gives a fuck? It’s annoying in practice and a nightmare in production.
My agents in production have a dozen tools or more, dynamically swapped in. I absolutely do not want to manage a dozen flaky-ass endpoints that could go down on me in production, especially if I’m vertically scaling them.
Much, much better to just write a simple tool with a cli interface and a good docstring to tell the agent how to use it. It can use `--help` and it’s all good. So make sure that cli help switch is, you know, helpful.
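Here’s the shape I mean, as a hypothetical sketch (this is not one of my actual tools, just the pattern):

```typescript
#!/usr/bin/env node
// Hypothetical sketch of an agent-friendly CLI tool: no server, no protocol,
// just a command with a --help message good enough for an agent to read.
const HELP = `fetch-docs: fetch a docs page and print it as plain text

Usage:
  fetch-docs <url>    Print the page text to stdout
  fetch-docs --help   Show this message

Exit codes:
  0 success, 1 bad arguments, 2 fetch failed
`;

async function main(): Promise<void> {
  const arg = process.argv[2];
  if (!arg || arg === "--help" || arg === "-h") {
    process.stdout.write(HELP);
    process.exit(arg ? 0 : 1); // no argument at all is an error
  }
  try {
    const res = await fetch(arg);
    const html = await res.text();
    // Crude tag strip. Good enough for an agent that just needs the text.
    process.stdout.write(html.replace(/<[^>]+>/g, " ").replace(/\s+/g, " "));
  } catch (err) {
    process.stderr.write(`fetch failed: ${String(err)}\n`);
    process.exit(2);
  }
}

main();
```

The agent runs it with --help, learns the contract in one shot, and nothing is sitting on a port waiting to fall over in production.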
I actually hope Computer Use agents that are undetectable crush MCP servers, and that the pay-for-every-damn-service-via-newly-outrageous-API-pricing-that-used-to-be-free model dies a swift death. Just give me an agent that clicks around an interface with 99.9% accuracy. Or just give me a simple command line interface to an API, or just the docs. That’s all you need. You probably don’t need an MCP server.
AI Slop, the Verification Problem, and AI Will Take All the Jobs (It won’t)
You’re probably wondering if the AI will take all the jobs in coding or if the code is any good?
The only people who think AI will take all the jobs are:
people terminally posting on X, or
populist politicians, or
Max Tegmark style doomers, or
execs sitting in the most clueless boardrooms in America who think you just fire people and hire agents (all those stories about companies firing people and hiring “AI” are basically companies downsizing and saying “AI” to skirt other compliance rules; it’s not real, in the same way little green men on Mars are not real).
The reason AI can’t take any jobs now or anytime soon (if ever) is simple:
The verification problem.
What is the verification problem?
Simple.
If you’re already good at something, you can verify what the AI does. If you’re not, you can’t and you’re fucked and so is your company.
The hardest part of coding with AI (or doing anything at all with AI) is verification. If you’re using these systems for anything that matters, you need a verification pass that goes way beyond a lazy skim. That means detail-oriented human work — you must check every claim, every diagram, every link, every word, every line of code, every outcome and citation and fact when AI does something for you.
And who’s best positioned to verify?
The very people who are already good at whatever the AI is trying to do: the workers it’s supposed to replace.
Doctors can check medical claims. Senior programmers can check AI coding outputs. Strong copywriters can check that whatever GPT writes sings. They know a good turn of phrase when they read it and can make sure each paragraph flows from the one before it.
That’s the biggest irony of AI work. If you’re not already good at the task it’s doing, then you can’t tell if what it generates is good. You don’t have the knowledge or the context.
If you don’t know French, then you don’t know if a French translation sounds clunky or if you just told someone to eat shit in your new commercial because of new slang that sounds like the phrase you translated.
So I relate to the “learn to fucking code” posts on Hacker News. You should learn to code. Coding with AI made me want to learn coding and I did. So should you. Get better and better at it. Keep learning.
The irony is that despite writing very little of my code, I am a much better coder than I was a year ago. I’ve been studying coding like crazy, as much as three hours a day in certain stretches, just inhaling YouTube courses and O’Reilly books.
The more you know the easier it is to do something. Anyone who tells you “XYZ job is cooked” or “don’t bother learning this or that” or “don’t bother going to school” is a fucking idiot. Unfollow them immediately and never listen to another word that comes out of their mouth.
There is not a single thing in life that I regret learning. Everything I have ever learned has compounded and made me sharper and better at everything else I do.
A year ago I couldn’t write much (good) code other than some super complicated Bash scripts, which basically don’t count all that much. That doesn’t mean I never tried. Again, I started programming C/C++ in the 80s. And I’ve had various bursts of studying and coding along the way. I was just never very good. I was a fantastic systems architect and sysadmin, but I just hated coding.
When I wrote The Joy and Pain of Coding with AI When you Suck at Coding back in April 2024, coding agents were awful in a thousand ways and it was like pulling teeth to get anything moderately passable written. The code I shared then is fucking embarrassing now. You can look at it and laugh. I do. It’s just utter dog shit that I would not ship today.
It’s also amazing that that was only a year ago! It feels like reading an article that I wrote twenty years ago. AI can help you learn in leaps and bounds.
That’s the journey of learning anything, of getting better over time. That’s Kung Fu. Hard work over time. And it’s amazing that my skills have leveled up so much in such a short period of time.
That’s what business necessity and pressure and three hours of intense code work a day will get you.
The harder you work and the more you have a real reason to learn it, the better you will get. AI coding is no different. Even if the machine is doing the grunt work, the more you know, the better you can make decisions. When you know what you’re doing, you know when the AI has just done something idiotic. If you can’t read or write code, you are up shit’s creek without a paddle.
But reject the “just write it all yourself” mentality. You don’t use a protractor when the calculator gets invented. Nobody uses a bow and arrow when they have an M4. The Mongols could shoot an arrow as fast as a bullet for short distances. You still want an M4.
If you insist on writing all your code by hand you will be in trouble in a few years when AI is doing work for most of your team. The new world is coming, ready or not. It will be as if you’re an analogue film editor, cutting film by hand and sneering at digital editing when computers came onto the scene. Eventually you’ll get left behind and be out of a job, not because AI took it, but because you decided scissors were all you need.
So keep learning. Keep adapting. Keep writing code, but get good with an AI workflow too.
The old world of writing every line of code is dying fast. We’re in the early adopter phase now and soon it will be early majority.
The new world is where sometimes you write the code, but mostly you don’t.
Maybe you think that sucks. It’s coming anyway, and any objections you have to it won’t stop it.
As Thomas Ptacek wrote in his awesome article “My AI Skeptic Friends Are All Nuts,” there are no good objections; they all amount to rejecting the calculator for an abacus in your professional life.
But what about “you have no idea what the code is”?
To that he writes:
“Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you? You’ve always been responsible for what you merge to main. You were five years ago. And you are tomorrow, whether or not you use an LLM. If you build something with an LLM that people will depend on, read the code. In fact, you’ll probably do more than that. You’ll spend 5-10 minutes knocking it back into your own style. LLMs are showing signs of adapting to local idiom, but we’re not there yet.”
About half the comments on Hacker News on Peter Steinberger’s great article about coding with AI are “this is all AI slop” or “he wrote 300K lines that could have been written in 20K.”
To that I say a few things:
These folks are likely thinking about the models from half a year ago. They tried it a few times, for a short time, and then put it aside and went back to their old process.
They think of code as art or craftsmanship.
They’re worried about cost/their job.
On the AI slop point, I refer you again to developer Thomas Ptacek’s article “My AI Skeptic Friends Are All Nuts”:
“But the code is shitty, like that of a junior developer,” he writes. “Part of being a senior developer is making less-able coders productive, be they fleshly or algebraic...Also: let’s stop kidding ourselves about how good our human first cuts really are.”
And he goes further on the question of “but the craft”: “Do you like fine Japanese woodworking? All hand tools and sashimono joinery? Me too. Do it on your own time.”
“I have a basic wood shop in my basement. I could get a lot of satisfaction from building a table. And, if that table is a workbench or a grill table, sure, I’ll build it. But if I need, like, a table? For people to sit at? In my office? I buy a fucking table.
“Professional software developers are in the business of solving practical problems for people with code. We are not, in our day jobs, artisans...Nobody cares if the logic board traces are pleasingly routed. If anything we build endures, it won’t be because the codebase was beautiful.”
Be responsible for your code. But refusing to code with AI will get you nowhere.
Lucky for us these machines can’t do taste or critical thinking or long term context or know what the hell you want to do in the first place.
Lucky for us, the real thinking of apps and life is asking questions and doing the hard work to figure things out. What do I want? Why? What should it do? How should it do it? What’s a good way to do it? What am I missing here?
People are still here. Again, all the scary stories you read in the mass media about AI taking all the jobs are just CEOs navigating an economic downturn of tariffs and trade wars and the post-COVID tech boom going bust. They are saying things to get themselves around worker regulations for letting people go.
Don’t believe the hype. There is not a single machine that can do anyone’s complete job today. These machines are fantastic task-doers and that’s it.
But long term memory? Long term context? Learning new tools and skills on the fly? Adapting? Updating priorities and project timelines? Forget it.
Most people don’t know what they want or how to articulate it. And what we want changes as we build something. You thought you wanted it this way, but discovered a better way as you worked on it. That is the process of creation. That is the process of taste and refinement and engineering and life! It’s iterative.
So jump in and embrace the calculator and leave your abacus at the door.
Find your workflow. Find your flow. And just go with it.
Keep coding. Keep learning. Turn the AI off sometimes and just code for the pure joy of it on a private project. Read a new O’Reilly book. Take a course. Always keep learning.
But use AI. Use it well.
A new life awaits you in the offworld colonies.