When the buzz about AI’s latest code competencies started to surface earlier this year, I dismissed it as the next round of hype. AI writes 90% of Claude’s code – yeah, sure. But then Claude Opus 4.5 and GPT-5.2 Codex hit 80%+ on SWE-bench Verified.

And then the story hit the NY Times. In “The AI Disruption We’ve Been Waiting for Has Arrived,” Paul Ford, co-founder and president of Aboard, writes, “For weeks, software engineers have been sounding off on social media, expressing awe and dread about what they are seeing AI systems do. Skills that took them a lifetime to develop can be completed with relative ease, speeding up the process of coding to a shocking degree.”
Early impressions of code generators
Reading Ford’s comments on AI code competencies was an “Oh Shit” moment for me.
Just eight years ago, I asked, “Can I learn to code?” I argued no, that coding required too much craft.
My experiments with coding tools two years ago were meh, and I wrote about the experience in redesigning this blog, Drive, with genAI’s help. Oh, I got plenty of AI’s help writing code, and then it needed a lot more of mine to debug it.
AI coding starts to wow me
Then last year, while writing an InfoWorld article on vibe coding, I applied the same test I use on low-code platforms. Can it deliver something of value in 30 minutes or less?
Replit passed this test, building the app I requested – or at least the scaffolding for the app, because that’s all I could tell visually as it developed the codebase. I had no idea what was inside, and then, when it asked for my OpenAI keys, I halted, not knowing what the work would cost me or whether I had just inadvertently shared some of my intellectual property with Replit’s AI.
Then, over the last month, I was able to get some very useful code. Claude built a web tool to generate a slide index for a PowerPoint file and listened to my restrictions, as I didn’t want my intellectual property uploaded to any AI models. It responded with a locally hosted app that made no external calls.
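For the curious, the core of a locally hosted slide indexer is surprisingly small. What follows is my own minimal sketch, not Claude’s actual code: it assumes the standard .pptx layout (a zip archive with slide XML under ppt/slides/) and uses only the Python standard library, so nothing ever leaves the machine.

```python
import re
import zipfile


def slide_index(pptx):
    """Return [(slide_number, first_text_run)] for a .pptx file.

    `pptx` is a path or file-like object. A .pptx is just a zip archive;
    each slide lives at ppt/slides/slideN.xml. We grab the first <a:t>
    text run per slide, which is usually the title placeholder.
    No network calls -- everything stays local.
    """
    index = []
    with zipfile.ZipFile(pptx) as z:
        slides = sorted(
            (n for n in z.namelist()
             if re.fullmatch(r"ppt/slides/slide\d+\.xml", n)),
            key=lambda n: int(re.search(r"slide(\d+)", n).group(1)),
        )
        for number, name in enumerate(slides, start=1):
            xml = z.read(name).decode("utf-8", "replace")
            m = re.search(r"<a:t>([^<]*)</a:t>", xml)
            index.append((number, m.group(1) if m else ""))
    return index
```

The first-text-run heuristic is crude – a real tool would parse the XML and check placeholder types – but it illustrates why no external calls are needed: the file format itself is fully readable offline.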
This week, I asked Microsoft Copilot to develop an MS Word macro that performs word counts for my document’s header sections. It delivered the code but couldn’t deploy it, and instead walked me through the steps to add the macro and connect it to the Ribbon.
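The logic behind such a macro is easy to sketch. Copilot’s deliverable was VBA, which I won’t reproduce here; this hypothetical Python outline just illustrates the walk-and-tally approach: step through paragraphs, let heading-styled ones open a new section, and count words into the current section.

```python
def words_per_section(paragraphs):
    """Tally word counts per heading section.

    `paragraphs` is a list of (style, text) pairs, mimicking how a Word
    macro walks a document: paragraphs whose style starts with "Heading"
    begin a new section, and every other paragraph's words are counted
    toward the section currently open.
    """
    counts = {}
    current = "(before first heading)"
    for style, text in paragraphs:
        if style.startswith("Heading"):
            current = text
            counts.setdefault(current, 0)
        else:
            counts[current] = counts.get(current, 0) + len(text.split())
    return counts


doc = [
    ("Heading 1", "Intro"),
    ("Normal", "One two three."),
    ("Heading 1", "Methods"),
    ("Normal", "Four five."),
]
# words_per_section(doc) -> {"Intro": 3, "Methods": 2}
```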
My takeaway: I can’t move into a house that’s only 80% built. I still need the architects, contractors, and trades to oversee and then finish the job. That said, I am still concerned about whether engineers are prepared for the emerging world of agentic AI software development, as I wrote less than a year ago. With every round of my tests, I am seeing AI perform better at coding.
I haven’t played around enough with the latest versions of Claude Code and OpenAI Codex to perform more significant work. In the meantime, I asked several experts what has awed them and what they dread about AI’s new coding competencies. All five were awed, but it’s what they dread that’s worth considering before you or your teams dive into AI coding without release-readiness checks.
Developers become numb to the permissions they dole out
Dr. Bachman, professor at the Claremont Colleges, writes in his blog post, “Now Anyone Can Build Software,” “Once Claude or Codex writes some code, it can run the code and check for anything undesirable. It can then make revisions and repeat, autonomously, for long periods of time.”
Bachman is a dear friend of mine from Binghamton University, where I took his chaos theory class, and we pondered life over beers at The Ale House. His article is a great read for non-coders, as he walks you through his steps and shares insights on using AI to develop code.
But Bachman has some reservations.
“By default, agents are only given local access: they can only read and write files in the folder that you open. When an agent wants to run a system command, which could affect things outside the local folder, they’ll show you the command and ask for your permission. In practice, you end up having to sit there and hit the ‘approve’ button every few seconds. So you are also given safety level options: when it runs a system command, you can mark it as ‘safe,’ so that if that same command comes up again, the agent shouldn’t bother you,” says Bachman.
Dreadful concern #1: How many times do you accept the Terms of Service without reading them? I expect developers to do the same when granting permissions to an AI code generator, which could create security and data privacy risks.
Code fast, but diagnosing and debugging are costly
Even a small tool may require a developer to look under the hood and put on a debugging hat. In the following example, it’s a very seasoned architect doing just that.
“What’s amazed me is how quickly AI can generate working code,” says Tyler Johnson, founder of a stealth data and AI governance startup. “I recently built an AI-driven automation workflow for call transcripts where the core code took about 20 minutes with AI assistance, but nearly 20 hours to debug dependency conflicts across open-source libraries and AI tooling. It also made it clear that I needed to be much crisper with the requirements up front, because when they weren’t clear, it led to unnecessary iterations.”
Dreadful concern #2: Developers – and most people – are notoriously bad at measuring time or quantifying complexity. So what may look like “fast code” to a development manager or business leader may actually have hours of a senior developer’s or architect’s handiwork behind it.

Do you trust the code?
So a developer vibe codes an app and wants to push it to a demo environment for end-user validation. You’re the CTO. Do you approve the pull request unquestioningly?
“It’s amazing how fast AI coding tools have shifted to support fully agentic workflows, enabling developers to generate weeks’ worth of code in a matter of hours,” says Itamar Friedman, co-founder and CEO at Qodo. “But I’m worried about the rise of AI-powered code gen at the enterprise level, as teams optimize generation speed without rigorous reviews that understand context and organization standards, ultimately leading to technical errors and security risks. Governance is critical here. We need to ensure AI gen code is reviewed, tested, and maintainable, so speed doesn’t come at the cost of trust or production reliability.”
Dreadful concern #3: We end up with one AI writing code, another evaluating it, and no human in the middle to challenge the underlying assumptions.
AI’s black box developed simply, but delivered complexity
So I can have AI code. I can use other AI tools to evaluate the code. I can use a third set of AI tools to query the organization’s codebase. But I also know, historically, it’s very difficult to get developers to take over other people’s code and make changes, refactor, or holistically re-engineer it.
So will AI-generated applications become black boxes that no one – and no AI – can maintain?
“I am awed by how a simple natural language prompt can now trigger a fleet of autonomous agents to plan, prototype, and deploy a production-ready application in minutes, effectively turning a single architect into a full-stack squad,” says Simon Margolis, associate CTO of AI and ML at Insight. “My dread isn’t that these tools replace developers, but that teams might use them to ‘vibe code’ without understanding the underlying logic, creating a future where we are maintaining complex systems whose origins we can’t fully explain. We must treat these agents as powerful pair programmers that amplify our strategic intent rather than just black boxes that spit out shortcuts.”
Dreadful concern #4: We all know that maintaining and extending applications are more complex and cost more than developing them. I haven’t seen tests or benchmarks of AI’s abilities to maintain a codebase generated by AI. But here is one article arguing that AI-generated code costs more to maintain than human-written code, and another countering it, declaring that AI-generated code might be more maintainable than you think.
What if “code candy” corrodes engineering disciplines?
Building a house is more than pouring concrete, layering bricks, and hanging sheetrock.
So what about all the disciplines required for developing, testing, and deploying applications and AI agents? AI isn’t just coding; it can also help write requirements, maintain documentation, and simplify some multicloud complexities. The 2026 Agentic Coding Trends Report by Anthropic states, “Engineering teams discovered that AI can now handle entire implementation workflows.”
But for software development, there aren’t detailed regulations like those in finance, building codes like those in construction, or HIPAA data privacy requirements like those in healthcare. So while AIs can take over parts of the job, I’m not sure what underlying coding standards, DevSecOps non-negotiables, and nonfunctional requirements the AI follows.
“I have been awed by how violently fast the feedback loop has become – tools like Copilot, Codex, and Claude Code can compress days of senior-level work into hours if you know how to frame the problem,” says Johnny Halife, CTO at Southworks. “But the hidden blast radius is something I dread. Getting these systems production-ready requires real security and cost discipline, and frontier models are still erratic in token-usage predictability, which can interrupt work mid-flow in ways no traditional developer tool ever did.”
Dreadful concern #5: Observability, functional testing, performance testing, security best practices, data validation, finops, and managing technical debt are all disciplines human software developers are aware of and ideally develop best practices around. What about AI?
–
All that said, my biggest dreadful concern is that AIs will address the others over time, turning a discipline I loved earlier in my career into this generation’s keypunch cards.