A few weeks ago, I was having dinner with a good friend I’ve known for more than two decades. His background is in business and software development, but he subsequently moved into a variety of management and executive roles. Neither of us, I would say, has written any code in probably close to a quarter of a century, but we have retained a lively interest in technology and the trends shaping the industry.
Over a couple of good steaks and an excellent wine, my friend told me he was, out of pure professional interest, doing some coding again—but with the help of generative AI. Just for fun, he wanted to create a simple game and have the code produced via prompting ChatGPT. He described the iterative process he went through—refining his prompts to add more detail to the game, improve the user experience and, of course, correct errors. He was very pleased with the results, and rightly so. Generative AI made him a more productive programmer, absolving him of the need to write detailed, syntactically correct lines of code and enabling him to focus on the high-level function and results that he wanted to achieve. I suggested that his success might be partly due to the fact that he already had professional experience in writing software, and he agreed—he knew the principles of software design and what to expect from AI, and would be able to make corrections if his personal project headed in the wrong direction.
Feel the vibe
Around the same time as that dinner conversation, my news feed began to fill with stories about a concept with the New Age-sounding name of vibe coding. The idea is pretty much just what my friend had done: you prompt generative AI with the “vibe” of what you want your program to achieve and let it write the code for you. Then you tell it what you liked, what you didn’t, and what didn’t work, and it refines the code. Eventually, you have a working program. Sounds great? Well, yes and no. Let’s explore the pros and cons of this new programming paradigm.
Dario Amodei, CEO of Anthropic, the maker of the Claude generative AI platform, went on the record in March 2025 to say that, within three to six months, AI would write 90 per cent of the code for software engineers, rising to 100 per cent within a year. He further suggested that although software engineers will still provide the design specifications, even that work will eventually be taken over by AI. Vibe coding, it would seem, is the new normal; if Amodei is correct, even the vibes will be generated by the machine.
But let’s take a step back for a moment and look at vibe coding in context. I wrote about software development a couple of years ago. Although the objective of that post was to discuss how quantum programming differs from classical, I also covered some of the evolution of programming tools and techniques over the last few decades. The story of software development, in a nutshell, is one of increasing abstraction and reusability, constantly elevating the programmer’s work away from manually writing lines of code. Vibe coding, I’m afraid, might be a step too far along this path.
Early in my career, when I was teaching programming courses for IBM’s ill-fated OS/2 operating system, I made the students do all their lab exercises by editing files of code written in the C programming language. I figured this would foster a deeper understanding of how the APIs[1] really worked and how they related to the architecture of the operating system itself. My classes were consistently very highly rated, so I think it worked. But eventually, one of the first visual development tools was introduced: Easel for OS/2 allowed programmers to assemble their applications by dragging and dropping components, and the tool would generate the code. I added an optional extra module to my courses to cover this tool but did not drop any of the fundamental instruction. Using the visual tool did not excuse the students from first understanding the programming concepts and system architecture.
Now, of course, visual development tools are the norm. Using anything from the open-source Eclipse for Java to Microsoft’s Visual Studio and many others, software architects and developers are more productive than ever. The main point of the tools, though, is to simplify the coding process. By pointing and clicking, developers can define variables, create subroutines and classes, and even compile and test their code. The tool generates syntactically correct code, but the programmer is still very much a hands-on participant, needing to understand the details as well as the big picture of how and where the program will execute and interact with other software.
APIs and IPAs
I would argue that vibe coding, like the related process of building AI agents, is very different from using visual software tools. Its promise is to democratize software development beyond the elite cadre of professional programmers, allowing anyone without a background in technology to design and build software for themselves. But software engineering is a specialized skill, and Michael F. Buckley, a blogger on the platform Medium, warns that the industry should brace itself for a deluge of “AI-assisted garbage made by people who don’t know an API from an IPA.” Sarcasm aside, his point is that an amateur attempting to write an application with vibe coding won’t have the expertise to understand the generated code, whether it will work properly, whether it can be integrated with other enterprise systems, or whether it is even safe to use.
Now, in principle I’m all for democratization and a do-it-yourself approach, but I also think there should be constraints and guardrails. For example, I used to do some of my own home renovation work and got pretty good at basic carpentry and painting; I could even wire a light switch or solder a copper pipe if I had to. Beyond that, I knew my limits and gladly engaged a trained handyman to do anything more than the easy stuff. From construction and auto mechanics to flying an airplane or computer programming, we can do some things for ourselves but, for most of the complicated work, we rely on trained professionals.
Personally, I would rather sit back and consume the IPA while a professional software developer works with the API.
What could possibly go wrong?
Let’s look at the risks pointed out by Buckley.
One is that, just as with AI agents, organizations’ already-taxed IT departments will be burdened with deploying, securing and integrating all kinds of user- and AI-generated applications. Normally, software development goes through several well-defined phases, from conception and design, to multiple iterations of coding and debugging, to integration testing, stress testing and user-acceptance testing, before being deployed to production. All these steps are necessary to ensure that the software works as intended and doesn’t create any new vulnerabilities in the overall IT infrastructure. Even then, it’s normal for multiple rounds of patches to be applied between version releases—or for one program to expose hidden bugs in other, related applications. Any new piece of code normally goes through a lengthy deployment process before it is released and usable.
Therefore, somebody needs to understand the code and its documentation—both internal and external—and this will be a problem if the code is vibed with AI. Understanding code written by someone else is notoriously difficult, and software frequently outlives its authors—the Y2K issue from the late 1990s serves as a stark warning of what can happen when “legacy” code is forgotten but suddenly needs to be fixed. Who will know how to fix vibed code?
Well, if AI wrote the code, why don’t we ask AI to debug it? Microsoft researchers have put together a tool called debug-gym which is intended to enable generative AI systems to debug any software source code, including code it didn’t generate by itself. Samuel Axon, a senior editor at Ars Technica, describes the results. Three leading generative AI systems, Claude, ChatGPT-o1 and ChatGPT-o3, were tested both with and without using debug-gym. Claude raised its success rate in finding and fixing bugs from 37.2 per cent to 48.4 per cent, while the two versions of ChatGPT scored much lower—going from 10.7 per cent to 30.2 per cent and 8.5 per cent to 22.1 per cent respectively. Human software engineers, naturally, do considerably better. A debugging tool that is less than 50 per cent effective at best is, as Axon writes, clearly “not ready for prime time.”
Why does generative AI fail at debugging? Axon notes that the Microsoft researchers point to a “scarcity of data representing sequential decision-making behaviour … in the current LLM training corpus.” In other words, LLMs are not built to reason their way through a step-by-step process. The next step in Microsoft’s research will be to try to build a smaller, custom model that can serve up specialized data to the larger model. I’ll wait and see how that works out, but I suspect the inherently probabilistic nature of LLMs might undermine this approach altogether.
Debugging is important, too, because it turns out that vibe coding isn’t particularly good at creating anything more than the most basic programs. David Ramel writes in Visual Studio Magazine of his experience trying vibe coding using Microsoft’s tools. Using ChatGPT and Copilot, he was able to create a working to-do list app from scratch without having to write a single line of code. He then went on to try something slightly more complicated, creating a weather app using the same tools. And that, Ramel reports, is where “things went to hell. I spent hours and hours in an error-generating rabbit hole … with all fixes generating new errors.”
Programming, by its nature, is procedural. AI may need to get a lot less generative before it can get good at work that demands accuracy and attention to detail. We haven’t looked yet at AI hallucinations and how they might affect vibe coding.
I enjoy reading The Register for its often tongue-in-cheek and frequently sardonic perspective on the IT industry. Naturally, a recent article by Thomas Claburn with the term slopsquatting in its subtitle caught my eye. Hallucinated AI output is often called slop, and typosquatting is a known attack method for inserting malware into someone else’s software. Now the two have come together.
A common practice in software development is using code from publicly available packages that can be downloaded from online registries. It’s prevalent in Python but also common in many other programming languages. The programmer installs a package from a registry such as PyPI, adds an “import” statement referencing the package name, and all of its functions become available for use in the application. Of course, accuracy is important: if there’s a typo in the package name, the installation or the import should simply fail. Except typosquatters have created packages of malware, named like standard packages but using common typos, and uploaded them to the registries. If the programmer isn’t careful, they might pull the malware into their application, with potentially disastrous consequences.
According to Claburn, when generative AI is used for vibe coding, it hallucinates package names anywhere from five per cent to nearly 22 per cent of the time. Worse, the hallucinations often follow patterns, with the same nonexistent package names recurring. Claburn credits Seth Michael Larson of the Python Software Foundation with coining slopsquatting: malicious actors simply upload packages of malware under commonly hallucinated names, which the vibe coding AI happily pulls in while the human programmer is none the wiser.
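To make the mechanics concrete, here is a minimal sketch in Python of one sanity check a developer (or a build pipeline) could run before installing anything: asking PyPI’s public JSON API whether a package name exists at all. The name “requets” is my hypothetical typo, not something from Claburn’s article, and note the limitation: a slopsquatted package does exist on the registry, so an existence check catches only outright hallucinated names.

```python
"""Minimal sketch: ask PyPI whether a package name exists before installing it.
"requests" is a real, widely used package; "requets" is a hypothetical typo
used purely for illustration (whether someone has registered it may change
over time, which is exactly the problem with typosquatting)."""

import urllib.error
import urllib.request


def exists_on_pypi(package: str) -> bool:
    """Return True if PyPI's public JSON API recognizes the package name."""
    url = f"https://pypi.org/pypi/{package}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False  # 404: no package by that name, possibly a hallucination


for name in ("requests", "requets"):
    verdict = "found on PyPI" if exists_on_pypi(name) else "NOT FOUND"
    print(f"{name}: {verdict}")
```

In practice, teams add further safeguards such as pinned versions, lockfiles with hashes and private package mirrors, precisely because a plausible-looking name is no guarantee of safety.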
Peace out
As my friend experienced, vibe coding can be useful if you manage—or limit—your expectations. If you want to write a game or a simple app, it can help, as long as you know what you’re doing. If you want to mock up or prototype something quickly, vibe coding can be useful for that too. But if you don’t check the work carefully, you may be in for a surprise.
Vibe coding does not appear to be ready for use in enterprise software development. The productivity gained by automatically writing lines of code is more than lost in the extra effort required to integrate and maintain it and manage the security risks. I think Dario Amodei was wrong, and human enterprise software developers will still be around for a long time to come.
I wish nothing but the best of vibes to all—but please, do your own coding.
[1] APIs are Application Programming Interfaces, services provided by the operating system or other applications that can be used by your program. They can do anything from opening and reading files, to rendering windows on your screen, to managing the execution of other programs or accessing network services.
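To give a flavour of what that looks like in practice, here is a minimal sketch using Python’s standard library; the file and URL names are arbitrary examples:

```python
# A few everyday API calls, using Python's standard library as the example.
import os
import urllib.request

os.makedirs("demo", exist_ok=True)        # operating-system API: create a directory
with open("demo/hello.txt", "w") as f:    # file API: open a file and write to it
    f.write("Hello, API!\n")

# Network API: fetch a web page and print the HTTP status code.
with urllib.request.urlopen("https://example.com", timeout=10) as resp:
    print(resp.status)
```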