How Software Developers Fail

“Software developers fail in two ways: they build the thing wrong, or they build the wrong thing.”

These two failure modes are worth exploring independently, because they have different root causes and very different consequences. Agentic AI “developers” face the same challenges, but can fail in both ways much more rapidly than their human counterparts.

The Two Ways Developers Fail

1. Building the Thing Wrong

Building the thing wrong means the technical execution is flawed. The requirements may have been understood correctly, but the implementation fell short.

Some examples of building the thing wrong:

  • Bugs that cause incorrect behavior
  • Poor performance that makes the software unusable
  • Security vulnerabilities that put users at risk
  • Fragile architecture that makes the system difficult to maintain or extend
  • Code so complex or tangled that no one can safely change it

These are the failures we most commonly talk about in software engineering. Tools like code reviews, automated testing, static analysis, and CI/CD pipelines all exist primarily to help catch and prevent these kinds of failures.

2. Building the Wrong Thing

Building the wrong thing means the technical execution might be flawless, but the software doesn’t solve the right problem. The product might work without technical issues. It compiles. It runs without errors.

However, there are signs that you’ve built the wrong thing:

  • It solves a problem nobody actually has
  • It solves a problem in a way nobody actually wants
  • It addresses yesterday’s problem, not today’s
  • It delivers features users asked for, but not what they actually needed
  • It makes incorrect assumptions about how the solution should be implemented

This is subtler and often more costly. You can have a beautifully crafted, bug-free, performant application that delivers zero value because it was solving the wrong problem all along. All that effort - wasted.

The entire agile movement, lean startup methodology, and practices like user story mapping and continuous customer feedback exist largely to combat this second failure mode. And unfortunately, it’s not as easy to automate. There isn’t a compiler or unit test suite or linter that will tell you your software is what the customer wanted. Practices like Acceptance Test Driven Development (ATDD) can certainly help, but they generally require buy-in and effort that extends well beyond the development team.
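To make the ATDD idea concrete, an acceptance test encodes the customer’s expectation in executable form. Here’s a minimal sketch in Python; the discount rule and the `apply_discount` function are hypothetical stand-ins for whatever a team would actually agree on with their domain expert:

```python
# Acceptance criterion (hypothetical, agreed with a domain expert):
# "Orders of $100 or more get a 10% discount; smaller orders get none."

def apply_discount(order_total: float) -> float:
    """Return the order total after applying the agreed discount rule."""
    if order_total >= 100:
        return round(order_total * 0.90, 2)
    return order_total

def test_large_order_gets_ten_percent_discount():
    assert apply_discount(100.00) == 90.00

def test_small_order_gets_no_discount():
    assert apply_discount(99.99) == 99.99

test_large_order_gets_ten_percent_discount()
test_small_order_gets_no_discount()
```

The value isn’t in the code itself but in the conversation that produces the criterion: the test fails until the software matches what the customer actually asked for, not what the developer assumed.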

Why This Distinction Matters

Building the thing wrong is often an individual skill issue, or perhaps a process issue for the development team. These failures are often easy to detect, and as such relatively quick to mitigate. With a sufficiently well-designed process, even very junior developers can contribute to the application with little risk of obvious technical bugs making it through reviews, test suites, and other gates into production.

The more dangerous failure is successfully shipping something that isn’t needed - or sometimes even wanted. This can give the development team a false sense of success, which makes the disappointment even larger when they learn they need to roll the changes back or rework them substantially. Almost always this is a failure of communication, not of programmer skill, and the effort required to correct it is usually larger, ongoing (you can’t just add another build step), and likely involves multiple teams within the organization. Scrum and XP both expect a domain expert representing the customer to be literally on the team with the developers, but it’s extremely rare for organizations to do this on a literal, full-time basis. More often, someone in a separate department has been given an extra role as “Product Owner” or “Domain Expert” so that developers can reach out with questions and perhaps meet with them periodically - which is hardly the same thing as a full-time team member.

The Age of Agentic Development

AI coding agents are transforming how software gets built. Tools like GitHub Copilot, Cursor, Claude Code, and others can now write, test, refactor, and deploy code with minimal human direction. The promise is enormous: dramatically higher developer productivity, reduced toil, faster iteration.

More code, faster.

But here’s the critical insight: AI agents don’t change which failure modes exist - they amplify both of them.

Agents Are Very Good at Building Things Wrong (at Scale)

AI agents can generate large volumes of code quickly. But they can also generate large volumes of buggy code quickly. They can:

  • Introduce subtle security vulnerabilities that are hard to spot in a code review
  • Write code that looks correct but has edge case failures
  • Overfit to the stated requirement while ignoring implied constraints
  • Confidently produce incorrect output (“hallucinations”)

The speed advantage of AI agents means that if the quality controls aren’t in place, you can accumulate much more technical debt and many more bugs in the same amount of time. You need automated tests, code review (human and automated), and robust CI/CD pipelines more than ever. Shipping fast is only an advantage if what you’re shipping works.
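The “looks correct but has edge case failures” pattern is easy to illustrate. Here’s a hypothetical Python sketch: a plausible-looking generated function, the fix an automated test would force, and the simple check that separates them:

```python
def average(values):
    """Plausible generated code: correct for typical inputs."""
    return sum(values) / len(values)  # ZeroDivisionError on an empty list

def safe_average(values):
    """The version an automated test would force: handles the empty case."""
    return sum(values) / len(values) if values else 0.0

# A simple automated check catches the edge case a quick review misses.
assert safe_average([2, 4, 6]) == 4.0
assert safe_average([]) == 0.0
try:
    average([])
except ZeroDivisionError:
    pass  # the naive version fails exactly where reviewers tend not to look
```

Whether an empty input should return 0.0, raise, or return None is itself a requirements question - which is precisely the kind of detail an agent will guess at if nobody specifies it.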

Much of the current work teams are doing as they evaluate whether and how AI agents fit into their development process involves adding guard rails to prevent the problems listed above: adding more tests; writing skills and markdown instructions telling the agent that the code must compile and the tests must pass; creating separate agents for writing, testing, reviewing, and more. Many of these efforts are effective, though often at the cost of many more expensive tokens.

Agents Are Extremely Good at Building the Wrong Thing (Confidently)

This is where things get really dangerous. AI agents are optimized to follow instructions. They will build whatever you ask them to build - quickly and confidently. If you give an agent a brief, poorly written or misaligned specification, it will execute on that specification with remarkable diligence. It may not push back or ask clarifying questions (unless you design the workflow to include that). It won’t notice that the feature you asked for contradicts one that already exists. It won’t ask whether users actually want this. In some cases it won’t even notice if the feature you asked for already exists - it will happily build another similar version of it!

An AI agent given a vague or wrong specification will deliver a vague or wrong product - just faster than a human developer would have.

This is why the skills that help teams build the right thing matter more than ever:

  • Clear requirements and user stories - Garbage in, garbage out. Most agents are designed to keep going until they feel they have solved the problem. If they don’t have all of the details, they’ll make assumptions. They’ll guess. Just like human programmers must often do, but without the human’s judgment and at much greater speed.
  • Ongoing stakeholder feedback - Agents can’t attend user interviews or stand-ups. Humans still need to gather and synthesize feedback and adjust direction.
  • Domain expertise - Understanding why a feature is needed helps catch misalignments before they’re baked into thousands of lines of generated code.
  • Product thinking - The ability to ask “should we build this at all?” is (today) a uniquely human skill, and it’s increasingly valuable.

The Compounding Effect

With human developers, the cost of building the wrong thing is bounded by human speed. A team of five developers working for a sprint might produce 10-15 features. If those features are wrong, the waste is manageable.

With AI agents, a small team can now produce 10-15 features per day. If those features are wrong, the waste - and the cleanup - scales accordingly. Worse, a large codebase full of wrong features creates compounding complexity: future agents (and humans) must work around all the wrong things that were built before. Agents are great at following existing patterns, so a codebase full of wrong features will be used as the model for the next feature.

The leverage that makes AI agents so powerful when pointed in the right direction makes them dangerous when pointed in the wrong direction.

What This Means for Teams

Agentic development doesn’t eliminate the need for good engineering practices - it raises the stakes for them. All of the things that increased the quality of outcomes for human software engineers do the same thing for AI agents. And they’re more important than ever.

Invest heavily in requirements and specifications. The clearer and more precise your inputs to AI agents, the better the outputs. Vague requirements that a human developer might naturally interpret reasonably will be implemented literally - or worse, inventively - by an agent.

Automate quality gates. AI-generated code needs automated tests, linting, security scanning, and other quality checks. These gates become the primary defense against building things wrong at scale. They’re also much easier to build out with the help of AI agents, so there’s little excuse not to set them up from the start.
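As a sketch of what such a gate can look like, here’s a minimal Python version. The specific tools named (ruff, pytest) are common choices, not something this article prescribes, and `run_gate` is a hypothetical helper:

```python
import subprocess

# Hypothetical quality gate: each entry pairs a check name with the
# command that runs it. Swap in whatever linters, test runners, and
# security scanners your team actually uses.
CHECKS = [
    ("lint", ["ruff", "check", "."]),
    ("tests", ["pytest", "-q"]),
]

def run_gate(checks=CHECKS, runner=subprocess.run):
    """Run each check command and return the names of those that failed.

    `runner` is injectable so the gate's pass/fail logic can itself be
    tested without shelling out to real tools.
    """
    return [name for name, cmd in checks if runner(cmd).returncode != 0]
```

In CI, a nonempty failure list would block the merge. The point is that the gate is cheap to stand up - especially with an agent’s help - and it runs on every change, human-written or generated.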

Keep humans in the loop on product decisions. Agents can accelerate execution, but they can’t replace judgment about what to build. Product managers, architects, and senior developers need to stay closely involved in defining and validating the direction.

Review and validate frequently. Short feedback loops matter even more with agents. Set up your process to move in small increments with frequent human review steps to ensure development remains on track.

Measure outcomes, not just output. The temptation with AI agents is to measure how much code was shipped. But what matters is whether users are getting value. Track usage, user satisfaction, and business outcomes alongside velocity metrics.

Optimize downstream processes. Today, AI agents can produce code many times faster than human software engineers. But producing code probably wasn’t the bottleneck before, and it certainly isn’t now. Increasing the output of non-bottlenecks in a system just results in more waste and slower delivery overall. Before unleashing your AI agents on your full product backlog, make sure you’ve built quality gates, automatic deployment checks, test environments where new features can be evaluated and signed off by humans, and so on.

Conclusion

The two failure modes of software development - building the thing wrong and building the wrong thing - have existed as long as software itself. They’re not new problems. But agentic AI development amplifies both, making the discipline of capturing good requirements, continuous feedback, and automated quality assurance more important than ever.

The developers and teams who thrive in the age of AI agents won’t just be the ones who adopt agents earliest or most aggressively. They’ll be the ones who combine the speed of AI with the judgment, domain knowledge, and customer focus that ensures they’re building the right thing, the right way.

Moving fast only matters if you’re moving in the right direction. Like the classic quote: “We’re lost, but we’re making good time!”


Did you find this helpful? Let me know on BlueSky or subscribe to my newsletter for more content like this!