AI Coding Agents Need XP

7 min read

Agentic Coding Needs a Spine

29.05.2026, By Stephan Schwab

Agentic coding did not make old software habits obsolete. It made them more visible. When a machine can produce code at absurd speed, the disciplines that generations of developers learned the hard way matter even more: small steps, fast feedback, clear responsibilities, and one source of truth for each rule. The agent is new. The physics are not.

If you do not work in software day to day, the core problem is simple enough: the machine writes code quickly, but quick is not the same thing as coherent. Plenty of software fails for the same reason a badly organized kitchen fails. Knives in one drawer, forks in another, spices everywhere except where cooking happens, and three half-open bags of rice because nobody knew there was already rice in the house. The work continues. The waste multiplies.

Software gets that same chaos when anyone, human or machine, copies the same rule into five files, invents a new class for every whim, or refactors code just because it feels intellectually itchy. The result may still run. It just becomes expensive to change.

XP Still Works Because Physics Still Works

"Faster code generation does not repeal feedback loops."

Extreme Programming was built for a world where software changes constantly and developers are wrong several times a day. That world did not go away because a large language model learned to autocomplete whole files.

The XP habits that mattered in 2000 matter even more with agents in 2026:

Write a failing test first.
Make the smallest change that turns red into green.
Refactor only while tests stay green.
Integrate constantly instead of saving surprises for later.

None of this is glamorous. Good practice rarely is. Good practice mostly looks like refusing to lie to yourself.

Agentic coding raises the stakes because the model can produce a lot of plausible nonsense before a human has even finished their coffee. That is why tests beat instructions. The habit matters because the feedback loop matters. If you remove the loop, you are left with speed and hope, which is how expensive mistakes get mass-produced.

DRY Means One Rule, One Home

"Duplication is not just ugly. It is how one bug becomes a subscription plan."

DRY gets oversimplified into a slogan about reducing keystrokes. That misses the point. The point is not typing less. The point is making sure one business rule does not quietly fork into several competing versions.

If your discount logic lives in a controller, a background job, a checkout service, and a reporting script, you do not have reuse. You have four future disagreements waiting for a release weekend.

This is exactly the kind of mess agents create when discipline is weak. The model sees similar code in two places and happily pastes a third version because that solves the immediate task. It is not malicious. It is opportunistic. Fast output without design pressure always drifts toward duplication.

That is why the old rule still matters:

Reuse an existing domain object or helper before creating a new one.
If a business rule appears twice, extract it before the task is considered done.
Do not copy validation, pricing, authorization, or mapping logic across modules.

That is the human job when working with an agent. Not romanticizing craftsmanship while accepting avoidable duplication. The old habit still wins: keep the rule in one place, then defend that place every time the code changes.

For readers outside day-to-day software work, DRY is just this: when a company policy changes, you want one place to update it, not seven places to forget it.

OOP Is About Responsibility, Not Inheritance Cosplay

"Good objects own behavior. Bad objects are storage bins with a fake mustache."

Object-oriented programming also gets abused by people who confuse vocabulary with design. A class hierarchy is not architecture. It is often just a longer route to regret.

Good OOP, especially in agentic coding, means a few plain things:

Each object or module should have one clear reason to change.
Behavior should live close to the data it depends on.
Public interfaces should stay small.
Composition should win over inheritance unless inheritance is obviously simpler.

That matters because agents love inventing abstractions. Give them a vague prompt and they will gladly produce BaseManagerFactoryAdapter as if satire were a design pattern.

The habit that keeps this sane is older than the current wave of tools: prefer small objects with clear responsibilities, extend an existing abstraction only when the surrounding code already uses that pattern, and do not introduce a new layer unless the current one is failing.

The practical test is brutally simple: if a human developer cannot explain where a rule lives and why, the agent will not keep it coherent either.

TDD Gives the Agent a Metronome

"Without red, green means nothing."

A lot of teams say they want the model to do TDD when what they actually mean is, “please also write some tests so this looks respectable.” That is not TDD. That is paperwork with assertions.

If you want an agent to behave like a disciplined developer, the development loop has to stay intact instead of turning tests into decorative paperwork.

The loop is old and still undefeated:

Read the existing tests and code before changing anything.
Add or update one failing test that describes the next behavior.
Implement the smallest change that makes that test pass.
Run the relevant tests.
Refactor only after green.

This gives the model a rhythm. It narrows the search space. It turns the task from “write something impressive” into “satisfy this specific contract without breaking the rest.” Models need that discipline even more than humans do, because they are very good at sounding correct while being structurally wrong.

If you want the broader case against freestyle prompting, vibe coding is not software development. It is improvisation with better autocomplete.

What Good Agent Instructions Actually Say

"Tell the model how to work. Let tests and tools decide whether it succeeded."

Most bad instruction files share one flaw: they try to predict every coding decision in prose. That is unwinnable. The file swells, contradictions appear, and the agent cherry-picks whatever token pattern looks easiest in context.

The better approach is smaller and harsher. Your instruction file should mostly cover:

Workflow: test first, smallest change, green before refactor.
Design guardrails: keep code DRY, prefer small objects, reuse existing abstractions.
Safety rules: do not weaken tests, do not invent dependencies casually, ask when requirements are ambiguous.
Completion rules: tests pass, lint passes, diff stays focused.

That is enough to shape behavior without pretending markdown can replace judgment.

In other words, instruction files are for habits. Tests are for proof. Linters are for consistency. Keep each tool doing its own job.

A Better Prompting Habit for Teams and Solo Developers

"The best agent prompt is often a one-line task plus a repo that already tells the truth."

There is a tempting fantasy around agentic coding: write a clever master prompt once, then let the machine carry the craft for you. That fantasy survives because the first few demos usually work.

Then reality arrives. The model duplicates logic. It changes an interface nobody meant to touch. It adds a helper that almost matches the existing helper but not quite. Suddenly the team is writing prompt law instead of software.

The cure is not more drama in the prompt. The cure is a repo that tells the truth:

Tests explain what the system must do.
Instruction files explain how the agent should behave while changing it.
Existing code shows what patterns are already alive.

Once those three line up, prompting gets simpler. You can say, “Add VAT handling for Swiss invoices,” and trust the agent to enter the codebase through the front door instead of smashing a window.

That matters for solo developers too. A solo repo can still become a swamp. In fact it often becomes a swamp faster, because there is nobody else around to complain before the reeds are shoulder-high.

Hosted Example Files

The examples below are intentionally short. They do not try to serialize the entire craft of software development into markdown. They establish discipline, then hand enforcement back to tests and tooling.

Steal them. Trim them. Make them fit your stack. But keep the center of gravity where it belongs: tiny feedback loops, clear responsibilities, and less duplicated stupidity.

Talk It Through

Tell me what is happening. I listen, ask a few practical questions, and reflect back what I see: where the risk may sit, what may be blocking delivery, and what looks worth checking next. No pitch, no obligation. Confidential and direct.

Talk it through. Practical reflection, no pitch.

Start a Conversation

🎭 This Week's Episodes

Telenovelas show what we can't say in client meetings. The drama is heightened, but the patterns are real.

The Run Tells the Truth

Ethan Carter turns the harness into a habit: every change runs the batch in daylight, and the Cucumber scenarios refuse to lie. Nathan Cole feels s...

The Production Readiness Review

With the staging environment stable and the database replicas synchronized, the team prepares for the final production readiness review of the tour...

Technical Consultancy

Embedded Delivery Partner

Embedded into your team as an active contributor, reducing delivery friction and helping important work move cleanly.

Technical Advisor

Peer-style technical assessments before far-reaching decisions; reduce architectural and product risk early.

Product & Delivery

Ship working software to real users earlier. Measure impact and adapt based on evidence rather than assumption.

Custom Software Development

High quality, maintainable software. Short-term augmentation that leaves lasting capability in your own team.

AI Coding Agents Need XP

Agentic Coding Needs a Spine

XP Still Works Because Physics Still Works

DRY Means One Rule, One Home

OOP Is About Responsibility, Not Inheritance Cosplay

TDD Gives the Agent a Metronome

What Good Agent Instructions Actually Say

A Better Prompting Habit for Teams and Solo Developers

Hosted Example Files

Talk It Through

🎭 This Week's Episodes

Technical Consultancy

Embedded Delivery Partner

Technical Advisor

Product & Delivery

Custom Software Development

Recent Articles

Explore More

AI Coding Agents Need XP

Agentic Coding Needs a Spine

XP Still Works Because Physics Still Works

DRY Means One Rule, One Home

OOP Is About Responsibility, Not Inheritance Cosplay

TDD Gives the Agent a Metronome

What Good Agent Instructions Actually Say

A Better Prompting Habit for Teams and Solo Developers

Hosted Example Files

Talk It Through

Related Articles

Agentic Coding Needs an Air Traffic Control Shift Model

AI Needs Developers, Not Just Data Scientists

Git Worktrees Are Scratch Lanes for Coding Agents

Newsletter

🎭 This Week's Episodes

Technical Consultancy

Embedded Delivery Partner

Technical Advisor

Product & Delivery

Custom Software Development

Recent Articles

Explore More