When the AI Wrote the Code, What Does the PR Description Say?

AI coding agents are shipping features faster than ever, but traditional PR descriptions break down when the code is AI-generated. Learn how to structure PRs for the AI era and maintain code quality.

July 2, 2026

The Doc Holiday Team

It's a Saturday night. You're looking at a piece of code that just brought down the payment service.

The code is clean. The variable names make sense. The tests pass. You know this because you merged it three months ago. But as you stare at it, trying to figure out why it's failing on a specific edge case, you realize something uncomfortable. You have absolutely no idea why it was written this way.

You didn't write it. An AI agent did. You just approved it.

And now you're trying to reverse-engineer the thought process of a model that doesn't actually have thoughts. You are looking for intent in a system that only knows pattern matching.

This is the reality of software engineering in 2026. The tools are incredible. AI coding agents like Devin, Cursor, and Copilot Workspace are generating complete features and fixing bugs with minimal human intervention. We are shipping faster than ever.

But we are also accumulating a new kind of debt. It's not technical debt. The code itself is often fine. It's cognitive debt. It's the erosion of shared understanding across a team.

When an AI writes the code, the traditional code review process breaks down. And the first thing to break is the pull request description.

We can all generate a feature in five minutes now, yet we struggle to explain it to each other.

What Traditional PRs Got Right (and AI Gets Wrong)

The classic PR description format assumes a human made deliberate choices. It assumes the author can explain their reasoning. "What I changed and why."

When a human writes a shortcut, they know it's a shortcut. They can explain in the PR why they chose a specific data structure or why they ignored a certain edge case. The PR description serves as a record of human intent.

When an AI agent generates the code, that context doesn't exist. The human who invoked the AI may not know why the agent chose one implementation over another. They don't know what alternatives the model considered, or what assumptions it baked into the logic.

The PR description can't just say "AI did it." That gives the reviewer nothing to evaluate. It creates a responsibility gap where everyone touched the code but nobody really owns the decisions.

A New Structure for the AI Era

If the human didn't write the code, what should they write in the PR description?

They should write the instructions.

When AI generates code, the engineer's role shifts from writer to director. The PR description needs to reflect that shift. It should document the human's role as the director and validator, not pretend the human wrote every line.

Teams need a lightweight format that separates human intent from AI output. It doesn't need to be a rigid template, but it needs to cover the bases. What problem were you trying to solve? What exactly did you ask the AI to do, and what constraints did you give it? What did it actually produce?

And then, the critical parts. What did you have to ask it to fix or refine? How did you prove this actually works? What edge cases did you test? And what are you still unsure about?
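
One way to make that format concrete is to treat it as a small data structure. The sketch below is an illustration, not a standard: the class name `AIPullRequestDescription`, its field names, and the `missing_sections` and `render` helpers are all hypothetical, chosen to mirror the questions above. Adapt them to whatever your team already uses.

```python
from dataclasses import dataclass, field, fields


@dataclass
class AIPullRequestDescription:
    """Hypothetical structure for a PR whose code an AI agent generated."""

    problem: str = ""          # What problem were you trying to solve?
    prompt: str = ""           # What exactly did you ask the AI to do?
    constraints: list[str] = field(default_factory=list)     # Constraints you gave it
    output_summary: str = ""   # What did it actually produce?
    refinements: list[str] = field(default_factory=list)     # What you asked it to fix or refine
    validation: list[str] = field(default_factory=list)      # How you proved it works
    edge_cases: list[str] = field(default_factory=list)      # Edge cases you tested
    open_questions: list[str] = field(default_factory=list)  # What you are still unsure about

    def missing_sections(self) -> list[str]:
        """Return the names of sections the author left empty."""
        return [spec.name for spec in fields(self) if not getattr(self, spec.name)]

    def render(self) -> str:
        """Render the description as plain text, one titled section per field."""
        parts = []
        for spec in fields(self):
            value = getattr(self, spec.name)
            title = spec.name.replace("_", " ").capitalize()
            if isinstance(value, list):
                body = "\n".join(f"- {item}" for item in value) or "(none recorded)"
            else:
                body = value or "(none recorded)"
            parts.append(f"{title}\n{body}")
        return "\n\n".join(parts)


if __name__ == "__main__":
    # Example values are invented for illustration only.
    draft = AIPullRequestDescription(
        problem="Webhook deliveries are dropped when the receiver times out.",
        prompt="Add retries with exponential backoff to the webhook sender.",
        constraints=["No new dependencies", "Cap total retry time at 60 seconds"],
    )
    print(draft.render())
    print("Still missing:", draft.missing_sections())
```

In this sketch an empty section counts as missing, so an author who needed no refinements would say so explicitly rather than leave the field blank. That is a deliberate design choice: it forces a statement of what was checked instead of letting silence stand in for it.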

[Figure: Three-column diagram showing human intent, AI output, and validation process flow. Caption: The PR description becomes a map between what was asked and what was actually built.]

This structure gives the reviewer something to actually review. They aren't just looking at a wall of syntactically correct code. They are evaluating whether the AI's output matches the human's intent.

The Review Becomes a Search for Failure Modes

If the PR description makes it clear that AI generated the code, reviewers need to change their approach.

The review is no longer about whether the author thought this through. It's about whether this actually solves the problem safely.

AI-generated code does carry real risks — research has shown that without deliberate review, vulnerabilities can slip through, particularly around authentication assumptions, dependency choices, and edge-case handling. But these are manageable risks, not categorical failures. The same research consistently shows that structured human review catches the patterns AI tends to get wrong.

Reviewers need to know what to look for. Hallucinated dependencies. Weak authentication assumptions. Missing back-pressure. Code that technically works but violates the team's architectural conventions. The review question shifts from "does this look clean?" to "how does this fail?" — and a PR description that documents the original prompt, the constraints, and the validation steps gives reviewers exactly the context they need to answer that question well.

They are looking for the invisible shortcuts. A good PR description is what makes them visible.
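
Some of these checks can be partly automated. As a minimal sketch for the hallucinated-dependency case, the Python below parses a file's imports with the standard `ast` module and asks `importlib.util.find_spec` whether each top-level module exists in the current environment. The function names `imported_top_level_modules` and `unresolvable_imports` are hypothetical, and the check assumes the reviewer's environment has the project's real dependencies installed.

```python
import ast
import importlib.util


def imported_top_level_modules(source: str) -> set[str]:
    """Collect the top-level module names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports are skipped.
            names.add(node.module.split(".")[0])
    return names


def unresolvable_imports(source: str) -> list[str]:
    """Return imported modules that cannot be found in the current environment."""
    missing = []
    for name in sorted(imported_top_level_modules(source)):
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat lookup errors as "not found" so the reviewer takes a look.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # The second import is an invented name used to simulate a hallucinated package.
    sample = "import json\nimport totally_made_up_pkg_xyz\n"
    print(unresolvable_imports(sample))
```

A clean result here only means every import resolves locally. It says nothing about whether the package is the one the author intended, so it narrows the reviewer's search rather than replacing it.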

[Figure: Tired engineer staring at perfect code with a question mark; a sticky note reads 'why tho'. Caption: The code compiles. The tests pass. The mystery deepens.]

Who Owns the Output?

This brings us to the uncomfortable part.

If something breaks in production, the human who submitted the PR is still responsible. The law and the incident review process don't care that Cursor wrote the bug.

But the lack of authorship context makes it harder to debug intent later. When you write code, you feel ownership. When an AI writes it, there is a psychological distance. You become a curator, not a creator.

That is why the PR description is so critical. It is the only durable record of what was supposed to happen. It forces the human to take ownership of the intent, even if they outsourced the typing.

Where the Documentation Actually Lives

If AI agents are writing code and shipping features, those features still need to be explained to the rest of the company. They need release notes. They need API documentation. They need changelogs.

The AI that wrote the code doesn't explain to users what changed.

This is the other half of the cognitive debt problem. AI agents generate code faster than teams can document it, so the intent behind each change quietly goes unrecorded.

If you fix the PR process — if you capture the human intent, the constraints, and the validation — you have the raw material for that documentation. You just need a way to translate it.

This is where Doc Holiday fits into the workflow. Doc Holiday is a documentation engine that generates output directly from engineering workflows. It takes the artifacts the PR contains — the commit history, the ticket references, the structured descriptions of intent — and turns them into the release notes and API references the rest of the company needs.

Critically, it addresses the same review gap the PR process is trying to close. A senior writer or engineer reviews every output in a dashboard before it ships. Edge cases are flagged. The structure Doc Holiday provides ensures that even when AI is producing the implementation, the human-facing documentation still gets written, reviewed, and validated against what the feature was actually supposed to do. The oversight isn't optional — it's built into how the product works.

AI coding agents are going to keep getting better at writing code. But they don't write explanations of their own work. Teams that adapt their PR workflows now — by creating structure for describing AI-generated contributions — will have an easier time maintaining code quality and institutional knowledge as this becomes normal.