The AI That Fixes Code You Didn't Ask It To: The Hidden Cost of Over-Editing
“I asked it to change one line. Three files came back modified.” If you’ve spent any time on Hacker News or r/ExperiencedDevs lately, you’ve seen this complaint. Claude Code, Cursor, Copilot — they all do it. Developers now have a name for the behavior: over-editing. And the tools marketed as productivity multipliers are, for a growing chorus of engineers, just shifting the work from writing code to policing diffs.
What over-editing actually looks like
Over-editing is exactly what it sounds like: the agent does more than you asked. You request a one-line bug fix. You get renamed variables, a helper function split in two, re-sorted imports, and freshly rewritten comments you never wanted.
On the surface, this looks like a conscientious colleague. In practice, it’s noise that eats your reviewer’s time. Mix intentional changes with cosmetic ones in a single diff and the reviewer has to re-verify every line. The PR balloons. The chance of a regression slipping through climbs with it. The pitch was “AI does the work for you.” The reality is often “I now re-audit everything AI produced.”
Why LLMs can’t keep their hands off
The root cause is baked into training. LLMs learned from an enormous corpus of “good code,” and somewhere in that process they absorbed an improvement bias — a reflex to clean up anything that looks improvable. Chain-of-thought reasoning amplifies it: the model is rewarded for taking a broad view of the problem, which translates directly into touching more files than strictly necessary.
Mario Zechner of Mastra hit this head-on in his March 31 video, “I Hated Every Coding Agent, So I Built My Own” — 159,000 views and 4,837 likes, which tells you how raw this nerve is. His argument, stripped down: existing coding agents make too many decisions on your behalf and call it helpfulness.
The three hidden costs
Review cost. A 10-line diff becomes a 200-line diff. Review time scales roughly linearly, and reviewer attention does not.
Token cost. When the agent reads and edits files it was never asked to touch, input and output tokens explode. Nate Herk’s April 2 video “18 Claude Code Token Hacks in 18 Minutes” pulled 189,000 views and 6,784 likes — a clean proxy for how token-conscious the community has become. Over-editing is one of the biggest offenders in that bill.
Trust cost. The more code the agent touched, the more a developer wonders what landmine is buried inside the diff. Every PR becomes a suspicion exercise.
How to train an agent to edit surgically
The only reliable fix is explicit constraint. “Only change this function, don’t touch anything else” in a prompt helps, but agents break that rule routinely. Patterns that developers are reporting actual success with:
- Pin a rule like “Surgical edits only. No unrequested refactors.” at the top of the prompt, or hard-code it into
CLAUDE.mdor.cursorrules. - Scope by line range, not file. “Modify L42–L55 only” leaves less room for creative expansion than “fix the function in
auth.py.” - Break work into smaller units. The Hidden Layer’s April 12 video introduced the idea of Atomic Skills — don’t hand the agent a sprawling task; decompose it into the smallest coherent operations and run them one at a time. It’s a structural way to suppress over-editing rather than begging the model to behave.
The real question is delegation, not intelligence
The over-editing debate isn’t really about whether AI is smart. It’s about how much judgment you’re willing to hand over. Delegate broadly and you get surprises in your diff. Delegate narrowly and the automation barely pays for itself. Finding that line — per task, per codebase, per teammate — is turning into the defining craft skill of working with AI agents in 2026.
So how much discretion are you giving your coding agent? And how much of your week is spent undoing what it decided to fix on its own?
Comments
Loading comments...