When AI Judges Resumes, AI-Written Resumes Win

More companies are letting an LLM do the first pass on job applications. Now imagine that LLM has a quiet preference for resumes written by other LLMs. That’s not a thought experiment anymore. Researchers just ran the test, and the results are uncomfortable.

LLMs Recognize Their Own — And Reward It

The setup was clean. Researchers fed GPT-4o, Claude, and Gemini two versions of the same candidate’s resume: one written by the human applicant, one polished by an LLM. Same person, same experience, same achievements. Just different prose. Then they asked the models who they’d interview.

The verdict tilted hard. Across nearly every model tested, the AI-polished version won more often than not. Some experiments saw the AI-written resume picked 60-70% of the time. The candidate didn’t change. Only the writing did.

Researchers are calling it AI self-preference bias. LLMs unconsciously rate text closer to their own output distribution as more competent, more polished, more hireable.

Why This Happens

It’s not just that AI writes “better.” It’s that LLMs are graders trained on a specific stylistic distribution, and they reward writing that hits those marks — clean transitions, balanced sentence length, consistent register.

Two forces drive it. First, an LLM treats text resembling its training distribution as “natural.” AI-generated prose is essentially the model’s native dialect. Second, human writing carries texture: typos, abrupt transitions, idiosyncratic phrasing. To an LLM scoring fluency, that texture reads as noise.

Here’s the catch. That “noise” is often exactly the signal a human recruiter wants — voice, personality, evidence of actual thinking. Polish that’s too uniform should raise a flag. The LLM reads it as a green light.

The Real Hiring Risk

The headline isn’t “LLMs are biased.” It’s that this bias is being deployed at scale right now. Greenhouse, Workday, and a wave of HR-tech startups have rolled out LLM-based screening over the past 18 months. Combine that infrastructure with self-preference bias and you get predictable failure modes.

Applicants who use ChatGPT to rewrite their resume get a quiet boost. Applicants who don’t — non-native English speakers, candidates without access to premium AI tools, anyone on the wrong side of the digital divide — get filtered out before a human sees them. AI fluency becomes the de facto first screen, regardless of whether it correlates with job performance.

There’s a second-order problem. If everyone runs their resume through an LLM, every resume starts to sound the same. Once the prose converges, the AI screener has nothing left to differentiate candidates on — so it falls back on shallower signals: school name, employer brand, keyword density. Diversity of voice collapses. Pedigree filtering gets stronger.

The AI vs. AI Arms Race Has Started

For applicants, the math is already settled. Use the LLM. Not using it costs you. Recruiters know this, so a counter-industry of “AI-generated resume detectors” has appeared on LinkedIn and Reddit threads almost overnight.

So the current pipeline looks like this: an AI writes the resume, an AI screens it, and another AI tries to detect the first AI. Humans barely touch the loop. The whole text-based evaluation system is quietly losing its credibility as a signal.

What Actually Helps

For employers, two adjustments matter. Stop using LLM scores as a hard pass/fail gate — treat them as advisory and put a human at the decision point. And shrink the role of text-based evaluation entirely. Portfolios, take-home tasks, short async video intros — anything where self-preference bias has less room to operate.

For candidates, one rule survives. Use AI to tighten your resume, but don’t let it scrub out your voice. The recruiter who actually decides is still probably human, and over-polished prose now reads as a tell rather than a strength.

The implications go beyond hiring. Anywhere we’ve put an LLM in the judge’s seat — peer review, admissions, content recommendations, even early-stage legal review — the same bias is waiting. We’re entering an era where AI evaluates AI. The question worth sitting with: how much of that verdict should we actually trust?

When AI Judges Resumes, AI-Written Resumes Win

LLMs Recognize Their Own — And Reward It

Why This Happens

The Real Hiring Risk

The AI vs. AI Arms Race Has Started

What Actually Helps

Comments

Related Logs

OpenAI Just Did the Thing It Criticized Anthropic For

Tesla's Buried Fatal Crashes and the Black Box Problem Nobody Wants to Fix

Your Next Employer Already Knows the Lowest Salary You'll Accept