arXiv Draws a Line: Hallucinated Citations Will Cost You a Year

Imagine chasing down a citation in a paper, only to discover the referenced study doesn’t exist. Not misquoted — doesn’t exist. Welcome to academic publishing in the LLM era, where fabricated references have become common enough that arXiv, the world’s largest preprint server, is now threatening one-year submission bans for authors who include them.

Why arXiv Is Finally Swinging Hard

arXiv is where physics, CS, and math researchers post work before formal peer review. It handles more than 20,000 submissions a month, and since 2023 a growing share of them have been LLM-assisted. The byproduct: bibliographies sprinkled with citations that look impeccable but lead nowhere.

Researchers call these hallucinated citations. The format is flawless — “Smith et al. 2021, Journal of XYZ” — but the paper, the journal issue, sometimes the journal itself, simply isn’t real. ChatGPT is doing to bibliographies what it’s always done to facts: confabulating with confidence.

What a One-Year Ban Actually Costs

Until now, arXiv’s stance was hands-off. Use AI if you want, just take responsibility. That posture has hardened. Authors caught submitting papers with fabricated references will face up to a one-year block on new submissions.

Preprint servers exist because peer review is slow and research moves fast. Being locked out of arXiv for twelve months effectively means disappearing from the conversation, a serious career hit in fields where citation counts and timing matter. This isn’t a slap on the wrist: it cuts an author off from the channel their field uses to share new work.

The Verification Burden Snaps Back to Humans

The message is blunt: use the tools, but stop copy-pasting their output into your bibliography. Citation verification has always been table stakes for a researcher. The convenience of generated prose was quietly eroding that habit.

Medical journals have been sounding the alarm for a while. One audit of GPT-generated medical abstracts found that roughly 40 to 60 percent of the references were inaccurate or fictitious. arXiv’s policy is the first major infrastructure-level response to a problem the literature has been documenting for two years.

Will Other Platforms Follow?

Almost certainly. Nature and Science already require AI-use disclosure. Some conferences outright reject AI-generated submissions. But arXiv is the first to single out a specific failure mode — the non-existent citation — and attach a real penalty to it.

The interesting question is enforcement. No human team can manually verify tens of thousands of bibliographies a month. Expect automated citation-checking pipelines to spin up fast — AI catching the messes AI made, which is becoming a recurring theme in 2026.
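
What might such a pipeline look like? The sketch below is one hypothetical shape, not arXiv's actual tooling, which the policy does not describe. It assumes references are plain strings, that a DOI is the handle used for verification, and that a caller supplies a lookup function (called lookup here, an invented name) returning the registered title for a DOI or None if the DOI is unknown. In practice that function might query a DOI registry; here it is left pluggable so the logic can be run offline.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Loose pattern for DOIs as they usually appear in reference lists.
# Assumption: this will not match every valid DOI; it is enough for a sketch.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


@dataclass
class CheckResult:
    reference: str
    status: str  # "ok", "no_doi", "unresolved", or "title_mismatch"


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation so titles compare loosely."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def extract_doi(reference: str) -> Optional[str]:
    """Return the first DOI-looking token in the reference, if any."""
    match = DOI_PATTERN.search(reference)
    if match is None:
        return None
    # Trim punctuation that commonly trails a DOI at the end of a sentence.
    return match.group(0).rstrip(".,;)")


def check_reference(
    reference: str,
    lookup: Callable[[str], Optional[str]],  # hypothetical registry lookup
) -> CheckResult:
    """Classify one reference; anything other than "ok" needs a human look."""
    doi = extract_doi(reference)
    if doi is None:
        return CheckResult(reference, "no_doi")
    registered_title = lookup(doi)
    if registered_title is None:
        # The DOI is well formed but the registry has never heard of it.
        return CheckResult(reference, "unresolved")
    if normalize(registered_title) not in normalize(reference):
        # A real DOI attached to a different paper than the one cited.
        return CheckResult(reference, "title_mismatch")
    return CheckResult(reference, "ok")
```

A checker like this can only flag candidates. A missing DOI is common for older or non-journal work, so "no_doi" is a prompt for manual review rather than evidence of fabrication, and the final judgment still sits with a person.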

The New Line, and Where It Leaves You

Banning AI from research isn’t on the table. LLMs are already woven into literature review, drafting, and debugging. arXiv’s compromise is reasonable: use whatever tools you want, but the buck stops with the author.

If you’re a researcher — or honestly, anyone publishing AI-assisted writing — there’s a question worth sitting with. When did you last manually verify a fact, a number, or a source that a model handed you? Convenience compounds. So do its bills.
