RC RANDOM CHAOS

EY Canada's 2026 report cited papers that don't exist

EY Canada published a cybersecurity report with mostly hallucinated citations. Here's what that means for how you should read threat intelligence.

The report

In early 2026, EY Canada published a cybersecurity report aimed at executives and policy readers. A researcher who checked the footnotes found that most of the citations did not resolve to real sources. Some pointed to papers that do not exist. Some named real authors attached to titles those authors never wrote. Some URLs returned 404s, and a few led to unrelated content. The report was pulled and reissued, but the original was already in circulation, already screenshotted, already quoted in at least one secondary briefing.

This is not a story about EY being uniquely careless. It is a story about what happens when a generative model writes the bibliography and no one with subject-matter expertise reads it before publication. The mechanics matter, because the same mechanics are operating inside thousands of other reports right now, and most of them will not be audited.

What hallucinated citations actually are

A large language model trained on text can produce a citation that looks correct in every surface feature: author surname, plausible journal, a year, a volume number, a DOI-shaped string. The model is not retrieving a record from a database. It is generating tokens that fit the statistical pattern of “academic citation.” The output is a forgery in the literal sense - a thing made to resemble a record without being one.

The failure mode is well documented. Stanford’s RegLab found that commercial legal AI research tools hallucinated between 17 and 33 percent of the time, depending on the product. A 2024 audit of medical literature summaries from general-purpose models showed citation error rates between 28 and 91 percent. None of this is secret. Anyone building a workflow around LLM-generated reports has had the chance to read the literature on this specific failure for at least two years.

What is new is the institutional confidence: a Big Four firm putting its name on output that was clearly not verified.

Why threat intelligence is the worst place for this

Most professional writing tolerates some bibliographic drift. A misattributed quote in a marketing whitepaper is embarrassing. A misattributed quote in a threat report is operational poison.

Threat intelligence has a specific epistemic shape. A reader uses it to make decisions: whether to fund a control, whether to add a detection, whether to brief a board, whether to escalate to a regulator. Each of those decisions sits on a chain of citations. Vendor A cites a CERT advisory. The CERT advisory cites a Vendor B blog post. The blog post cites a researcher’s GitHub repo. The repo references a CVE. If any link in that chain is invented, the decision is being made against a phantom.

Threat intelligence already has a credibility problem. Attribution is hard. Sample sizes are small. Vendors have commercial reasons to overstate severity. The whole field runs on trust in the citation graph - on the assumption that when one report says “as documented by X,” you can go read X. Hallucinated citations break the graph. They do not just add noise; they make the noise indistinguishable from signal.
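The way one missing source contaminates everything downstream can be sketched as a graph traversal. This is a minimal illustration, not a tool anyone is known to run: the document names and the `tainted_reports` helper are hypothetical, and the graph mirrors the Vendor A example above.

```python
from collections import deque

def tainted_reports(cites, unresolved):
    """Return every document that depends, directly or transitively,
    on a source that could not be resolved.

    cites: dict mapping a document to the list of sources it cites.
    unresolved: set of sources that do not exist or could not be found.
    """
    # Invert the graph: for each source, which documents cite it?
    cited_by = {}
    for doc, sources in cites.items():
        for src in sources:
            cited_by.setdefault(src, set()).add(doc)

    # Breadth-first walk upward from each unresolved source.
    tainted = set()
    queue = deque(unresolved)
    while queue:
        node = queue.popleft()
        for doc in cited_by.get(node, ()):
            if doc not in tainted:
                tainted.add(doc)
                queue.append(doc)
    return tainted

# Hypothetical citation chain, for illustration only.
cites = {
    "vendor_a_report": ["cert_advisory"],
    "cert_advisory": ["vendor_b_blog"],
    "vendor_b_blog": ["researcher_github"],
    "researcher_github": ["cve_record"],
    "board_briefing": ["vendor_a_report"],
    "soc_advisory": ["packet_capture"],
}

print(sorted(tainted_reports(cites, {"vendor_b_blog"})))
# ['board_briefing', 'cert_advisory', 'vendor_a_report']
```

One phantom in the middle of the chain taints three documents that never cited it directly, while the advisory built on its own packet capture is untouched.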

The specific harms

Three concrete failure modes follow from a polluted citation graph.

First, false consensus. A claim that originates in a hallucinated source can be picked up by a second report, then a third. By the fourth citation, the original phantom is no longer visible - what remains is a chain of real reports citing each other, each one assuming the one before it verified the chain. The defender community has watched this happen with attribution claims about specific APT groups, where a single shaky inference became conventional wisdom over five years.

Second, misallocated defense. If a report claims a specific TTP is rising in frequency, citing a source that does not exist, the SOC team that builds detections around that TTP is now spending budget on a fiction. The detection itself may be useful - TTPs are real even when citations are fake - but the prioritization is wrong. Resources go to the loudest hallucination, not the actual threat.

Third, regulator exposure. Reports like EY’s get cited in regulatory filings, in board packets, in cyber insurance underwriting. A claim built on a fabricated citation, repeated to a regulator, is a misstatement. Whether that misstatement becomes a legal problem depends on jurisdiction and on whether anyone gets hurt, but the exposure is now baked into the document trail.

What broke inside EY’s process

We can reason about the process from the artifact, without insider information.

A report with hallucinated citations went through some number of review steps without anyone clicking a link. That tells you the review was checking for format, tone, brand voice, and the absence of obvious legal liability. It was not checking for factual accuracy at the source level. Either the reviewers assumed the author had verified citations, or the author assumed the reviewers would, or both assumed the model had.

This is a workflow problem with a well-understood shape. When a tool produces output that is locally plausible but globally unverifiable, organizations that do not build verification into the workflow will ship unverified output. The default state of a generated document is unchecked. Verification has to be an active step with a named owner.

The fix is not “tell people to check citations.” People will not check 80 citations by hand under deadline pressure. The fix is a pipeline step that resolves every cited URL, every DOI, every CVE identifier, every author-title pair against a real index, and flags anything that does not resolve. That tooling exists. It is not expensive to build. Its absence in a Big Four cybersecurity report is the story.

What readers should do differently now

If you consume threat intelligence - vendor reports, analyst briefings, government advisories, consulting whitepapers - your trust model needs to update. Three changes are worth making.

Resolve the citation before you cite the report. If you are about to repeat a claim from a report in your own briefing, open the cited source. If it does not exist, the claim does not exist. This is unglamorous and it is the only reliable filter.

Weight reports by their evidentiary base, not their brand. A two-page advisory from a SOC team with packet captures and IOCs is more reliable than a 40-page report from a consulting firm with no primary observations. Brand prestige is not a substitute for evidence. The EY report had the brand. It did not have the verification.

Keep a list of reports you have personally checked and found clean. That list is your provenance set. When you cite from it, you are citing something you have validated. Everything else is hearsay until you check it.
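A provenance set can be as simple as a lookup keyed by a hash of the exact file you checked. The sketch below is one possible shape, assuming you keep the report bytes you verified; the `ProvenanceSet` class and the sample contents are hypothetical. Keying on content rather than title matters when a report is pulled and reissued: the reissue is a different document until you check it.

```python
import hashlib
from datetime import date

def fingerprint(report_bytes):
    """SHA-256 of the exact bytes that were checked."""
    return hashlib.sha256(report_bytes).hexdigest()

class ProvenanceSet:
    """Reports whose citations you have personally resolved."""

    def __init__(self):
        self._checked = {}

    def record(self, title, report_bytes, checked_on=None):
        # Store when the check happened alongside a human-readable title.
        self._checked[fingerprint(report_bytes)] = {
            "title": title,
            "checked_on": (checked_on or date.today()).isoformat(),
        }

    def is_verified(self, report_bytes):
        return fingerprint(report_bytes) in self._checked

# Placeholder contents standing in for real report files.
checked_version = b"advisory text, version checked by hand"
reissued_version = b"advisory text, reissued with changes"

provenance = ProvenanceSet()
provenance.record("SOC advisory", checked_version, checked_on=date(2026, 1, 15))

print(provenance.is_verified(checked_version))   # True
print(provenance.is_verified(reissued_version))  # False
```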

What publishers should do

If your organization publishes anything that someone might cite - research notes, blog posts, advisory bulletins, customer briefings - you need a citation-resolution step in your publishing pipeline. The minimum viable version is a script that takes the manuscript, extracts every URL, DOI, CVE, and author-title pair, and resolves each one. URLs that 404 get flagged. DOIs that do not resolve get flagged. CVE IDs that do not exist in NVD get flagged. Author-title pairs get checked against Google Scholar, Semantic Scholar, or the relevant domain index.
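A minimal sketch of that script, under stated assumptions, might look like the following. The regular expressions are approximations rather than full grammars, the `extract_identifiers` and `flag_unresolved` names are hypothetical, and the resolvers are passed in as plain functions so the example runs offline. In a real pipeline each resolver would make a network lookup (an HTTP request to the URL, a DOI resolver, the NVD). Author-title matching is left out because it needs a search index.

```python
import re

# Approximate patterns; good enough to find candidates, not to validate them.
URL_RE = re.compile(r"https?://[^\s<>)\]]+")
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s<>]+")
CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,}\b")

def extract_identifiers(text):
    """Pull every URL, DOI, and CVE ID out of a manuscript."""
    return {
        # Strip sentence punctuation that the patterns pick up at the end.
        "url": {u.rstrip(".,;:") for u in URL_RE.findall(text)},
        "doi": {d.rstrip(".,;:)") for d in DOI_RE.findall(text)},
        "cve": set(CVE_RE.findall(text)),
    }

def flag_unresolved(text, resolvers):
    """Return (kind, identifier) pairs that failed to resolve.

    resolvers: dict mapping "url" / "doi" / "cve" to a function that
    takes an identifier and returns True if it resolves.
    """
    flagged = []
    for kind, identifiers in extract_identifiers(text).items():
        resolve = resolvers.get(kind)
        for ident in sorted(identifiers):
            # A missing resolver counts as unverified, not as passed.
            if resolve is None or not resolve(ident):
                flagged.append((kind, ident))
    return flagged

# Placeholder manuscript and stub resolvers, for illustration only.
manuscript = """
See https://example.org/advisory-17 and doi:10.1000/xyz123.
The flaw is tracked as CVE-2021-44228 and CVE-2099-0001.
"""
known_urls = {"https://example.org/advisory-17"}
known_cves = {"CVE-2021-44228"}
resolvers = {
    "url": lambda u: u in known_urls,
    "doi": lambda d: False,  # stub: pretend no DOI resolves
    "cve": lambda c: c in known_cves,
}

print(flag_unresolved(manuscript, resolvers))
# [('doi', '10.1000/xyz123'), ('cve', 'CVE-2099-0001')]
```

Treating a missing resolver as a failure is deliberate: the default state of an identifier should be unchecked, the same as the default state of the document.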

This catches almost all citations to sources that do not exist. It does not catch citations that resolve but do not support the claim - a URL that leads to unrelated content, or a paper that exists but does not say what the report claims it says. That second class of error needs a human reviewer with subject-matter knowledge, and there is no shortcut. But the first class - pure fabrication - is mechanically detectable, and any organization shipping AI-assisted research without that step is choosing not to catch it.

Name the person who owns citation integrity. If no one owns it, no one does it. This is the same lesson Evan Francen has been writing for years about security ownership generally: a function without a named owner is a function that does not happen.

The structural problem

The deeper issue is that generative models have lowered the cost of producing plausible-looking research to near zero, while the cost of verifying that research has stayed the same. That asymmetry is going to widen. More reports will be produced. Fewer will be checked. The signal-to-noise ratio in public threat intelligence will get worse before it gets better, and the institutions best positioned to fix it - large firms with budgets and reputations - are the same institutions whose incentives push toward volume over verification.

The EY incident is useful precisely because it is embarrassing enough to force a conversation. The next one will be quieter. A regional firm, a sector-specific advisory, a government bulletin - all of them are running on the same workflow defaults, and most of them will not be audited. The reports will be cited anyway, and the citation graph will keep degrading.

The practical response is to treat unverified citations as a category of compromised intelligence. Not malicious, not necessarily wrong, but not yet trustworthy. Build your reading habits around that distinction. The reports that survive verification become your working set. Everything else is rumor with a logo on it.
