Why does AI make up court cases that don't exist?

Because it writes a citation the same way it writes prose: one token at a time, predicting the most plausible next token. A case name plus a reporter citation is a low-redundancy string in a shape the model has seen tens of thousands of times, so it completes the pattern fluently whether or not that opinion was ever decided. The citation is generated, not retrieved from any copy of the reporter.

How do I check whether an AI's case citation is real?

Open it. A real Tax Court opinion resolves to a real docket and reporter citation; a fabricated one resolves to nothing. Then read the holding and confirm the case actually says what the model claims it says. A case that does not exist cannot produce a real opinion to read.

If a case is real, is it safe to rely on?

Not necessarily. A real opinion can be cited for a holding it never reached, or it can have been reversed or overruled since it was decided. Confirm both the holding and the current treatment before you rely on it. A citation graph can show how often an opinion has been cited since it was decided. That is a signal of influence and currency, not an authoritative validity check.

Does taxmcp.io include Tax Court case law?

Yes, on the Pro+ plan. taxmcp.io indexes 49,000+ United States Tax Court opinions dating back to 1942. The get-case tool returns full opinion text by citation or name, and get-citing-cases shows how many later Tax Court opinions cite a given case.

Why AI Invents Fake Tax Court Cases (And How to Cite Real Ones)

In 2023, a New York lawyer filed a federal brief that cited Varghese v. China Southern Airlines Co., 925 F.3d 1339 (11th Cir. 2019). It was a strong case for his client. It was also entirely made up. ChatGPT had written it: the name, the reporter citation, the quotations, the judges. Opposing counsel went looking for the opinion and found nothing. A federal judge fined the lawyers involved and their firm $5,000, and Mata v. Avianca became the cautionary tale every litigator now knows by name.

A fabricated Tax Court case is the same object, and it breaks the same way.

If you research with a general-purpose model, you have probably already been handed cases like Varghese. You just may not have checked. The danger is not that the model is dumb. It is that the fake reads exactly like the real thing. The model knows the shape of a citation, not the docket behind it. For a CPA or tax attorney about to put a case in a memo, that gap is where the malpractice lives.

A case citation is a string the model finishes

A language model writes one token at a time, choosing whatever is most likely to come next. That is the entire mechanism, and it is the same one that makes the model invent Internal Revenue Code sections.

Renkemeyer, Campbell & Weaver, LLP v. Commissioner, 136 T.C. 137 (2011) is a near-random string in a shape the model has seen tens of thousands of times: party v. Commissioner, [volume] T.C. [page] ([year]). It has learned the shape perfectly. What it cannot learn from pattern alone is which specific names, volumes, and pages point to opinions that were actually decided.

So when your question calls for a case, the model produces the most plausible continuation of the pattern. Plausible and real are different properties, and it only optimizes for the first.

The model is not remembering a case and getting a detail wrong. It is assembling a citation that has never corresponded to an opinion. It was generated, not retrieved.
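The gap between shape and existence can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular model works internally: the regular expression is one reasonable encoding of the "party v. Commissioner, [volume] T.C. [page] ([year])" shape, the party names are deliberately invented placeholders, and KNOWN_OPINIONS is a two-entry stand-in for a real index.

```python
import random
import re

# One reasonable encoding of the shape described above (an assumption, not an official grammar).
TC_SHAPE = re.compile(
    r"^(?P<party>.+) v\. Commissioner, "
    r"(?P<volume>\d{1,3}) T\.C\. (?P<page>\d{1,4}) \((?P<year>\d{4})\)$"
)

# Stand-in for an index of decided opinions: only the two real cases named in this article.
KNOWN_OPINIONS = {
    "Renkemeyer, Campbell & Weaver, LLP v. Commissioner, 136 T.C. 137 (2011)",
    "Soroban Capital Partners LP v. Commissioner, 161 T.C. 310 (2023)",
}

# Invented placeholder parties; none of these is a real litigant.
PLACEHOLDER_PARTIES = [
    "Placeholder Partners, LLP",
    "Estate of Sample",
    "Hypothetical Holdings LP",
]


def plausible_citation(rng: random.Random) -> str:
    """Fill the citation shape with made-up values, as pattern completion would."""
    party = rng.choice(PLACEHOLDER_PARTIES)
    volume = rng.randint(100, 160)
    page = rng.randint(1, 600)
    year = rng.randint(1995, 2023)
    return f"{party} v. Commissioner, {volume} T.C. {page} ({year})"


def has_citation_shape(citation: str) -> bool:
    """True when the string looks like a regular Tax Court citation."""
    return TC_SHAPE.match(citation) is not None


def is_in_index(citation: str) -> bool:
    """True only when the citation is present in the stand-in index."""
    return citation in KNOWN_OPINIONS


if __name__ == "__main__":
    fake = plausible_citation(random.Random(0))
    print(fake, has_citation_shape(fake), is_in_index(fake))
```

Every string the generator produces passes the shape check and fails the index check. A format validator alone therefore tells you nothing about whether an opinion was decided.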

A real case can be just as dangerous

When Stanford researchers asked leading models specific, checkable questions about federal cases, the models fabricated an answer 69 to 88 percent of the time. But an invented case is only the most obvious failure. Case law adds two quieter ones.

Citation	Real?	What is actually there
Varghese v. China Southern Airlines Co., 925 F.3d 1339 (11th Cir. 2019)	No	Invented by ChatGPT and filed in Mata v. Avianca; the opinion, the quotes, and the cite never existed
Renkemeyer, Campbell & Weaver, LLP v. Commissioner, 136 T.C. 137 (2011)	Yes	Holds that partners’ shares from performing services are not shielded from self-employment tax by §1402(a)(13)
Soroban Capital Partners LP v. Commissioner, 161 T.C. 310 (2023)	Yes	Adopts a functional-analysis test for the limited-partner exception

The first one does not exist. The other two do, and they are a click from the real text, if your tool can find them.

The two quieter failures both involve real cases.

A real case, cited for a holding it never reached. The model attaches Renkemeyer to a proposition the opinion does not support, because the name fits the topic.
A real case that is no longer good law. The model has no sense of what happened after the decision date, so it will cite an opinion that was later reversed or overruled as if it still controls.

A citation that checks out as real can still be wrong.

There is also a wrinkle unique to the Tax Court: a regular “T.C.” opinion carries more precedential weight than a “T.C. Memo.” decision. The model cites both in the same confident register, with no signal that one is worth more than the other.
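The citation string itself usually carries the signal the model flattens. A minimal sketch, assuming the common forms "136 T.C. 137" for a regular opinion and "T.C. Memo. YYYY-NNN" for a memorandum opinion. The function name and labels are illustrative, and the memo citation in the usage lines is a zero-filled placeholder, not a real case.

```python
import re

# Assumed formats: "<volume> T.C. <page>" for regular opinions,
# "T.C. Memo. <year>-<number>" for memorandum opinions.
REGULAR = re.compile(r"\b\d{1,3} T\.C\. \d{1,4}\b")
MEMO = re.compile(r"\bT\.C\. Memo\. \d{4}-\d{1,3}\b")


def opinion_type(citation: str) -> str:
    """Label a citation as 'memorandum', 'regular', or 'unknown' from its format alone."""
    if MEMO.search(citation):
        return "memorandum"
    if REGULAR.search(citation):
        return "regular"
    return "unknown"


if __name__ == "__main__":
    print(opinion_type("Soroban Capital Partners LP v. Commissioner, 161 T.C. 310 (2023)"))
    print(opinion_type("Placeholder v. Commissioner, T.C. Memo. 0000-000"))  # placeholder, not a real case
```

The label only reflects how the citation is written. It says nothing about whether the opinion exists or what it holds.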

Before you cite

Until retrieval is part of your workflow, four checks stand between a fabricated case and a client deliverable.

Open the citation. A real opinion resolves to a real docket and reporter citation; a fabricated one resolves to nothing. Thirty seconds beats an afternoon walking back a position.
Read the holding, not the model’s summary. A model can describe a real case as standing for something it never held. Pull the opinion and confirm it actually says what you are about to claim.
Confirm it is still good law. A 2011 opinion can be undercut by 2014. Citation activity hints at currency; the subsequent history settles it.
Keep drafting and authority in separate lanes. Let the model write the memo. Do not let it be the source of record for what the courts have held.
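
The first three checks are per-citation, so they can be tracked as a simple gate. This is a hypothetical sketch: the class and field names are invented for illustration, and the fourth check (separate lanes) is a workflow rule rather than something recorded per case.

```python
from dataclasses import dataclass


@dataclass
class CitationCheck:
    """Tracks the per-citation checks for one authority (illustrative names)."""
    citation: str
    opinion_opened: bool = False      # the citation resolved to a real opinion
    holding_read: bool = False        # a human read the holding in the opinion itself
    good_law_confirmed: bool = False  # subsequent history was reviewed

    def missing(self) -> list[str]:
        """Return the checks that have not been completed, in order."""
        steps = [
            ("open the citation", self.opinion_opened),
            ("read the holding", self.holding_read),
            ("confirm it is still good law", self.good_law_confirmed),
        ]
        return [name for name, done in steps if not done]

    def ready_to_cite(self) -> bool:
        """A citation is ready only when no check is outstanding."""
        return not self.missing()


if __name__ == "__main__":
    check = CitationCheck("Renkemeyer, Campbell & Weaver, LLP v. Commissioner, 136 T.C. 137 (2011)")
    check.opinion_opened = True
    print(check.ready_to_cite(), check.missing())
```

Defaulting every flag to False means a citation starts out unusable until someone has done the work, which matches the posture the checks describe.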

These steps work. They also put the burden on you to catch every fabrication, on every citation, every time.

Cite from a corpus you can open

The fix is the same one that kills fake Code sections: change where the citation comes from. If the case is retrieved from an index of real opinions and handed to the model, rather than generated by it, a Varghese can never appear, because it is not in the source to copy.

That is what Tax Court case law on Pro+ is for. taxmcp.io indexes more than 49,000 United States Tax Court opinions going back to 1942. Ask for a case by name or citation and get-case returns the actual opinion text with a citation you can open. Want to know whether an opinion still carries weight? get-citing-cases shows how many later Tax Court decisions cite it, and which opinions it relies on in turn.

The citation graph shows influence and currency, not validity. You still confirm good law before you stake a position. The tool gets you to the real opinion fast; your judgment takes over from there.

Give it a real one to read from

The lawyers in Mata were not reckless. They trusted a fluent paragraph the way you would trust a colleague, when the thing producing it was a pattern-completer that had never opened a reporter. By late 2025, a public database of these incidents had logged hundreds of court filings built on cases that did not exist.

Tax practice runs on the same trust and the same exposure. The model knows the shape of a citation, not the docket behind it. Give it a real one to read from.

Add Tax Court case law to your research with Pro+ →

Or start with the bigger question: can you trust AI for tax research at all?