By the Thomson Reuters Institute’s 2026 count, generative-AI use among tax and accounting professionals has roughly doubled in a year, and most firms now either use it or plan to. That makes “can I trust AI for tax research” an incomplete question, because in practice you are probably already using it. The more useful question is narrower: trust it to do what?
The narrower question is more useful because its answer splits cleanly. There is one job AI does better than almost any tool on your desk, and one job it is structurally unfit for. Most of the trouble in practice comes from not noticing where the line between them falls.
Two jobs wearing one interface
When you ask a model a tax question, it does two very different things inside a single fluent paragraph. It performs language work: framing the issue, explaining a rule in plain English, organizing an argument, drafting the prose. And it performs authority work: telling you which Code section governs, what it says, and where the number comes from.
On screen these are indistinguishable. They arrive in the same confident voice, in the same paragraph, often in the same sentence. But they are not the same act, and they do not carry the same reliability.
The language work is genuinely excellent. A frontier model will sharpen a vague client question into a precise legal one and explain a gnarly provision more clearly than most treatises. That is what next-token prediction is built for.
The authority work is where it breaks. When Stanford researchers tested leading models on verifiable legal questions, the models fabricated an answer between 58 and 88 percent of the time. A tax citation fails the same way. Ask about the qualified business income deduction and you may get a confident reference to §199A(d)(4)(C), except that §199A(d) stops at paragraph (3), so the subsection does not exist. The model did not misremember it. It generated it, the same way it generates prose, as the most plausible continuation of a citation-shaped string. (The full mechanism is its own piece.)
The trap is that both halves wear the same face. The explanation can be flawless and the citation underneath it invented, with no change in tone to warn you.
Where it is reliable, and where it is not
Sort the work by which job it actually is, and the line gets practical:
| Task | Can you trust it? |
|---|---|
| Explaining a concept or rule in plain language | Yes; it is language work |
| Drafting or restructuring a memo or client letter | Yes; you own the facts and the conclusion |
| Summarizing a document you supplied | Yes; it works from your text, not its memory |
| Brainstorming issues or positions to research | Yes, as a starting list, not a final one |
| Telling you which section governs | No; verify against primary source |
| Quoting statutory or regulatory text | No; confirm it word-for-word at the source |
| Stating a threshold, dollar amount, or effective date | No; these are low-redundancy facts it invents fluently |
The pattern is one line: trust the model with the words, and verify everything that has to be exactly right. Risk also climbs with depth: a reference to §162 is usually safe, while §162(a)(2)(B)(iv) is exactly the kind of deep string a model is most tempted to assemble.
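The depth point can be made mechanical. The Python sketch below counts the parenthesized levels in a section citation and sorts a list so the deepest strings get checked first. The regular expression and the names `citation_depth` and `deepest_first` are illustrative assumptions, not part of any tax library, and the pattern only covers simple citations of the form used in this article.

```python
import re

# Illustrative pattern only: a section number such as "162" or "199A",
# followed by zero or more parenthesized levels such as "(a)(2)(B)(iv)".
# Real citations take more forms than this handles.
CITATION_RE = re.compile(r"§\s*(\d+[A-Z]?)((?:\([0-9A-Za-z]+\))*)")


def citation_depth(citation: str) -> int:
    """Return how many parenthesized levels follow the section number."""
    match = CITATION_RE.fullmatch(citation.strip())
    if match is None:
        raise ValueError(f"not a recognized section citation: {citation!r}")
    # Each level contributes exactly one opening parenthesis.
    return match.group(2).count("(")


def deepest_first(citations):
    """Order citations so the deepest (highest-risk) ones are verified first."""
    return sorted(citations, key=citation_depth, reverse=True)


if __name__ == "__main__":
    for cite in deepest_first(["§162", "§199A(d)(4)(C)", "§162(a)(2)(B)(iv)"]):
        print(citation_depth(cite), cite)
```

The ordering only sets priority. A shallow citation still has to be opened and checked; it is simply less likely to be an assembled string.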
What “grounded” changes
The reason the authority work fails is simple: a general-purpose model has no copy of the Code to consult. It is not looking anything up. So a sterner prompt is not the durable fix. Telling the model to “only cite real sections” changes its tone, not the mechanism. The durable fix is to change where the citation comes from.
When a tool retrieves the section number and quoted text from primary authority and hands them to the model, rather than letting the model generate them, the failure mode disappears at the root. A fabricated §199A(d)(4)(C) cannot appear in a grounded answer, because it is not in the source to copy. The model goes back to doing the language work it is excellent at, and a retrieval layer supplies the parts that must be exact, with a URL you can open.
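A minimal Python sketch of that difference follows. It assumes a hypothetical in-memory index, `PRIMARY_SOURCE`, standing in for a real primary-authority database. The text fields are placeholders rather than statutory language, the URL is only the site's home page, and the names `Authority`, `retrieve`, and `grounded_citation` are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Authority:
    citation: str
    text: str  # verbatim text as loaded from the primary source
    url: str   # a link the reader can open to check it


# Hypothetical stand-in for a primary-source index. A real tool would load
# the text and a section-specific URL from the source itself. The entries
# here are placeholders, not statutory language.
PRIMARY_SOURCE = {
    cite: Authority(cite, "[text loaded from primary source]", "https://uscode.house.gov/")
    for cite in ("§199A(d)(1)", "§199A(d)(2)", "§199A(d)(3)")
}


def retrieve(citation: str) -> Optional[Authority]:
    """Look the citation up. Return None instead of guessing."""
    return PRIMARY_SOURCE.get(citation)


def grounded_citation(citation: str) -> str:
    """Emit a citation only when it was found in the index."""
    authority = retrieve(citation)
    if authority is None:
        return f"{citation}: not found in primary source; do not cite"
    return f"{authority.citation} ({authority.url})"


if __name__ == "__main__":
    print(grounded_citation("§199A(d)(3)"))
    print(grounded_citation("§199A(d)(4)(C)"))
```

The design choice that matters is that the lookup can return nothing. A generator always produces a string, while a retrieval step can come back empty, and that empty result is what stops the invented subsection from reaching the page.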
What separates a usable AI tax tool from a risky one is whether its citations are retrieved or generated, and whether you can open them to check.
That distinction is the axis the whole landscape of AI tools for tax professionals should be read along.
A three-question trust test
Before you rely on any AI tax answer, whether from a chatbot, a platform, or a purpose-built assistant, run it through three questions:
- Where did the citation come from: retrieved or generated? If the tool cannot tell you it looked the section up in primary authority, assume it generated it.
- Can you open it? A real section resolves to a real page on uscode.house.gov or the eCFR. A fabricated one resolves to nothing, or to text plainly about something else. No link, no claim.
- Does the quoted text actually say that? A model can paraphrase a fake section into something that sounds right. It is far less able to produce verbatim language that matches a source that does not exist. When the quote and the cite disagree, trust neither.
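The three questions can be written down as a checklist. The Python sketch below is one hypothetical way to record the answers for a single citation and list which checks failed. The `CitedClaim` fields and the `trust_failures` function are assumptions made for illustration. Answering the questions, including opening the link, remains a manual step.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CitedClaim:
    citation: str
    quoted_text: str            # the language the AI answer attributes to the source
    retrieved: bool             # question 1: did the tool look it up?
    source_text: Optional[str]  # question 2: text found at the link, None if it did not resolve


def _normalize(text: str) -> str:
    """Collapse whitespace so line breaks do not cause false mismatches."""
    return " ".join(text.split())


def trust_failures(claim: CitedClaim) -> List[str]:
    """Return the failed checks. An empty list means all three passed."""
    failures = []
    if not claim.retrieved:
        failures.append("citation was generated, not retrieved")
    if claim.source_text is None:
        failures.append("citation does not resolve to a source page")
    elif _normalize(claim.quoted_text) not in _normalize(claim.source_text):
        # Question 3: the quote must appear verbatim in the source.
        failures.append("quoted text does not appear in the source")
    return failures


if __name__ == "__main__":
    # Placeholder strings, not statutory text.
    claim = CitedClaim("§199A(d)(4)(C)", "placeholder quote", retrieved=False, source_text=None)
    print(trust_failures(claim))
```

The verbatim comparison here is deliberately strict. A paraphrase fails it, which fits the rule that when the quote and the cite disagree, you trust neither.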
Ten seconds per citation beats an hour walking back a memo, and it is the same discipline whether you are drafting freehand or working inside a grounded tool. The full version of this, applied end-to-end, is the division of labor in an AI-drafted research memo: the model writes, primary authority decides what the law says, and you keep the judgment.
So, can you trust it?
Yes, conditionally, and the condition is the whole answer. Trust AI as a writer and an explainer; it is better at that than the tools it replaced. Do not trust it as the source of record for what the law is, unless it is grounded in primary authority and returns a link you can open.
The attorneys sanctioned for filing briefs with invented case citations were not careless people. By late 2025 a public database of these incidents had logged hundreds of filings built on authority that did not exist. They trusted a fluent paragraph the way they would trust a knowledgeable colleague, not realizing the thing producing it was a pattern-completer with no copy of the reporter. Tax practice runs on the same trust and the same exposure.
So the goal is straightforward. Keep using the model for what it does well, and give it a grounded source of authority for the parts that have to be exact.
This article is general information, not tax advice. Capabilities of AI tools change quickly; confirm any citation against primary source before relying on it for client work, and test any tool against the questions above.