Why Using ChatGPT for Legal Advice Could Cost You Your Case
Attorneys have been sanctioned, clients have lost credibility, and courts have issued formal warnings. Here is why relying on ChatGPT for legal citations is one of the riskiest decisions you can make in litigation.
ChatGPT is a remarkable piece of technology. It can write essays, summarise complex topics, generate code, and hold conversations that feel genuinely intelligent. But when it comes to legal research — finding real cases, citing real authorities, and providing advice that you can rely on in court — it is not just unreliable. It is dangerous.
The problem is not that ChatGPT is stupid. The problem is that it is confidently wrong. And in the legal system, where a single fabricated citation can result in sanctions, dismissed claims, and destroyed credibility, confident wrongness is worse than ignorance.
The Hallucination Problem
Large language models like ChatGPT do not look up information in a database. They generate text by predicting the next most likely word in a sequence, based on patterns learned from their training data. This means they do not retrieve case law — they construct text that looks like case law. The distinction is critical.
When you ask ChatGPT to cite cases supporting a legal proposition, it does not search a case database. It generates case names, citations, and summaries that are statistically plausible — the kind of thing that would exist if the legal system worked the way its training data suggests. Sometimes it gets lucky and produces a real case. Often, it does not.
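To see the mechanism concretely, here is a deliberately tiny sketch of next-token generation. The hand-written probability table stands in for the patterns a real model learns from billions of words of training data, and every name, citation, and number in it is invented for illustration:

```python
import random

# A toy autoregressive "model": for each two-token context, a probability
# distribution over plausible next tokens. Note what is absent: there is
# no database lookup and no truth check, only pattern continuation.
MODEL = {
    ("Smith", "v."):     {"Jones": 0.6, "Brown": 0.4},
    ("v.", "Jones"):     {"[2019]": 1.0},
    ("v.", "Brown"):     {"[2021]": 1.0},
    ("Jones", "[2019]"): {"EWCA Civ 482": 1.0},
    ("Brown", "[2021]"): {"UKSC 17": 1.0},
}

def next_token(tokens):
    """Sample the statistically most plausible continuation."""
    dist = MODEL[tuple(tokens[-2:])]
    choices, weights = zip(*dist.items())
    return random.choices(choices, weights=weights)[0]

def generate(prompt, n_new_tokens=3):
    tokens = prompt.split()
    for _ in range(n_new_tokens):
        tokens.append(next_token(tokens))
    return " ".join(tokens)

# Prints a fluent, correctly formatted citation, e.g.
# "Smith v. Jones [2019] EWCA Civ 482", whether or not any such case
# exists. Plausibility, not truth, drives every step.
print(generate("Smith v."))
```

Scale this up by a few billion parameters and you have the core of the problem: the output is always a plausible continuation, never a verified fact.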
The result is what AI researchers call "hallucination": output that is fluent, well-structured, and entirely fabricated. ChatGPT will give you a case name that sounds legitimate, a citation in the correct format for the jurisdiction, a date that seems plausible, and a summary of the holding that perfectly supports your argument. Everything about it looks right. But the case does not exist. The citation leads nowhere. The holding was never made by any court.
This is not a bug that OpenAI will fix in the next update. It is a fundamental feature of how statistical text generation works. The model has no concept of truth, no access to a verified legal database, and no way to distinguish between a real case and one it has invented. It cannot tell you which cases are real because it does not know the difference.
Real Consequences — Court Sanctions for AI-Generated Citations
The risks of relying on ChatGPT for legal citations are not theoretical. They have played out in courtrooms, with real consequences for real lawyers and their clients.
Mata v. Avianca, Inc. (2023, S.D.N.Y.). This is the case that brought AI hallucination in legal research to international attention. Attorney Steven Schwartz used ChatGPT to research a personal injury claim against the airline Avianca. He submitted a brief to the Southern District of New York in which six of the cited cases did not exist. When opposing counsel could not locate the cases, and the court directed Mr Schwartz to provide copies, he went back to ChatGPT and asked it to confirm the cases were real. ChatGPT assured him they were, and he submitted that assurance to the court as well.
Judge P. Kevin Castel was not impressed. In his sanctions order, he noted that the attorneys had "abandoned their responsibilities" by relying on a tool they did not understand and failing to verify its output. Mr Schwartz and his colleague Peter LoDuca were fined $5,000 and required to send letters notifying each judge who had been falsely identified as the author of one of the fabricated opinions. The reputational damage was incalculable: the case was covered by major news outlets worldwide.
"Technological advance requires that lawyers using such tools verify the accuracy of the information they provide to the court. Many have already learned this lesson, but the court addresses those who have not."
— Judge P. Kevin Castel, Mata v. Avianca, Inc., S.D.N.Y. (2023)
Park v. Kim (2024, 2d Cir.). Less than a year later, the United States Court of Appeals for the Second Circuit referred another attorney to its grievance panel after counsel submitted a reply brief citing a non-existent case generated by ChatGPT. The court found that the lawyer had relied on the tool for research and had failed to verify whether the cited authority actually existed. The pattern was identical to Mata: a plausible-sounding case, correct citation format, a completely fabricated holding.
Regulatory warnings. The Solicitors Regulation Authority (SRA) in the United Kingdom has issued formal guidance warning solicitors about their obligations when using AI tools. The SRA has made clear that the duty to verify legal authorities rests with the solicitor, not with any technology, and that submitting unverified AI-generated content to a court or tribunal may constitute a breach of professional obligations. Similar warnings have been issued by bar associations in the United States, Australia, and Hong Kong.
These cases are not anomalies. They are the visible tip of an iceberg. For every lawyer who has been publicly sanctioned, there are likely many more who have submitted AI-generated citations that happened to go unchecked — or who caught the errors only at the last moment before filing. The risk is systemic, and it is growing as more practitioners turn to generic AI tools for research.
Why Generic AI Fails at Legal Research
The hallucination problem is the most dramatic failure mode, but it is far from the only reason generic AI is unsuitable for legal research. The deeper issue is that tools like ChatGPT were never designed for this purpose, and their architecture makes them fundamentally unsuited to the precision that legal work demands.
No verified case database. ChatGPT does not have access to BAILII, AustLII, HKLII, CourtListener, or any other authoritative legal database. It generates citations probabilistically, not by looking them up. This means every citation it produces must be independently verified — which defeats the purpose of using it for research in the first place.
No jurisdiction awareness. Legal terminology varies significantly between jurisdictions. A "motion to dismiss" in the United States is a "strike-out application" in England and Wales, an "application to dismiss" in Hong Kong, and a "summary dismissal application" in parts of Australia. ChatGPT routinely mixes these terms, applying US procedural concepts to UK cases, citing English statutes in response to questions about Australian law, and using terminology from one jurisdiction in the context of another. For a litigant who does not know the difference, this can lead to filing the wrong application in the wrong court using the wrong procedure.
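A jurisdiction-aware system treats this as a lookup, never a guess. Here is a toy sketch of the idea, using only the examples above (a real mapping would cover far more concepts and courts):

```python
# The same procedural concept goes by a different name in each
# jurisdiction, so a tool must key its terminology on the jurisdiction
# rather than guess. These entries repeat the examples above only.
DISMISSAL_TERMS = {
    "United States": "motion to dismiss",
    "England & Wales": "strike-out application",
    "Hong Kong": "application to dismiss",
    "Australia (some courts)": "summary dismissal application",
}

def dismissal_term(jurisdiction: str) -> str:
    # Refuse to guess: an unknown jurisdiction is an error, not a default.
    if jurisdiction not in DISMISSAL_TERMS:
        raise KeyError(f"no verified terminology for {jurisdiction!r}")
    return DISMISSAL_TERMS[jurisdiction]
```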
No procedural knowledge. Legal practice is not just about knowing the law. It is about knowing which court to file in, which form to use, what fee to pay, what service requirements apply, and what the relevant time limits are. ChatGPT cannot reliably tell you any of this. Ask it which form you need to file a defence in the County Court in England, and it may give you a form number that does not exist, or one that was replaced years ago, or one from an entirely different jurisdiction.
No limitation period calculation. Getting a limitation period wrong can be catastrophic — if you miss it, your claim is dead. Limitation periods vary by jurisdiction, by cause of action, and by circumstance. They can be extended, suspended, or shortened by specific statutory provisions. ChatGPT will give you a number — often "six years" for contract claims in England, which is correct in many cases — but it cannot account for the exceptions, extensions, or jurisdictional variations that make limitation analysis a genuinely complex legal exercise.
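To see why a bare number is not an answer, here is a minimal sketch of the naive calculation, which is roughly all a generic chatbot offers. The six-year figure for simple contract claims in England and Wales comes from section 5 of the Limitation Act 1980; the comments flag just a few of the statutory exceptions the arithmetic cannot capture:

```python
from datetime import date

def naive_contract_deadline(breach_date: date) -> date:
    """England & Wales, simple contract: six years from the date the
    cause of action accrues, normally the date of breach (Limitation
    Act 1980, s. 5). This is the easy part, and the only part a
    generic chatbot reliably performs."""
    try:
        return breach_date.replace(year=breach_date.year + 6)
    except ValueError:  # breach fell on 29 February
        return breach_date.replace(year=breach_date.year + 6, day=28)

# What this arithmetic cannot capture (non-exhaustive):
#  - fraud, concealment or mistake postpones the start date (s. 32)
#  - a claimant under a disability gets an extension (s. 28)
#  - a contract executed as a deed gets twelve years, not six (s. 8)
#  - acknowledgment or part payment can restart the clock (s. 29)
#  - other jurisdictions use different periods and trigger events
print(naive_contract_deadline(date(2020, 3, 15)))  # 2026-03-15, maybe.
```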
No costs assessment. In many common law jurisdictions, the question of legal costs is as important as the merits of the case. If you win but are awarded costs on the standard basis rather than the indemnity basis, you may recover only 60-70% of what you spent. ChatGPT has no framework for assessing costs implications, proportionality, or the costs budgeting regimes that apply in many courts.
If you want to understand how different legal research tools stack up against each other on these dimensions, our comparison page breaks down the key differences between generic AI, traditional legal databases, and purpose-built legal AI systems.
What Verified Legal AI Looks Like
The failures of generic AI do not mean that AI is useless for legal research. They mean that the wrong kind of AI is useless — and dangerous — for legal research. Purpose-built legal AI tools address the specific problems that make ChatGPT unreliable.
Citation verification against real databases. A properly designed legal AI system does not generate citations probabilistically. It checks every case reference against authoritative databases — BAILII for England and Wales, AustLII for Australia, HKLII for Hong Kong, CourtListener for the United States, and equivalent resources for other jurisdictions. If a case cannot be verified, the system flags it or excludes it. This is the single most important difference between generic AI and purpose-built legal AI.
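The control flow behind this principle is simple enough to show in miniature. The sketch below is illustrative only; the database clients and their find_by_citation method are assumptions standing in for whatever interface each source actually exposes, not a description of any real system:

```python
from dataclasses import dataclass

@dataclass
class Citation:
    case_name: str
    reference: str     # e.g. a neutral citation: "[2023] EWCA Civ 482"
    jurisdiction: str  # e.g. "England & Wales"

def verify_citations(candidates, databases):
    """Partition candidate citations into verified and flagged.

    `databases` maps a jurisdiction to a client for its authoritative
    source (BAILII, AustLII, HKLII, CourtListener, ...). The client
    interface is hypothetical: we assume a `find_by_citation` lookup
    returning a record with a `case_name`, or None if nothing matches.
    """
    verified, flagged = [], []
    for c in candidates:
        db = databases.get(c.jurisdiction)
        record = db.find_by_citation(c.reference) if db else None
        if record is not None and record.case_name == c.case_name:
            verified.append(c)   # the case exists under that name
        else:
            flagged.append(c)    # unverifiable: exclude it or warn loudly
    return verified, flagged
```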
Jurisdiction-specific analysis. A good legal AI tool knows that the terminology, procedures, and court structures differ between jurisdictions. It uses the correct terms for the correct jurisdiction, applies the right procedural rules, and does not mix up US and UK concepts. When you ask about filing a defence in England, it tells you about the County Court or the High Court, not about federal district courts.
Procedural pathways. Beyond knowing the law, purpose-built legal AI can guide you through the practical steps of litigation: which court, which form, which fee, which deadline, what service requirements. This is the information that self-represented litigants most often get wrong, and it is the information that generic AI is least equipped to provide reliably.
Case strength scoring. Rather than simply telling you what the law says, verified legal AI tools can assess how the law applies to your facts and provide a structured evaluation of your case's strengths and weaknesses. This analysis is grounded in actual legal principles and real case law — not in a statistical model's best guess at what sounds persuasive.
CommonBench is built on these principles. Every citation is checked against real legal databases across five common law jurisdictions. The system provides jurisdiction-specific analysis with correct terminology, procedural guidance with specific courts and forms, and case strength assessments based on verified authorities. It is designed to give you research you can actually rely on — not research you have to spend hours verifying.
How to Use AI Safely for Legal Research
If you are going to use AI for legal research (and there are good reasons to do so), you need to do it safely. The following principles will help you avoid the traps that have caught out qualified lawyers and that pose an even greater danger to self-represented litigants.
Never submit AI-generated citations without verifying each one. This is the non-negotiable rule. Before you include any case citation in any document that will be filed with a court or sent to an opposing party, you must independently confirm that the case exists, that the citation is correct, and that the holding is accurately stated. Check it on BAILII, AustLII, CourtListener, or whatever free legal database is relevant to your jurisdiction. If you cannot find it, do not use it.
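One way to make this rule mechanical is to extract every citation-like string from a draft before filing, so nothing slips through unchecked. A minimal sketch follows; the pattern covers only a handful of common neutral-citation formats and is a starting point, not a complete extractor:

```python
import re

# Matches a few common neutral-citation formats, e.g. "[2023] EWCA Civ 482"
# or "[2021] HCA 17". Traditional law-report references such as
# "[1932] AC 562" need separate patterns; treat this as a safety net,
# not a guarantee of completeness.
NEUTRAL_CITATION = re.compile(
    r"\[\d{4}\]\s+"                                           # bracketed year
    r"(?:UKSC|UKPC|UKHL|EWCA|EWHC|HCA|FCA|HKCFA|HKCA|HKCFI)"  # court code
    r"(?:\s+(?:Civ|Crim))?"                                   # EWCA division
    r"\s+\d+"                                                 # case number
)

def citations_to_check(draft_text: str) -> list[str]:
    """Every citation-like string in a draft, deduplicated and sorted,
    ready to be looked up one by one on BAILII, AustLII, HKLII or
    CourtListener before the document goes anywhere near a court."""
    return sorted(set(NEUTRAL_CITATION.findall(draft_text)))

print(citations_to_check(
    "See Smith v. Jones [2019] EWCA Civ 482 and Doe v. Roe [2021] UKSC 17."
))  # ['[2019] EWCA Civ 482', '[2021] UKSC 17']
```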
Use AI for understanding legal concepts, not as a citation source. Generic AI tools like ChatGPT can be genuinely useful for explaining legal concepts in plain language. If you want to understand what "specific performance" means, how the test for an interim injunction works, or what the elements of negligence are, ChatGPT can give you a solid explanation. But do not take the cases it cites at face value. Use the explanation to understand the concept, then find real cases yourself.
Always have a qualified lawyer review any document before filing. If the stakes of your case justify it — and in most cases beyond the small claims track, they do — have a lawyer review any document before you file it. Many lawyers offer document review services at a fraction of the cost of full representation. This is especially important for written advocacy (skeleton arguments, briefs, or submissions) that will cite legal authorities.
Consider purpose-built legal AI tools with verification systems. If you are going to use AI for legal research, use one that was designed for legal research. Purpose-built tools with citation verification, jurisdiction-specific analysis, and procedural guidance are fundamentally different from generic chatbots. The difference is not marginal — it is the difference between a tool that checks its work and one that cannot.
Document your research process. If you are a lawyer using AI tools, keep a record of what you used, how you verified the output, and what additional research you conducted. Courts are increasingly asking about AI use in legal submissions, and being able to demonstrate a rigorous verification process will protect you if questions arise.
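A research log can be as simple as one structured entry per authority you rely on. The fields below are an assumption about what a court or regulator might reasonably ask to see, not a prescribed format:

```python
import json
from datetime import datetime, timezone

# One illustrative entry per authority relied on. Field names and the
# example citation are hypothetical, not a prescribed standard.
log_entry = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "tool_used": "generic AI chatbot (record model and version)",
    "query_summary": "cases on implied terms in commercial leases",
    "citation": "[2019] EWCA Civ 482",  # hypothetical example citation
    "verified_against": "BAILII",
    "verification_result": "case found; holding checked against judgment",
    "additional_research": "noted up for later judicial treatment",
}
print(json.dumps(log_entry, indent=2))
```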
The Bottom Line
Generic AI is a useful thinking tool but a dangerous citation source. ChatGPT can help you brainstorm arguments, understand legal concepts, and organise your thoughts about a legal problem. It cannot reliably find real cases, cite real authorities, or guide you through real procedures. The gap between what it appears to do and what it actually does is where lawyers get sanctioned and litigants lose cases.
The stakes in legal matters are too high for unverified AI output. A fabricated citation does not just weaken your argument — it destroys your credibility with the court, potentially for the remainder of your case. Judges remember the lawyer or litigant who cited cases that did not exist. That memory does not fade when the next application comes before them.
Verified legal AI tools exist and should be preferred over generic chatbots for any serious legal research. The technology to do legal AI properly — with citation verification, jurisdiction-specific analysis, and procedural guidance — is available now. There is no reason to take the risk of using a tool that was never designed for legal work when purpose-built alternatives exist.
The legal system depends on trust. Trust between lawyers and judges, between litigants and the court, and between the public and the rule of law. Every fabricated citation submitted to a court erodes that trust. If you are going to use AI for legal research, use it responsibly — and use the right tool for the job.
This article is published by CommonBench. We build AI-powered legal research tools with verified citations across five common law jurisdictions. If you need legal research you can trust, try CommonBench. Every citation is checked against real legal databases, so you never have to wonder whether the case you are relying on actually exists.
Want verified legal analysis?
Try CommonBench — every citation is checked against real legal databases across five jurisdictions. Your first query is free.
Try CommonBench AI Chat →