A routine land dispute in Andhra Pradesh uncovered a stark new risk: a judge relied on four legal precedents that turned out not to exist. All four were invented by an AI tool — plausible‑sounding case names and reasoning generated out of thin air. The mistake was exposed on appeal and reached India’s Supreme Court, which in late February described a ruling based on fabricated AI citations not merely as an error but as “misconduct.” Notices were issued to the attorney general, solicitor general and the Bar Council of India.
“It is not a question of whether we should integrate AI or not but it is the question of how far the due diligence should be,” said Sindoora VNL, a lawyer for the defendants. “The court indicated this might be a question of misconduct. Now we have to see how far they are willing to take it.”
India’s dilemma mirrors problems seen in other countries as courts and lawyers adopt AI faster than regulation can keep pace. In Colombia, a judge included a ChatGPT transcript in a ruling on medical treatment for an autistic child, saying the tool had “assist[ed], not replace[d]” judicial reasoning. In the United States, two New York lawyers were sanctioned for submitting a brief that cited six cases invented by a chatbot.
Not all episodes are scandals. In March 2023, a judge of the Punjab and Haryana High Court paused a bail hearing in a murder case and openly consulted ChatGPT for a broader view of bail jurisprudence in cases where the alleged assault involved cruelty. He denied bail and disclosed his use of the chatbot, a candid moment that became a cautionary tale once advocates warned the tool can invent facts and mirror the biases in its training data.
“AI cannot replace human conscience in justice delivery,” said Mimansa Ambastha, founder of Starlex Consultants and an adviser on AI and cybersecurity. “The danger is that the balance between assistance and deference can slip. And when it slips in a bail hearing, a person’s liberty is at stake.”
That danger is particularly acute in India because bail decisions carry heavy consequences and delays are endemic. Hundreds of thousands of people are held as undertrial prisoners, accused but not convicted, often spending years behind bars while their cases inch forward. The judiciary is overwhelmed: roughly 55 million cases are pending across the courts, judges routinely handle hundreds of active files, and more than 180,000 matters have been stuck at trial for over 30 years. In one recent example, the Uttar Pradesh High Court acquitted three men who had spent 38 years in prison for a 1982 murder. A 2018 government estimate suggested that, at then-current rates, clearing the backlog would take centuries.
Given that pressure, AI’s promise of speed is attractive. But judges and reformers warn that rushing to adopt such tools can create new work and new risks. Chief Justice Surya Kant observed that AI is paradoxically generating additional verification tasks, as court staff must now check whether AI-generated citations actually exist before proceedings continue.
Beyond hallucinated cases, a deeper worry is that AI can inherit and amplify existing inequalities embedded in legal and policing records. Models trained on decades of judgments, police reports and filings may replicate historical patterns of discrimination and present them as neutral signals. National Crime Records Bureau data shows Muslims and Dalits are overrepresented among undertrial prisoners relative to their share of the population — disparities that, if encoded into predictive systems, could be misread as indicators of individual risk rather than evidence of structural bias.
“AI systems do not create bias out of thin air. They replicate what they are trained on,” Ambastha said. Research on language models used in India has found they can reproduce caste and religion stereotypes present in their training material. Matheus Puppe, a lawyer researching AI and law, warns that when algorithmic outputs are presented with computational authority, judges and lawyers may treat them as objective. “The concern is that AI may reproduce structural distortions embedded in legal systems,” he said. “Once those patterns are translated into algorithms, they gain a veneer of scientific legitimacy.”
Other countries offer both cautionary and constructive examples. Brazil has implemented AI to group similar petitions, spot litigation patterns and automate routine procedural steps — useful where caseloads are massive and predictable. Researchers stress these systems should remain operational supports, not decision‑makers.
In India, innovators and courts are pursuing assistive models rather than outright bans. Sudipto Ghosh, creator of the legal language model InLegalLLaMA, says it is trained on Indian statutes, judgments and procedure to retrieve relevant law, generate summaries and draft basic arguments. India’s Supreme Court e-committee has developed SUPACE, an AI research assistant intended to help judges sift voluminous case law and surface relevant passages; officials insist it does not make recommendations or substitute for judicial judgment.
Even proponents emphasize verification. Ghosh acknowledges models can produce confident but incorrect outputs if users do not check them. “You can reduce time, you can improve access,” he said. “But you cannot outsource judgment.”
The Andhra Pradesh episode and subsequent actions by the Supreme Court have pushed the Indian legal community to confront how AI should be governed in courts: what standards of due diligence to require, how to preserve accountability, and how to prevent algorithmic reproductions of injustice. As judges, lawyers and technologists experiment with tools that promise relief from an overburdened system, the central challenge remains ensuring efficiency does not come at the cost of fairness, transparency or the moral reasoning at the heart of justice.
This article was supported by the Tarbell Center for AI Journalism. Edited by Srinivas Mazumdaru.