Somewhere in a law firm or legal department right now, a lawyer is about to act on an AI output they cannot verify, cannot trace, and could not reproduce if asked to. Two colleagues may have asked the system the same compliance question and received different answers — without either of them knowing.
This is not a hypothetical. It is the default state of most AI deployments in legal today.
This article explains why black-box AI creates unacceptable risk for legal, compliance, and risk teams, and what a governed alternative looks like in practice.
Black-box AI refers to AI systems — typically large language model (LLM) assistants like general-purpose chatbots — that generate outputs probabilistically. They are trained on vast quantities of text and produce responses that are statistically likely given the input, but not guaranteed to be consistent, traceable, or correct.
For many use cases, this is acceptable. For legal work, it is not.
Legal teams operate in an environment where the same set of facts must produce the same answer, regardless of who asks the question, when they ask it, or how they phrase it. A compliance workflow that routes a high-risk transaction to senior review cannot function if the routing logic changes based on subtle variations in the query. An employment policy tool that advises employees on their rights cannot be deployed if its outputs are unpredictable.
The core issue is not that general-purpose AI is unintelligent. It is that it is ungoverned.
There are three specific failure modes that ungoverned AI introduces into legal workflows.
Hallucination. Large language models can produce confident, well-structured, completely incorrect answers. In a legal context, a hallucinated case citation, a fabricated regulatory threshold, or an incorrect statement about jurisdiction-specific requirements can expose an organisation to significant liability — and the error may not be caught until damage is already done.
Inconsistency. Because LLMs generate responses probabilistically, the same question asked at different times, by different users, or with slightly different phrasing can yield materially different answers. This is incompatible with any workflow that requires uniformity — equal treatment in HR decisions, consistent application of contractual standards, standardised risk scoring across a portfolio of matters.
Non-auditability. When a decision is challenged — by a regulator, a counterparty, or an employee — a legal team must be able to explain how the decision was reached. “The AI said so” is not an answer. Black-box systems cannot produce an audit trail that shows which rules were applied, what logic was followed, and what the system knew at the time of the decision.
The pressure to adopt AI is genuine, and the tools available have improved rapidly. But the legal industry is at a critical inflection point: AI adoption is accelerating faster than AI governance frameworks are being established.
Many legal teams have deployed general-purpose AI tools — or are under executive pressure to do so — without having answered fundamental questions: What happens when the AI is wrong? Who is accountable for an AI-generated decision that turns out to be incorrect? How do we demonstrate compliance to a regulator who asks us to show our work?
These are not hypothetical concerns. Regulators across multiple jurisdictions are moving from guidance to enforcement on AI explainability and auditability.
Legal teams that have built workflows on black-box AI will face a significant remediation challenge as these requirements continue to mature and enforcement intensifies.
Governed AI — sometimes called deterministic AI — works differently from general-purpose LLM assistants. Rather than generating answers probabilistically, it encodes the expertise of legal and compliance professionals into rule-based systems that apply the same logic consistently, every time, to every user.
The key characteristics of a governed AI system for legal work are:
Predictable outputs. The same set of facts produces the same output, every time. A transaction that triggers a reporting obligation will always trigger a reporting obligation, regardless of who submits it or when.
Traceable logic. Every decision can be traced back to the rules that produced it. The system can show exactly which logic was applied, in what sequence, and on what basis — producing a complete audit trail.
Human expertise at the centre. The rules encoded in the system are the expert judgment of the lawyers and compliance professionals who built it. The AI does not replace that expertise; it scales it. One senior lawyer’s knowledge of a complex regulatory framework can be made available, consistently and accurately, to thousands of users across an organisation.
Generative AI as a tool, not an oracle. Governed systems can incorporate generative AI for specific tasks where it adds value — extracting data from unstructured documents, drafting initial contract language, summarising lengthy materials. But those outputs feed into deterministic workflows, where rules govern what happens next. The AI assists; it does not decide.
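To make these characteristics concrete, here is a minimal sketch, in Python, of what predictable outputs and traceable logic can look like in practice. The rule name, threshold, and field names are invented for illustration; they are not Neota Logic's actual rule model, and they are not legal advice.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Decision:
    outcome: str
    rules_applied: list = field(default_factory=list)  # ordered audit trail

def assess_transaction(facts: dict) -> Decision:
    """Apply explicit rules in a fixed order: identical facts always produce
    the identical outcome, and every rule evaluated is recorded."""
    trail = []
    # Illustrative rule R1: transactions at or above an assumed EUR 10,000
    # reporting threshold are routed to senior review.
    threshold_met = facts["value_eur"] >= 10_000
    trail.append({
        "rule": "R1-reporting-threshold",
        "basis": "value_eur >= 10000" if threshold_met else "value_eur < 10000",
        "evaluated_at": datetime.now(timezone.utc).isoformat(),
    })
    outcome = "route_to_senior_review" if threshold_met else "standard_processing"
    return Decision(outcome=outcome, rules_applied=trail)

# The same facts yield the same outcome, whoever submits them and whenever:
assert assess_transaction({"value_eur": 12_500}).outcome == "route_to_senior_review"
```

In a governed workflow, a generative model might populate the `facts` dictionary from an unstructured document, but the decision itself is taken by explicit rules like these, and the trail shows exactly which logic was applied and on what basis.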
Consider a large organisation managing business traveller compliance across multiple jurisdictions. Each traveller’s trip may trigger different tax, immigration, and employment obligations depending on their home country, destination, duration, and the nature of their activities.
A general-purpose AI assistant can describe the general framework for business traveller compliance. It cannot reliably apply that framework to a specific traveller’s specific itinerary and produce a governed, auditable recommendation.
A deterministic system, built on the expertise of employment and immigration lawyers who know exactly which rules apply in which circumstances, can. It asks the right questions, applies the right logic, routes edge cases to the right human reviewer, and produces a complete record of every decision.
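A heavily simplified sketch of that traveller workflow, again in Python, might look like the following. The jurisdictions, day threshold, and routing rules are assumptions made up for the example, not real legal requirements.

```python
def assess_trip(home: str, destination: str, days: int, activities: set) -> dict:
    """Evaluate one trip against explicit rules and return a complete,
    reproducible record of the decision."""
    record = {
        "inputs": {"home": home, "destination": destination,
                   "days": days, "activities": sorted(activities)},
        "obligations": [],
        "route": "auto_clear",
    }
    # Illustrative rule: stays beyond an assumed 90-day threshold trigger
    # a tax-presence review.
    if days > 90:
        record["obligations"].append("tax_presence_review")
    # Illustrative rule: productive work abroad (not meetings alone)
    # triggers a work-authorisation check.
    if home != destination and "productive_work" in activities:
        record["obligations"].append("work_authorisation_check")
    # Edge cases are routed to a named human reviewer rather than guessed at.
    if record["obligations"]:
        record["route"] = "human_reviewer"
    return record

print(assess_trip("UK", "DE", days=120, activities={"productive_work"}))
```

Two travellers with the same itinerary always receive the same obligations and the same routing, and the record explains why.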
This is the distinction that matters for legal teams: not whether AI is involved, but whether the AI is governed.
Can we not just add guardrails to a general-purpose AI tool?
Guardrails can reduce the frequency of hallucinations and inappropriate outputs, but they cannot eliminate inconsistency or produce genuine auditability. A guardrail tells the AI what not to say; it does not replace the underlying probabilistic reasoning with deterministic logic. For legal teams that need to demonstrate to a regulator or auditor exactly how a decision was reached, guardrails are not sufficient.
Is governed AI less capable than general-purpose AI?
No — it is differently capable. Governed AI does not attempt to answer every question about everything. It applies expert legal and compliance logic to the specific workflows it has been built to handle, and it does so with a level of consistency and auditability that general-purpose AI cannot match. For the specific use cases where accuracy and accountability matter most, governed AI is the more capable tool.
How long does it take to build a governed AI workflow?
With a modern no-code automation platform, legal and compliance professionals can build governed workflows without engineering support. A prototype can be live in days; a production-grade solution typically takes weeks, not months. The expert knowledge that powers the system comes from the legal team, not from a vendor — which means the system reflects the organisation’s actual standards and risk appetite.
What happens when the rules change?
Because governed systems encode explicit logic rather than trained weights, updating the rules is straightforward. When a regulation changes, the relevant rule is updated and the system immediately applies the new logic. There is no need to retrain a model or wait for a new version — the system reflects current law as soon as it is updated.
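As a rough illustration, a rule in such a system is explicit, versioned data rather than a trained parameter. The threshold and dates below are invented for the example.

```python
# The reporting threshold lives in explicit, versioned configuration,
# not in model weights.
RULES = {"reporting_threshold_eur": 10_000, "effective_from": "2024-01-01"}

def must_report(value_eur: float) -> bool:
    return value_eur >= RULES["reporting_threshold_eur"]

# When the regulation changes, the rule is edited and every subsequent
# decision applies the new logic immediately; nothing is retrained.
RULES.update({"reporting_threshold_eur": 7_500, "effective_from": "2025-07-01"})
assert must_report(8_000)
```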
Legal teams are right to take AI seriously. The efficiency gains are real, the competitive pressure is genuine, and the tools available today are genuinely powerful. But the question is not whether to use AI — it is which kind of AI, and under what governance framework.
For work where consistency matters, where auditability is required, and where the cost of a wrong answer is measured in regulatory exposure or reputational damage, black-box AI is not an appropriate tool. What legal teams need is AI that is governed: rule-based, traceable, grounded in human expertise, and designed to produce the same defensible outcome every time.
Neota Logic works with more than 70 law firms and legal departments, has powered more than 8 million sessions, and has driven more than $100 million in value by helping legal teams build exactly this kind of governed workflow infrastructure. The goal is not to replace the judgment of legal professionals — it is to make that judgment available, consistently and at scale, to everyone who needs it.