Architecture & Transparency
A direct description of how CommonBench is built. Not a marketing page. If something here is unclear or you want a deeper view of a specific component, email cases@commonbench.ai.
What the system is
CommonBench is a server-rendered chat application that runs your legal question through a structured prompt pipeline against a frontier large language model, augmented with a curated database of authorities across five common law jurisdictions. The output is post-processed for citation verification, structural quality, and disclosure compliance before it reaches you.
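For orientation, here is a minimal sketch of that flow in Python. Every name is hypothetical and each stage is a stub; the real components are described section by section below.

```python
# Hypothetical orchestration sketch of the pipeline described above.
# The stage functions are stubs standing in for the real components.

def classify_complexity(message: str) -> str:
    return "L1"  # stub: see "The retrieval"

def retrieve_authorities(message: str, tier: str) -> list[str]:
    return []  # stub: FTS5 lookup plus re-ranking, see "The retrieval"

def call_model(system_prompt: str, message: str, tier: str) -> str:
    return ""  # stub: frontier model call, see "The model"

def verify_citations(draft: str) -> str:
    return draft  # stub: post-processing pass, see "Citation verification"

def answer(message: str) -> str:
    tier = classify_complexity(message)
    authorities = retrieve_authorities(message, tier)
    system_prompt = "Authorities:\n" + "\n".join(authorities)
    draft = call_model(system_prompt, message, tier)
    return verify_citations(draft)
```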
The corpus
- ~2,000 hand-curated and scraper-augmented case authorities across UK, US, HK, SG, AU.
- Each row carries: id, name, citation, year, court, jurisdiction, topic, principle, an optional ratio, summary_one_line, key_paragraphs, doctrinal_status, leading_case, cited_by_count, last_verified (see the schema sketch after this list).
- SQLite-backed with FTS5 full-text search; doctrinal columns are populated by a separate tagging pipeline that uses the model to enrich each row with structured metadata.
- The "Corpus as of" line beneath every chat response is the maximum last_verified across the table, a real signal of when the corpus was last refreshed.
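To make the schema concrete, here is a minimal sketch in Python's sqlite3 of a table with these columns, an FTS5 index over the searchable ones, and the one-line query behind the "Corpus as of" stamp. The table name, index name, and column types are assumptions; only the column list comes from the bullet above.

```python
import sqlite3

con = sqlite3.connect("corpus.db")  # hypothetical filename
con.executescript("""
CREATE TABLE IF NOT EXISTS authorities (
    id               INTEGER PRIMARY KEY,
    name             TEXT NOT NULL,
    citation         TEXT NOT NULL,
    year             INTEGER,
    court            TEXT,
    jurisdiction     TEXT,      -- UK / US / HK / SG / AU
    topic            TEXT,
    principle        TEXT,
    ratio            TEXT,      -- optional
    summary_one_line TEXT,
    key_paragraphs   TEXT,
    doctrinal_status TEXT,      -- populated by the tagging pipeline
    leading_case     INTEGER,   -- boolean flag
    cited_by_count   INTEGER,
    last_verified    TEXT       -- ISO date
);

-- External-content FTS5 index over the searchable columns
-- (needs an SQLite build with FTS5, the default in most Python distributions).
CREATE VIRTUAL TABLE IF NOT EXISTS authorities_fts USING fts5(
    name, citation, topic, principle, summary_one_line,
    content='authorities', content_rowid='id'
);
""")

# The "Corpus as of" stamp: the most recent last_verified in the table.
(corpus_as_of,) = con.execute(
    "SELECT MAX(last_verified) FROM authorities"
).fetchone()
```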
The retrieval
- Each user message is classified into a complexity level (L1 / L2 / L3) by a server-side heuristic that looks at length, multi-issue indicators, and document attachments.
- L1 queries get a focused doctrinal lookup; L2 / L3 queries trigger a richer retrieval pass that pulls authorities from the FTS index and re-ranks for jurisdiction match and doctrinal weight.
- The selected authorities are inlined into the model's system prompt as structured context. The model is instructed never to invent citations. A sketch of the classifier and the re-ranking pass follows this list.
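A sketch of both stages under stated assumptions: the tier thresholds, the multi-issue markers, and the weighting formula are invented for illustration; only the shape (heuristic tiering, FTS lookup, re-rank on jurisdiction and doctrinal weight) comes from the bullets above.

```python
import sqlite3

MULTI_ISSUE_MARKERS = ("in addition", "separately", "and whether", "also")  # assumed signals

def classify_complexity(message: str, has_attachment: bool) -> str:
    """Heuristic tiering; the real thresholds are not published."""
    if has_attachment or len(message) > 1500:
        return "L3"
    hits = sum(m in message.lower() for m in MULTI_ISSUE_MARKERS)
    return "L2" if hits or len(message) > 400 else "L1"

def retrieve(con: sqlite3.Connection, fts_query: str, jurisdiction: str, tier: str):
    """FTS5 lookup, then re-rank for jurisdiction match and doctrinal weight."""
    con.row_factory = sqlite3.Row
    limit = 5 if tier == "L1" else 25
    rows = con.execute(
        """SELECT a.* FROM authorities_fts
           JOIN authorities a ON a.id = authorities_fts.rowid
           WHERE authorities_fts MATCH ?
           ORDER BY bm25(authorities_fts)
           LIMIT ?""",
        (fts_query, limit),  # real code would sanitise user text for FTS syntax
    ).fetchall()

    def weight(r):  # invented weighting: leading cases and citation counts float up
        return 2.0 * (r["leading_case"] or 0) + 0.01 * (r["cited_by_count"] or 0)

    return sorted(rows, key=lambda r: (r["jurisdiction"] == jurisdiction, weight(r)),
                  reverse=True)[:8]
```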
The model
- Anthropic Claude is the underlying frontier model (model id is visible in the per-response "Source" panel inside the chat).
- Prompts are tier- and jurisdiction-specific; the static portion is cached at the provider layer to reduce cost on repeat queries.
- L3 queries get an extended thinking budget so the model can reason through multi-issue analysis before composing the response (a request sketch follows).
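The last two bullets map onto documented Anthropic Messages API features: prompt caching via cache_control on system blocks, and extended thinking via the thinking parameter. A minimal sketch, with an illustrative model id, prompt split, and token budgets:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def call_model(static_prompt: str, authorities_block: str, message: str, tier: str):
    # The static, tier- and jurisdiction-specific instructions are marked
    # cacheable so the provider can reuse them across repeat requests.
    system = [
        {"type": "text", "text": static_prompt,
         "cache_control": {"type": "ephemeral"}},
        {"type": "text", "text": authorities_block},  # per-query, not cached
    ]
    kwargs = {}
    if tier == "L3":
        # Extended thinking: a reasoning budget spent before the answer is
        # composed. The budget figure here is illustrative.
        kwargs["thinking"] = {"type": "enabled", "budget_tokens": 8000}
    return client.messages.create(
        model="claude-sonnet-4-20250514",  # illustrative; the live id is in the "Source" panel
        max_tokens=16000,
        system=system,
        messages=[{"role": "user", "content": message}],
        **kwargs,
    )
```

In production the call would stream (client.messages.stream) so tokens reach the page as they are generated; .create is shown here for brevity.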
Citation verification
- After the model finishes streaming, every recognised case citation in the response is matched against our database and against external sources (BAILII, CourtListener, AustLII, HKLII, vLex where applicable).
- Citations that match cleanly get a green tick. Citations that do not match are flagged as unverified. Citations that pattern-match a fabricated form are stripped before the response is committed to the page.
- The verification result is streamed back as a separate frame and the badges are applied in place (a sketch of the matching pass follows).
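A sketch of that matching pass under stated assumptions: the patterns, verdict names, and fabricated-form check are invented for illustration; only the three outcomes (verified tick, unverified flag, strip before commit) come from the bullets above.

```python
import re
from datetime import date

# Rough patterns for common citation forms across the five jurisdictions.
# Real coverage needs many more patterns; these two are illustrative only.
CITATION_PATTERNS = [
    re.compile(r"\[\d{4}\]\s+(?:UKSC|UKHL|EWCA\s+(?:Civ|Crim)|EWHC|HKCFA|SGCA|SGHC|HCA)\s+\d+"),
    re.compile(r"\d+\s+U\.S\.\s+\d+"),
]

def looks_fabricated(cite: str) -> bool:
    # Invented check for illustration: a neutral citation dated in the
    # future cannot be real.
    year = re.search(r"\[(\d{4})\]", cite)
    return bool(year) and int(year.group(1)) > date.today().year

def verify_citations(response_text: str, known_citations: set[str]) -> list[dict]:
    frames = []
    for pattern in CITATION_PATTERNS:
        for match in pattern.finditer(response_text):
            cite = re.sub(r"\s+", " ", match.group(0))
            if cite in known_citations:
                verdict = "verified"    # green tick
            elif looks_fabricated(cite):
                verdict = "stripped"    # removed before the response is committed
            else:
                verdict = "unverified"  # flagged; external sources checked next
            frames.append({"citation": cite, "verdict": verdict})
    return frames  # sent back as a separate frame; badges applied in place
```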
What we store
- Saved chats: stored in your browser's localStorage, not on our servers. Clearing your browser's site data erases them.
- Account and subscription state (email, Stripe IDs, usage counters): stored server-side in encrypted-at-rest data files.
- Anonymised telemetry: latency, error rates, feedback submissions, error reports. No prompt content in telemetry.
- Document uploads for review (Advocate / Chambers): processed in memory, not persisted after extraction.
What we don't do
- We don't train a model on your conversations.
- We don't sell or share your queries.
- We don't claim to provide legal advice, and we don't pretend that machine output replaces a qualified practitioner.
- We don't quietly disable the disclosure that the output is machine-generated.
Open questions
The corpus covers each jurisdiction only partially. The retrieval pass is good but not perfect. The model is fast but can still generate analysis that misses a recent first-instance decision. The error-report button on every response feeds a moderation queue that we read. Tell us when we're wrong.
For questions about this page, email cases@commonbench.ai. For data protection and privacy, see the Privacy Policy. For commercial terms, see the Terms of Service.