Transparency
Architecture &
transparency.
A direct description of how CommonBench is built. If something here is unclear, or you want a deeper view of a specific component, email cases@commonbench.ai.
Litigators trust tools they can see inside. This page exists so that you can decide whether the system is something you want in your workflow before you subscribe — and so you know what to push back on if it ever gets something wrong.
00 · System
What the system is
CommonBench is a server-rendered chat application that runs your legal question through a structured prompt pipeline against a frontier large language model, augmented with a curated database of authorities across five common law jurisdictions. The output is post-processed for citation verification, structural quality, and disclosure compliance before it reaches you.
01 · Corpus
The corpus
4,600+ hand-curated and scraper-augmented case authorities across the UK, US, Australia, Hong Kong, and Singapore.
4,600+
Verified authorities
2,200+
United Kingdom
1,100+
United States
450+
Australia
440+
Hong Kong
320+
Singapore
- Each row carries:
id,name,citation,year,court,jurisdiction,topic,principle, optionalratio,summary_one_line,key_paragraphs,doctrinal_status,leading_case,cited_by_count,last_verified. - SQLite-backed with FTS5 full-text search; doctrinal columns populated by a separate tagging pipeline that uses the model to enrich each row with structured metadata.
- The "Corpus as of" line beneath every chat response is the maximum
last_verifiedacross the table — a real signal of when the corpus was last refreshed.
02 · Retrieval
The retrieval
- Each user message is classified into a complexity level (L1 / L2 / L3) by a server-side heuristic that looks at length, multi-issue indicators, and document attachment.
- L1 queries get a focused doctrinal lookup; L2 / L3 queries trigger a richer retrieval pass that pulls authorities from the FTS index and re-ranks for jurisdiction match and doctrinal weight.
- The selected authorities are inlined into the model's system prompt as structured context. The model is instructed never to invent citations.
03 · Model
The model
- Anthropic Claude is the underlying frontier model (the model id is visible in the per-response "Source" panel inside the chat).
- Prompts are tier- and jurisdiction-specific; the static portion is cached at the provider layer to reduce cost on repeat queries.
- L3 queries get extended thinking budget so the model can reason through multi-issue analysis before composing the response.
04 · Verification
Citation verification
- After the model finishes streaming, every recognised case citation in the response is matched against our database and against external sources (BAILII, CourtListener, AustLII, HKLII, vLex where applicable).
- Citations that match cleanly get a green tick. Citations that do not match are flagged as unverified. Citations that pattern-match a fabricated form are stripped before the response is committed to the page.
- The verification result is streamed back as a separate frame and the badges are applied in-place.
05 · Storage
What we store
- Saved chats: stored in your browser's localStorage, not on our servers. Clearing your browser cache erases them.
- Account and subscription state (email, Stripe IDs, usage counters): stored server-side in encrypted-at-rest data files.
- Anonymised telemetry: latency, error rates, feedback submissions, error reports. No prompt content in telemetry.
- Document uploads for review (Advocate / Chambers): processed in memory, not persisted after extraction.
06 · Boundaries
What we don't do
- We don't train a model on your conversations.
- We don't sell or share your queries.
- We don't claim to provide legal advice, and we don't pretend that machine output replaces a qualified practitioner.
- We don't quietly disable the disclosure that the output is machine-generated.
07 · Open questions
Where the edges are
The corpus is a partial coverage of each jurisdiction. The retrieval pass is good but not perfect. The model is fast but can still generate analysis that misses a recent first-instance decision. The error-report button on every response feeds a moderation queue that we read. Tell us when we're wrong.
For the honest scope of what the system can and can't do, read the limitations page.
For questions about this page, email cases@commonbench.ai. For data protection and privacy, see the Privacy Policy. For commercial terms, see the Terms of Service.