
The problem

You have thousands of documents—contracts, depositions, evidence. You need to find the relevant passages, but keyword search fails when documents use different terminology.

The solution

A vault is a container that understands documents. When you upload files, we automatically:
  1. Extract text (OCR for scans, transcription for audio)
  2. Split content into searchable chunks
  3. Create meaning vectors for semantic search
Then you can ask questions in plain English and get relevant results—even if the documents use different terminology.
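
To make step 2 concrete, here is a minimal sketch of paragraph-based chunking. It is an illustration only: `splitIntoChunks`, the `Chunk` shape, and the `maxChars` limit are hypothetical names chosen for this example, and the vault's actual chunking strategy is not documented here.

```typescript
// Hypothetical helper for illustration; not part of the Casedev SDK.
interface Chunk {
  index: number;
  text: string;
}

// Split on blank lines, then pack consecutive paragraphs together until
// adding the next one would exceed maxChars. A single paragraph longer
// than maxChars is kept whole rather than cut mid-sentence.
function splitIntoChunks(text: string, maxChars = 500): Chunk[] {
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const packed: Chunk[] = [];
  let current = '';
  for (const paragraph of paragraphs) {
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (candidate.length > maxChars && current) {
      // The current chunk is full: emit it and start a new one.
      packed.push({ index: packed.length, text: current });
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current) packed.push({ index: packed.length, text: current });
  return packed;
}

const sampleText =
  'Section 1. Term.\n\nSection 2. Termination.\n\nSection 3. Notices.';
const sampleChunks = splitIntoChunks(sampleText, 40);
console.log(sampleChunks.map((c) => c.text));
```

With a small limit each paragraph becomes its own chunk. With the default limit the three short paragraphs are packed into one.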

How RAG works

Traditional search matches keywords. If you search “timeline” and the document says “schedule,” you get nothing. RAG (Retrieval-Augmented Generation) works differently:
┌─────────────────────────────────────────────────────────────────┐
│  1. UPLOAD                                                       │
│     Your user uploads a document                                 │
│                           ↓                                      │
│  2. PROCESS                                                      │
│     We extract text (OCR, transcription, etc.)                  │
│     Split it into chunks (paragraphs, sections)                 │
│     Convert each chunk into a "meaning vector"                  │
│                           ↓                                      │
│  3. STORE                                                        │
│     Chunks + vectors go into the vault                          │
│                                                                  │
│  ─────────────────────────────────────────────────────────────  │
│                                                                  │
│  4. SEARCH                                                       │
│     Your user asks: "What about the timeline?"                  │
│     We convert the question to a vector                         │
│     Find chunks with similar meaning (not just keywords)        │
│     Return the relevant passages                                │
└─────────────────────────────────────────────────────────────────┘
The magic is in step 4, where the question is compared with the stored chunks by meaning. “Timeline” and “schedule” have similar meaning vectors, so they match even though the words are different.
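
The matching described above is commonly done with cosine similarity between vectors. The sketch below uses tiny three-dimensional vectors invented for illustration. Real meaning vectors come from an embedding model and have far more dimensions, and the similarity measure the vault actually uses is an assumption here.

```typescript
// Toy vectors invented for this example; not real embeddings.
const toyVectors: Record<string, number[]> = {
  timeline: [0.9, 0.1, 0.2],
  schedule: [0.8, 0.2, 0.3],
  indemnity: [0.1, 0.9, 0.1],
};

// Cosine similarity: dot product divided by the product of the norms.
// Values near 1 mean the vectors point in nearly the same direction.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank every candidate term by similarity to the query, best match first.
function rankBySimilarity(
  query: number[],
  candidates: Record<string, number[]>,
): Array<{ term: string; score: number }> {
  return Object.entries(candidates)
    .map(([term, vector]) => ({ term, score: cosineSimilarity(query, vector) }))
    .sort((x, y) => y.score - x.score);
}

const ranked = rankBySimilarity(toyVectors.timeline, {
  schedule: toyVectors.schedule,
  indemnity: toyVectors.indemnity,
});
console.log(ranked);
```

Here "schedule" ranks above "indemnity" for the query "timeline" because its toy vector points in almost the same direction, which is the effect a keyword match cannot capture.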

What you can upload

File type              What happens
PDF (scanned)          OCR extracts the text
PDF (digital)          Text extracted directly
Images                 OCR extracts any text
Audio/Video            Transcribed with speaker labels
FTR court recordings   Converted to audio, then transcribed
Word, text files       Indexed directly

Quick start

import Casedev from 'casedev';

const client = new Casedev({ apiKey: process.env.CASEDEV_API_KEY });

// 1. Create a vault for your user
const vault = await client.vault.create({
  name: 'Documents - User 12345'
});

// 2. Upload a document
const upload = await client.vault.upload(vault.id, {
  filename: 'contract.pdf',
  contentType: 'application/pdf'
});

// Send the file's bytes to upload.uploadUrl with an HTTP PUT request, then:

// 3. Process it (OCR + vectorization)
await client.vault.ingest(vault.id, upload.objectId);

// 4. Enable your user to search by meaning
const results = await client.vault.search(vault.id, {
  query: userSearchQuery // e.g., "termination clauses"
});

console.log(results.chunks[0].text);
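
The quick start leaves the PUT step as a comment. One way to fill it in is sketched below, using the standard `fetch` API available in Node 18+ and browsers. `buildPutRequest` and `putFile` are hypothetical helpers, not part of the SDK, and sending a `Content-Type` header that matches the `contentType` passed to `client.vault.upload` is an assumption about what the upload URL expects.

```typescript
// Hypothetical helpers for illustration; not part of the Casedev SDK.
interface PutRequest {
  method: 'PUT';
  headers: Record<string, string>;
  body: ArrayBuffer;
}

// Build the options for the PUT request. The Content-Type header is an
// assumption: it mirrors the contentType given when requesting the upload.
function buildPutRequest(fileBytes: ArrayBuffer, contentType: string): PutRequest {
  return {
    method: 'PUT',
    headers: { 'Content-Type': contentType },
    body: fileBytes,
  };
}

// Send the bytes to the upload URL and fail loudly on a non-2xx response.
async function putFile(
  uploadUrl: string,
  fileBytes: ArrayBuffer,
  contentType: string,
): Promise<void> {
  const response = await fetch(uploadUrl, buildPutRequest(fileBytes, contentType));
  if (!response.ok) {
    throw new Error(`Upload failed with HTTP ${response.status}`);
  }
}

// Usage between steps 2 and 3 of the quick start, where pdfBytes is a
// placeholder for the file contents you have already read:
//   await putFile(upload.uploadUrl, pdfBytes, 'application/pdf');
```

Checking `response.ok` before calling ingest avoids processing a document whose upload silently failed.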

Security

Each vault is isolated and encrypted:
  • Encryption at rest — AES-256 via AWS KMS
  • Isolation — Separate storage per vault
  • Zero-knowledge — We cannot read your documents
  • Audit trail — Every access logged
One vault per case is the recommended pattern: it keeps search results focused and data isolated.

Next steps