OCR Overview

Specialized OCR for the messy reality of legal documents. We handle what generic providers can’t: handwriting, poor scans, fax headers, and complex tables.

Quick example

import Casedev from 'casedev';

const client = new Casedev({ apiKey: process.env.CASEDEV_API_KEY });

// Submit your user's document for processing
const job = await client.ocr.v1.process({
  document_url: uploadedDocumentUrl
});

// Poll for completion
let result = await client.ocr.v1.retrieve(job.id);
while (result.status === 'pending' || result.status === 'processing') {
  await new Promise(r => setTimeout(r, 2000));
  result = await client.ocr.v1.retrieve(job.id);
}

// Return extracted text to your user
const text = await client.ocr.v1.download(job.id, 'text');
console.log(text);
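
The loop above polls every two seconds with no upper bound. A minimal sketch of a bounded variant follows, assuming only the two in-progress statuses shown above (`pending` and `processing`) and treating any other status as terminal. The names `pollUntilDone`, `backoffDelay`, and `isInProgress` are hypothetical helpers, not part of the SDK.

```typescript
// Hypothetical helpers; only 'pending' and 'processing' come from the docs above.
interface JobLike {
  status: string;
}

// True while the job is still in one of the two documented in-progress states.
function isInProgress(status: string): boolean {
  return status === 'pending' || status === 'processing';
}

// Exponential backoff with a cap: 2s, 4s, 8s, ... never more than maxMs.
function backoffDelay(attempt: number, baseMs = 2000, maxMs = 30000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Calls `retrieve` until the job leaves the in-progress states or the timeout passes.
async function pollUntilDone<T extends JobLike>(
  retrieve: () => Promise<T>,
  timeoutMs = 10 * 60 * 1000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (let attempt = 0; ; attempt++) {
    const result = await retrieve();
    if (!isInProgress(result.status)) return result;
    if (Date.now() >= deadline) {
      throw new Error(`OCR job still ${result.status} after ${timeoutMs} ms`);
    }
    await sleep(backoffDelay(attempt));
  }
}
```

With the client from the example, this could be called as `pollUntilDone(() => client.ocr.v1.retrieve(job.id))`. Inspect the returned status before downloading, since a terminal status is not necessarily a successful one.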

Optimized for Legal

Feature	Why it matters for your app
Handwriting Recognition	Extract notes and annotations from uploaded documents
Table Reconstruction	Preserve structure for financial statements and forms
Bates Stamp Handling	Identify and index reference numbers separately
Searchable PDF (hOCR)	Return documents with text layers your users can search

Engine Selection

Choose based on your users’ document types:

Engine	Best for	Speed
`doctr`	Standard documents. High speed, good accuracy for typed text.	Fast
`paddleocr`	Tables and forms. Best-in-class table structure recognition.	Slower
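
As a rough illustration of the table above, the engine choice can be reduced to a small helper. This is a sketch under assumptions: the `DocumentHints` shape and the `pickEngine` name are hypothetical, and how your app detects tables or forms is up to you.

```typescript
// The two engine identifiers listed in the table above.
type OcrEngine = 'doctr' | 'paddleocr';

// Hypothetical hints your app might know about an uploaded document.
interface DocumentHints {
  hasTables?: boolean;
  isForm?: boolean;
}

// Tables and forms go to paddleocr; everything else uses the faster doctr.
function pickEngine(hints: DocumentHints): OcrEngine {
  return hints.hasTables || hints.isForm ? 'paddleocr' : 'doctr';
}
```

The result can be passed as the `engine` option when submitting a job, as in the table extraction pattern further down.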

Output formats

Format	Description
`text`	Plain text extraction
`json`	Structured output with coordinates and confidence scores
`pdf`	Searchable PDF (original with text layer)

Endpoints

Process

POST /ocr/v1/process — Submit a document for OCR

Status

GET /ocr/v1/:id — Check processing status

Download

GET /ocr/v1/:id/download/:type — Download results
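
If you call the endpoints directly rather than through the SDK, the three paths can be built with a small helper. This is a sketch, not the official client: the base URL is not given on this page and must be supplied by you, and it assumes the `:type` segment takes the values listed under Output formats (`text`, `json`, `pdf`). The names `ocrUrls` and `isOutputFormat` are hypothetical.

```typescript
// Assumed to match the formats listed under "Output formats".
type OutputFormat = 'text' | 'json' | 'pdf';
const OUTPUT_FORMATS: readonly OutputFormat[] = ['text', 'json', 'pdf'];

// Narrows an arbitrary string (e.g. from user input) to a known format.
function isOutputFormat(value: string): value is OutputFormat {
  return (OUTPUT_FORMATS as readonly string[]).indexOf(value) !== -1;
}

// Builds the three documented paths on top of a caller-supplied base URL.
function ocrUrls(baseUrl: string) {
  const root = baseUrl.replace(/\/+$/, '') + '/ocr/v1';
  return {
    process: (): string => `${root}/process`,
    status: (id: string): string => `${root}/${encodeURIComponent(id)}`,
    download: (id: string, type: OutputFormat): string =>
      `${root}/${encodeURIComponent(id)}/download/${type}`,
  };
}
```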

Common patterns

With webhooks (recommended for large files)

const job = await client.ocr.v1.process({
  document_url: uploadedDocumentUrl,
  callback_url: 'https://your-app.com/webhooks/ocr-complete'
});
// We POST results to your callback when done
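
On the receiving side, your callback endpoint should validate the request body before acting on it. The payload schema is not documented on this page, so the sketch below assumes a JSON body carrying string `id` and `status` fields. Both field names and the `parseOcrCallback` helper are assumptions, so check them against a real callback.

```typescript
// Assumed payload shape; the field names are not confirmed by the docs above.
interface OcrCallback {
  id: string;
  status: string;
}

// Returns the parsed callback, or null if the body is not the assumed shape.
function parseOcrCallback(rawBody: string): OcrCallback | null {
  let data: unknown;
  try {
    data = JSON.parse(rawBody);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== 'object' || data === null) return null;
  const { id, status } = data as Record<string, unknown>;
  if (typeof id !== 'string' || typeof status !== 'string') return null;
  return { id, status };
}
```

Returning `null` instead of throwing lets the HTTP handler respond with a 400 for malformed requests without a try/catch at every call site.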

From S3

const job = await client.ocr.v1.process({
  document_url: 's3://your-bucket/documents/upload.pdf'
});
// We handle presigning automatically

With table extraction

const job = await client.ocr.v1.process({
  document_url: uploadedDocumentUrl,
  engine: 'paddleocr',
  features: {
    tables: { format: 'csv' }
  }
});

Related services

Vault

Store OCR’d documents and make them searchable with semantic search.

LLMs

Analyze extracted text with AI: summarize, classify, and extract entities.
