Digitize Evidence

The problem: Opposing counsel sent you 500 pages of blurry photocopies. You need to search them, but they’re just images.

The solution: Run OCR to extract the text, then search it or analyze it with AI.

1. Submit for OCR

import Casedev from 'casedev';

const client = new Casedev({ apiKey: process.env.CASEDEV_API_KEY });

// Process a document uploaded by your user
const job = await client.ocr.v1.process({
  document_url: documentUrl, // URL from your user's upload
  engine: 'doctr',  // Fast, good for printed text
  features: {
    embed: {}  // Generate searchable PDF
  }
});

console.log(`OCR job started: ${job.id}`);

2. Wait for completion

OCR runs asynchronously. Poll for status or use webhooks to notify your users:

// Poll for completion
let result = await client.ocr.v1.retrieve(job.id);

while (result.status === 'processing' || result.status === 'pending') {
  console.log(`Status: ${result.status} (${result.chunks_completed}/${result.chunk_count} pages)`);
  await new Promise(r => setTimeout(r, 5000));
  result = await client.ocr.v1.retrieve(job.id);
}

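if (result.status === 'completed') {
  console.log(`✅ OCR complete! ${result.page_count} pages processed.`);
  console.log(`Confidence: ${(result.confidence * 100).toFixed(1)}%`);
} else {
  // Any other terminal status means the job did not finish successfully
  console.error(`OCR job ended with status: ${result.status}`);
}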
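
The loop above polls at a fixed interval and never gives up. As a rough sketch of one alternative, the helper below wraps the same idea with a growing delay and an overall timeout. `pollUntilDone`, `JobStatus`, and `PollOptions` are hypothetical names introduced here, not part of the SDK; the only thing taken from the snippet above is that `pending` and `processing` mean the job is still running.

```typescript
// Hypothetical helper, not part of the casedev SDK.
interface JobStatus {
  status: string;
}

interface PollOptions {
  intervalMs?: number;    // delay before the first re-check
  maxIntervalMs?: number; // upper bound for the growing delay
  timeoutMs?: number;     // stop waiting after this long
}

async function pollUntilDone<T extends JobStatus>(
  fetchStatus: () => Promise<T>,
  options: PollOptions = {}
): Promise<T> {
  const {
    intervalMs = 5000,
    maxIntervalMs = 30000,
    timeoutMs = 15 * 60 * 1000,
  } = options;
  const deadline = Date.now() + timeoutMs;
  let delay = intervalMs;

  for (;;) {
    const result = await fetchStatus();
    // 'pending' and 'processing' are the in-progress statuses used above;
    // anything else is treated as a terminal status and returned as-is.
    if (result.status !== 'pending' && result.status !== 'processing') {
      return result;
    }
    if (Date.now() + delay > deadline) {
      throw new Error(`Timed out after ${timeoutMs} ms; last status: ${result.status}`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
    // Double the delay after each check, up to the cap.
    delay = Math.min(delay * 2, maxIntervalMs);
  }
}
```

With the client from step 1, this could be called as `const result = await pollUntilDone(() => client.ocr.v1.retrieve(job.id));`. The caller still has to check `result.status` afterwards, since the helper returns on any terminal status.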

3. Download results

Provide extracted text, structured data, or a searchable PDF:

import fs from 'node:fs';

// Download plain text for your user
const text = await client.ocr.v1.download(job.id, 'text');

// Download searchable PDF (original with invisible text layer)
const pdf = await client.ocr.v1.download(job.id, 'pdf');
fs.writeFileSync('searchable-document.pdf', Buffer.from(pdf));

// Download structured JSON (with word coordinates for highlighting)
const json = await client.ocr.v1.download(job.id, 'json');
console.log(`Extracted ${json.pages.length} pages`);
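
To illustrate how word coordinates might drive highlighting, here is a minimal sketch that finds every word containing a search term and returns its page and bounding box. The snippet above only confirms that the JSON has a `pages` array; the `words`, `text`, and `bbox` fields and the `findWordHits` function are assumptions made for this example, so check the actual response before relying on them.

```typescript
// Assumed shape: only `pages` is confirmed by the download example.
// `words`, `text`, and `bbox` are hypothetical field names.
interface OcrWord {
  text: string;
  bbox: [number, number, number, number]; // assumed [x0, y0, x1, y1]
}

interface OcrPage {
  words: OcrWord[];
}

interface WordHit {
  page: number; // 1-based page number derived from array position
  text: string;
  bbox: [number, number, number, number];
}

function findWordHits(pages: OcrPage[], query: string): WordHit[] {
  const needle = query.trim().toLowerCase();
  if (needle === '') return [];

  const hits: WordHit[] = [];
  pages.forEach((page, index) => {
    for (const word of page.words) {
      // Case-insensitive substring match on each recognized word.
      if (word.text.toLowerCase().includes(needle)) {
        hits.push({ page: index + 1, text: word.text, bbox: word.bbox });
      }
    }
  });
  return hits;
}
```

If the real response matches this shape, `findWordHits(json.pages, 'indemnify')` would give a viewer the rectangles to highlight on each page.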

4. Analyze with AI

Enhance your feature with automatic data extraction:

// Extract key information for your user
const analysis = await client.llm.v1.chat.createCompletion({
  model: 'anthropic/claude-sonnet-4.5',
  messages: [
    {
      role: 'system',
      content: 'Extract key dates, parties, and claims from this document. Format as JSON.'
    },
    {
      role: 'user',
      content: text
    }
  ],
  temperature: 0  // Deterministic for factual extraction
});

// Return structured data to your user
console.log(analysis.choices[0].message.content);
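
The prompt asks for JSON, but a model reply arrives as a string and may be wrapped in a Markdown code fence or surrounded by a sentence of prose. The sketch below shows one defensive way to parse it. `parseJsonReply` is a hypothetical helper, not an SDK function, and it assumes the reply contains a single JSON object.

```typescript
// Hypothetical helper, not part of the casedev SDK.
const FENCE = '`'.repeat(3);
const fencePattern = new RegExp(
  `^${FENCE}(?:json)?\\s*([\\s\\S]*?)\\s*${FENCE}$`,
  'i'
);

function parseJsonReply(reply: string): unknown {
  const trimmed = reply.trim();

  // Strip a surrounding Markdown code fence, if the model added one.
  const fenced = trimmed.match(fencePattern);
  const candidate = fenced?.[1] ?? trimmed;

  try {
    return JSON.parse(candidate);
  } catch {
    // Fall through to the brace-matching fallback below.
  }

  // Fallback: parse the text between the first '{' and the last '}'.
  const start = candidate.indexOf('{');
  const end = candidate.lastIndexOf('}');
  if (start === -1 || end <= start) {
    throw new Error('No JSON object found in model reply');
  }
  return JSON.parse(candidate.slice(start, end + 1));
}
```

For example, `parseJsonReply(analysis.choices[0].message.content)` returns a value typed as `unknown`, so validate its fields before passing it on to your users.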

OCR engines

Choose the right engine based on your users’ document types:

Engine	Best for	Speed
`doctr`	Clean printed text	Fast
`paddleocr`	Tables, forms, complex layouts	Slower

Recommendation: Start with `doctr` for most use cases. Switch to `paddleocr` if your users need table extraction or have complex document layouts.
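
The recommendation above can be written as a small selection function. This is only a sketch: `chooseEngine` and `DocumentTraits` are hypothetical names, and how you detect tables or complex layouts in a user's upload is left to your application.

```typescript
// The two engine names come from the table above; everything else is illustrative.
type OcrEngine = 'doctr' | 'paddleocr';

interface DocumentTraits {
  hasTables?: boolean;
  hasForms?: boolean;
  complexLayout?: boolean;
}

function chooseEngine(traits: DocumentTraits = {}): OcrEngine {
  // Default to the faster engine; switch only when layout matters.
  const needsLayout = Boolean(
    traits.hasTables || traits.hasForms || traits.complexLayout
  );
  return needsLayout ? 'paddleocr' : 'doctr';
}
```

The result can be passed as the `engine` value in step 1, for example `engine: chooseEngine({ hasTables: true })`.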
