Process document

Submit a document for OCR. We extract text, detect tables, and optionally generate a searchable PDF. Processing is asynchronous: you get a job ID immediately, then either poll for results or receive a webhook when the job finishes.

Endpoint

POST /ocr/v1/process

curl -X POST https://api.case.dev/ocr/v1/process \
  -H "Authorization: Bearer sk_case_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_url": "https://storage.example.com/scanned-deposition.pdf"
  }'

Response

{
  "id": "1f4a195e-026b-41ff-b367-c61089f5f367",
  "status": "pending",
  "document_url": "https://storage.example.com/scanned-deposition.pdf",
  "engine": "doctr",
  "created_at": "2025-11-04T09:30:12Z",
  "links": {
    "self": "https://api.case.dev/ocr/v1/1f4a195e-026b-41ff-b367-c61089f5f367",
    "text": "https://api.case.dev/ocr/v1/1f4a195e-026b-41ff-b367-c61089f5f367/download/text",
    "json": "https://api.case.dev/ocr/v1/1f4a195e-026b-41ff-b367-c61089f5f367/download/json"
  }
}

Parameters

Required

Parameter	Type	Description
`document_url`	string	URL to your document. HTTP/HTTPS or `s3://`

Optional

Parameter	Type	Default	Description
`document_id`	string	auto-generated	Your internal reference ID
`engine`	string	`doctr`	OCR engine (see below)
`callback_url`	string	—	Webhook URL for completion notification
`features`	object	`{}`	Additional processing options
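
The parameter tables above can be sketched as a TypeScript type plus a small request builder. This is a minimal sketch, not part of the SDK: `ProcessParams` and `buildProcessRequest` are hypothetical names, and the scheme check simply mirrors the "HTTP/HTTPS or `s3://`" rule for `document_url`. The endpoint and headers are taken from the curl example.

```typescript
// Request parameters, mirroring the Required and Optional tables.
interface ProcessParams {
  document_url: string;            // required: http://, https://, or s3://
  document_id?: string;            // optional: your internal reference ID
  engine?: "doctr" | "paddleocr";  // optional: the API defaults to doctr
  callback_url?: string;           // optional: webhook for completion
  features?: {
    embed?: Record<string, unknown>; // generate a searchable PDF
    tables?: { format: "csv" };      // extract tables as CSV
  };
}

const ENDPOINT = "https://api.case.dev/ocr/v1/process";

// Hypothetical helper: validates the URL scheme client-side and returns
// the arguments you would pass to fetch(url, init).
function buildProcessRequest(apiKey: string, params: ProcessParams) {
  if (!/^(https?|s3):\/\//i.test(params.document_url)) {
    throw new Error("document_url must start with http://, https://, or s3://");
  }
  return {
    url: ENDPOINT,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(params),
    },
  };
}
```

To send the request, destructure the result and pass it to `fetch`: `const { url, init } = buildProcessRequest(key, params); const res = await fetch(url, init);`.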

OCR engines

Engine	Best for	Speed
`doctr`	Clean printed text, typed documents	Fast
`paddleocr`	Tables, forms, complex layouts, handwriting	Medium

For legal documents: start with `doctr`. If you're getting poor results on forms, tables, or handwriting, try `paddleocr`.

Features

Enable additional processing with the `features` object. `embed` generates a searchable PDF, and `tables` extracts tables in the given format (CSV here):

{
  "features": {
    "embed": {},
    "tables": {
      "format": "csv"
    }
  }
}

Checking status

Poll the job to check if processing is complete:

// `client` is an initialized SDK client; `job` is the object returned by process()
const result = await client.ocr.v1.retrieve(job.id);

if (result.status === 'completed') {
  // Download the extracted text
  const text = await client.ocr.v1.download(job.id, 'text');
  console.log(text);
}
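
The snippet above checks the status once. A polling loop might look like the following sketch, which retries with capped exponential backoff. `waitForJob` and its options are hypothetical names, not SDK functions. Only the `pending` and `completed` statuses appear in this reference, so the `failed` status handled here is an assumption.

```typescript
// Minimal job shape, based on the Response section.
interface OcrJob {
  id: string;
  status: string; // "pending" and "completed" are documented; "failed" is assumed
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `retrieve` until the job completes, doubling the wait between
// attempts up to `maxDelayMs`, and giving up after `timeoutMs`.
async function waitForJob(
  retrieve: (id: string) => Promise<OcrJob>,
  jobId: string,
  opts: { initialDelayMs?: number; maxDelayMs?: number; timeoutMs?: number } = {}
): Promise<OcrJob> {
  const { initialDelayMs = 2000, maxDelayMs = 30000, timeoutMs = 600000 } = opts;
  const deadline = Date.now() + timeoutMs;
  let delay = initialDelayMs;

  while (true) {
    const job = await retrieve(jobId);
    if (job.status === "completed") return job;
    if (job.status === "failed") {
      throw new Error(`OCR job ${jobId} failed`); // assumed status value
    }
    if (Date.now() + delay > deadline) {
      throw new Error(`Timed out waiting for OCR job ${jobId}`);
    }
    await sleep(delay);
    delay = Math.min(delay * 2, maxDelayMs);
  }
}
```

With the SDK client from the example above, this could be called as `await waitForJob((id) => client.ocr.v1.retrieve(id), job.id)`.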

Using webhooks

For large documents, use webhooks instead of polling:

const job = await client.ocr.v1.process({
  document_url: 'https://storage.example.com/500-page-discovery.pdf',
  callback_url: 'https://your-app.com/api/ocr-complete'
});

We POST the completed job to your callback URL when processing finishes.
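
On the receiving side, a handler needs to parse and sanity-check the callback body before acting on it. The sketch below assumes the payload has the same shape as the job object in the Response section (the exact webhook payload is not specified here), and `parseOcrCallback` is a hypothetical helper you would call from your own route handler.

```typescript
// Fields we rely on from the callback body (assumed to match the job object).
interface OcrCallbackJob {
  id: string;
  status: string;
  document_id?: string;
}

// Parses the raw request body and returns the job, or null if the body
// is not valid JSON or lacks a string `id` and `status`.
function parseOcrCallback(rawBody: string): OcrCallbackJob | null {
  let data: unknown;
  try {
    data = JSON.parse(rawBody);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const fields = data as Record<string, unknown>;
  const id = fields["id"];
  const status = fields["status"];
  const documentId = fields["document_id"];
  if (typeof id !== "string" || typeof status !== "string") return null;

  return {
    id,
    status,
    document_id: typeof documentId === "string" ? documentId : undefined,
  };
}
```

A route handler could respond with a 400 when this returns `null`, and otherwise fetch the extracted text for the returned `id`.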

S3 URLs

If your document is in S3, use an s3:// URL:

const job = await client.ocr.v1.process({
  document_url: 's3://your-bucket/documents/deposition.pdf'
});

We automatically generate a presigned URL to access the file.
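
Before submitting, it can help to check that an `s3://` URL names both a bucket and an object key. The following is a client-side sketch only: `parseS3Url` is a hypothetical helper, and the service's own validation rules may differ.

```typescript
interface S3Location {
  bucket: string;
  key: string;
}

// Splits s3://bucket/key into its parts; returns null if the URL is not
// an s3:// URL or has no object key after the bucket name.
function parseS3Url(url: string): S3Location | null {
  const match = /^s3:\/\/([^/]+)\/(.+)$/.exec(url);
  return match ? { bucket: match[1], key: match[2] } : null;
}
```

For the example above, `parseS3Url('s3://your-bucket/documents/deposition.pdf')` yields the bucket `your-bucket` and the key `documents/deposition.pdf`.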

Examples

Scanned deposition

const job = await client.ocr.v1.process({
  document_url: 'https://storage.example.com/deposition-smith.pdf',
  document_id: 'smith-depo-2024',
  engine: 'doctr',
  features: { embed: {} }  // Generate searchable PDF
});

Medical records with tables

const job = await client.ocr.v1.process({
  document_url: 'https://storage.example.com/patient-records.pdf',
  engine: 'paddleocr',  // Better for tables and forms
  features: {
    tables: { format: 'csv' },
    embed: {}
  },
  callback_url: 'https://your-app.com/webhooks/ocr'
});

Handwritten notes

const job = await client.ocr.v1.process({
  document_url: 'https://storage.example.com/witness-notes.jpg',
  engine: 'paddleocr'  // Better for handwriting
});
