Process document with OCR

POST /ocr/v1/process

curl --request POST \
  --url https://api.case.dev/ocr/v1/process \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "document_url": "https://example.com/contract.pdf",
  "document_id": "contract-2024-001",
  "callback_url": "https://your-app.com/webhooks/ocr-complete",
  "engine": "doctr",
  "features": {
    "embed": {},
    "tables": {
      "format": "csv"
    }
  },
  "result_bucket": "my-ocr-results",
  "result_prefix": "ocr/2024/"
}
'

{
  "id": "ocr_job_67890",
  "status": "queued",
  "document_id": "contract-2024-001",
  "engine": "doctr",
  "page_count": 15,
  "created_at": "2024-01-15T10:30:00Z",
  "estimated_completion": "2024-01-15T10:45:00Z"
}
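The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; the endpoint URL and header names are taken from the curl example, while the function names `build_ocr_request` and `submit_ocr_job` are illustrative and not part of any official client.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.case.dev/ocr/v1/process"


def build_ocr_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for an OCR job."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def submit_ocr_job(api_key: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON job record."""
    request = build_ocr_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating request construction from sending keeps the first function free of network access, which makes it easy to inspect or test the exact headers and body before anything is submitted.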

Authorizations

Authorization

string

header

required

API key starting with sk_case_
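A client can catch a misconfigured key before making a request by checking the documented `sk_case_` prefix. This is a sketch only: the helper names and the `CASE_API_KEY` environment variable are assumptions made for this example, and the prefix check is a local sanity check rather than real validation, which only the server can perform.

```python
import os

# Prefix stated in the Authorizations section of this page.
KEY_PREFIX = "sk_case_"


def auth_header(api_key: str) -> dict:
    """Return the Authorization header, rejecting keys with the wrong prefix."""
    if not api_key.startswith(KEY_PREFIX):
        raise ValueError(f"API key must start with {KEY_PREFIX!r}")
    return {"Authorization": f"Bearer {api_key}"}


def auth_header_from_env(var_name: str = "CASE_API_KEY") -> dict:
    """Read the key from the environment.

    CASE_API_KEY is a hypothetical variable name chosen for this example.
    """
    return auth_header(os.environ.get(var_name, ""))
```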

Body

application/json

document_url

string

required

URL or S3 path to the document to process

Example:

"https://example.com/contract.pdf"

document_id

string

Optional custom document identifier

Example:

"contract-2024-001"

callback_url

string

URL to receive completion webhook

Example:

"https://your-app.com/webhooks/ocr-complete"

engine

enum<string>

default: doctr

OCR engine to use

Available options:

doctr,

paddleocr

Example:

"doctr"

features

object

Additional processing options

Example:

{
  "embed": {},
  "tables": { "format": "csv" }
}

result_bucket

string

S3 bucket to store results

Example:

"my-ocr-results"

result_prefix

string

S3 key prefix for results

Example:

"ocr/2024/"
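The body parameters above can be assembled with a small helper that enforces the one required field and the documented `engine` options. This is a sketch under stated assumptions: `build_payload` is a hypothetical name, and omitting unset optional fields (rather than sending `null`) is a design choice of this example, not documented API behavior.

```python
from typing import Any, Optional

# Engine options listed in the Body section; "doctr" is the documented default.
ENGINES = ("doctr", "paddleocr")


def build_payload(
    document_url: str,
    *,
    document_id: Optional[str] = None,
    callback_url: Optional[str] = None,
    engine: str = "doctr",
    features: Optional[dict] = None,
    result_bucket: Optional[str] = None,
    result_prefix: Optional[str] = None,
) -> dict:
    """Build the JSON body for POST /ocr/v1/process."""
    if not document_url:
        raise ValueError("document_url is required")
    if engine not in ENGINES:
        raise ValueError(f"engine must be one of {ENGINES}")

    payload: dict = {"document_url": document_url, "engine": engine}
    optional: dict = {
        "document_id": document_id,
        "callback_url": callback_url,
        "features": features,
        "result_bucket": result_bucket,
        "result_prefix": result_prefix,
    }
    # Leave out optional fields the caller did not set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, `build_payload("https://example.com/contract.pdf", features={"embed": {}, "tables": {"format": "csv"}})` produces a body with `document_url`, `engine`, and `features` only.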

Response

OCR job created successfully

id

string

Unique job identifier

status

enum<string>

Current job status

Available options:

queued,

processing,

completed,

failed
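Client code often needs to know whether a job can still change state. The sketch below models the four documented status values as an enum. Treating `completed` and `failed` as final states is an assumption of this example; the page lists the values but does not describe the transitions between them.

```python
from enum import Enum


class JobStatus(str, Enum):
    """Status values listed in the Response section."""

    QUEUED = "queued"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumption: these two states are final and the job will not change again.
TERMINAL_STATUSES = frozenset({JobStatus.COMPLETED, JobStatus.FAILED})


def is_terminal(status: str) -> bool:
    """Return True if the job has finished; raise ValueError on unknown values."""
    return JobStatus(status) in TERMINAL_STATUSES
```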

document_id

string

Document identifier

engine

string

OCR engine used

page_count

integer

Number of pages detected

created_at

string<date-time>

Job creation timestamp

estimated_completion

string<date-time>

Estimated completion time
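The response fields above map naturally onto a typed record. The following sketch parses the sample response into a dataclass and converts the two `date-time` strings into timezone-aware `datetime` objects. The class name `OcrJob` is illustrative, and the code assumes every field except `document_id` is present, as in the sample response; the page does not state which response fields are guaranteed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as "2024-01-15T10:30:00Z"."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass(frozen=True)
class OcrJob:
    id: str
    status: str
    document_id: Optional[str]
    engine: str
    page_count: int
    created_at: datetime
    estimated_completion: datetime

    @classmethod
    def from_json(cls, data: dict) -> "OcrJob":
        return cls(
            id=data["id"],
            status=data["status"],
            document_id=data.get("document_id"),
            engine=data["engine"],
            page_count=int(data["page_count"]),
            created_at=parse_timestamp(data["created_at"]),
            estimated_completion=parse_timestamp(data["estimated_completion"]),
        )

    @property
    def estimated_duration(self) -> timedelta:
        """Time between job creation and the estimated completion."""
        return self.estimated_completion - self.created_at
```

Applied to the sample response on this page, `estimated_duration` is 15 minutes (10:30:00Z to 10:45:00Z).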
