POST /ocr/v1/process
Process document with OCR
curl --request POST \
  --url https://api.case.dev/ocr/v1/process \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "document_url": "https://example.com/contract.pdf",
  "document_id": "contract-2024-001",
  "callback_url": "https://your-app.com/webhooks/ocr-complete",
  "engine": "doctr",
  "features": {
    "text": true,
    "tables": true,
    "forms": false
  },
  "result_bucket": "my-ocr-results",
  "result_prefix": "ocr/2024/"
}
'
{
  "id": "ocr_job_67890",
  "status": "queued",
  "document_id": "contract-2024-001",
  "engine": "doctr",
  "page_count": 15,
  "created_at": "2024-01-15T10:30:00Z",
  "estimated_completion": "2024-01-15T10:45:00Z"
}
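
The curl call above can also be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_ocr_request` and `submit_ocr_job` are illustrative and not part of any official SDK, and the 30-second timeout is an arbitrary choice.

```python
import json
import urllib.request

API_URL = "https://api.case.dev/ocr/v1/process"


def build_ocr_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_ocr_job(token: str, payload: dict) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_ocr_request(token, payload)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `submit_ocr_job(token, {"document_url": "https://example.com/contract.pdf"})` would return a job object like the sample response above, assuming the key is valid.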

Authorizations

Authorization
string
header
required

API key starting with sk_case_, sent as a Bearer token: Authorization: Bearer <token>
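
A small helper can catch a malformed key before any request is sent. This sketch assumes the `<token>` in the curl example is the API key itself; `auth_header` is a hypothetical name, and the prefix check is a client-side convenience, not something the API is documented to require of callers.

```python
def auth_header(api_key: str) -> dict:
    """Return the Authorization header for a key with the sk_case_ prefix."""
    if not api_key.startswith("sk_case_"):
        # Fail fast locally instead of waiting for the server to reject the call.
        raise ValueError("API key must start with 'sk_case_'")
    return {"Authorization": f"Bearer {api_key}"}
```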

Body

application/json
document_url
string
required

URL or S3 path to the document to process

Example:

"https://example.com/contract.pdf"

document_id
string

Optional custom document identifier

Example:

"contract-2024-001"

callback_url
string

URL to receive completion webhook

Example:

"https://your-app.com/webhooks/ocr-complete"
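
The shape of the webhook body is not documented on this page. The receiver sketched below assumes it is a JSON object carrying at least `id` and `status`, mirroring the job object in the Response section; treat both the payload shape and the 204 reply as assumptions to verify. `parse_webhook`, `OcrWebhookHandler`, and `serve` are illustrative names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook(raw_body: bytes) -> dict:
    """Decode a webhook body, assumed (not documented) to be a JSON object."""
    event = json.loads(raw_body)
    if not isinstance(event, dict):
        raise ValueError("expected a JSON object")
    return event


class OcrWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = parse_webhook(self.rfile.read(length))
        except ValueError:
            # Covers invalid JSON as well as non-object bodies.
            self.send_response(400)
            self.end_headers()
            return
        # "id" and "status" are assumed fields; .get avoids a KeyError if absent.
        print("OCR job", event.get("id"), "is", event.get("status"))
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("0.0.0.0", port), OcrWebhookHandler).serve_forever()
```

Running `serve()` and exposing the port at the address given in `callback_url` would let the handler log each completion notice.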

engine
enum<string>
default:doctr

OCR engine to use

Available options: doctr, paddleocr
Example:

"doctr"

features
object

OCR features to extract

Example:
{
  "text": true,
  "tables": true,
  "forms": false
}
result_bucket
string

S3 bucket to store results

Example:

"my-ocr-results"

result_prefix
string

S3 key prefix for results

Example:

"ocr/2024/"
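
The body fields above can be assembled with a little client-side validation. This is a sketch, not an official client: `build_payload` is a hypothetical helper that enforces the required `document_url`, restricts `engine` to the two listed options, and omits optional fields that were not supplied.

```python
from typing import Optional

ENGINES = ("doctr", "paddleocr")


def build_payload(
    document_url: str,
    *,
    document_id: Optional[str] = None,
    callback_url: Optional[str] = None,
    engine: str = "doctr",
    features: Optional[dict] = None,
    result_bucket: Optional[str] = None,
    result_prefix: Optional[str] = None,
) -> dict:
    """Assemble the JSON body for POST /ocr/v1/process."""
    if not document_url:
        raise ValueError("document_url is required")
    if engine not in ENGINES:
        raise ValueError(f"engine must be one of {ENGINES}")
    payload = {"document_url": document_url, "engine": engine}
    optional = {
        "document_id": document_id,
        "callback_url": callback_url,
        "features": features,
        "result_bucket": result_bucket,
        "result_prefix": result_prefix,
    }
    # Leave out anything the caller did not set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, `build_payload("https://example.com/contract.pdf", features={"text": True, "tables": True, "forms": False})` yields a body with `document_url`, `engine`, and `features` only.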

Response

OCR job created successfully

id
string

Unique job identifier

status
enum<string>

Current job status

Available options: queued, processing, completed, failed
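
Client code often needs to know whether a job can still change state. The sketch below models the four status values as an enum and treats `completed` and `failed` as terminal; that reading is an inference from the names, and `JobStatus` and `is_terminal` are illustrative.

```python
from enum import Enum


class JobStatus(str, Enum):
    QUEUED = "queued"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed terminal states: the job will not change after reaching these.
TERMINAL_STATUSES = {JobStatus.COMPLETED, JobStatus.FAILED}


def is_terminal(status: str) -> bool:
    """Return True if the status is final; raise ValueError if it is unknown."""
    return JobStatus(status) in TERMINAL_STATUSES
```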
document_id
string

Document identifier

engine
string

OCR engine used

page_count
integer

Number of pages detected

created_at
string<date-time>

Job creation timestamp

estimated_completion
string<date-time>

Estimated completion time
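
The response fields map naturally onto a typed record. The sketch below assumes every field listed above is present, as in the sample response, and that timestamps use the `Z` suffix shown there; `OcrJob` and `parse_timestamp` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2024-01-15T10:30:00Z into an aware datetime."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class OcrJob:
    id: str
    status: str
    document_id: str
    engine: str
    page_count: int
    created_at: datetime
    estimated_completion: datetime

    @classmethod
    def from_response(cls, body: dict) -> "OcrJob":
        """Build a job from the decoded JSON response body."""
        return cls(
            id=body["id"],
            status=body["status"],
            document_id=body["document_id"],
            engine=body["engine"],
            page_count=body["page_count"],
            created_at=parse_timestamp(body["created_at"]),
            estimated_completion=parse_timestamp(body["estimated_completion"]),
        )

    @property
    def estimated_duration(self) -> timedelta:
        """Time between job creation and the estimated completion."""
        return self.estimated_completion - self.created_at
```

Applied to the sample response, `estimated_duration` comes out to 15 minutes.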