Transcription

Transcribe audio and video with speaker identification, support for 100+ languages, and word-level timestamps. Suited to depositions, hearings, and interviews.

Endpoint

POST /voice/transcription

Vault Mode (Recommended)

Upload your audio to a vault, then transcribe with automatic result storage. The transcript is saved back to your vault when complete.

import Casedev from 'casedev';

const client = new Casedev({ apiKey: 'sk_case_YOUR_API_KEY' });

// 1. Upload audio to vault
const upload = await client.vault.upload('vault_abc123', {
  filename: 'deposition.mp3',
  contentType: 'audio/mpeg'
});
// Upload file to upload.uploadUrl via PUT

// 2. Start transcription
const job = await client.voice.transcription.create({
  vault_id: 'vault_abc123',
  object_id: upload.objectId,
  speaker_labels: true
});

console.log(job.id);  // e.g. tr_rvy731o5zxur0dg72sh3mjar
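
Step 1 leaves the actual file transfer to the caller. A minimal sketch of that PUT follows, assuming `upload.uploadUrl` is a pre-signed URL that accepts the raw file bytes and that the `Content-Type` header should match the `contentType` passed to `client.vault.upload`. `buildUploadRequest` and `putToUploadUrl` are hypothetical helper names, not part of the SDK.

```typescript
// Hypothetical helpers for the "upload via PUT" step.
// Assumption: uploadUrl accepts the raw file body with a matching Content-Type.
interface UploadRequest {
  method: 'PUT';
  headers: Record<string, string>;
  body: Blob;
}

// Build the request options separately so they can be inspected or logged.
function buildUploadRequest(file: Blob, contentType: string): UploadRequest {
  return {
    method: 'PUT',
    headers: { 'Content-Type': contentType },
    body: file,
  };
}

// Send the file; fetch and Blob are global in Node 18+ and in browsers.
async function putToUploadUrl(
  uploadUrl: string,
  file: Blob,
  contentType: string
): Promise<void> {
  const response = await fetch(uploadUrl, buildUploadRequest(file, contentType));
  if (!response.ok) {
    throw new Error(`Upload failed with HTTP ${response.status}`);
  }
}
```

With these helpers, the upload step would be `await putToUploadUrl(upload.uploadUrl, file, 'audio/mpeg')` before calling `client.voice.transcription.create`.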

Response

{
  "id": "tr_rvy731o5zxur0dg72sh3mjar",
  "status": "processing",
  "vault_id": "vault_abc123",
  "source_object_id": "obj_xyz789"
}

Get Results (Vault Mode)

const result = await client.voice.transcription.retrieve('tr_rvy731o5zxur0dg72sh3mjar');

if (result.status === 'completed') {
  console.log(result.result_object_id);  // Transcript stored here
  console.log(result.word_count);        // 594
  console.log(result.confidence);        // 97

  // Download transcript from vault
  const transcript = await client.vault.objects.download(
    result.vault_id,
    result.result_object_id
  );
}

Response (completed)

{
  "id": "tr_rvy731o5zxur0dg72sh3mjar",
  "status": "completed",
  "vault_id": "vault_abc123",
  "source_object_id": "obj_xyz789",
  "result_object_id": "obj_abc456",
  "audio_duration": 238,
  "word_count": 594,
  "confidence": 97
}

Vault Mode Benefits:

Transcript automatically saved to your vault
No webhook setup required
Simpler polling with result_object_id
Audio stored securely in your vault

Direct URL Mode

For audio hosted elsewhere, provide a public URL directly.

import Casedev from 'casedev';

const client = new Casedev({ apiKey: 'sk_case_YOUR_API_KEY' });

const job = await client.voice.transcription.create({
  audio_url: 'https://storage.example.com/deposition.m4a',
  speaker_labels: true,
  auto_chapters: true
});

console.log(job.id);  // Poll this for results

Response

{
  "id": "474e21cf-fd65-45d4-97fd-87558f7caf9b",
  "status": "queued",
  "audio_url": "https://storage.example.com/deposition.m4a",
  "created_at": "2025-11-04T09:15:30Z"
}

Parameters

Vault Mode

Parameter	Type	Required	Description
`vault_id`	string	Yes	Vault containing the audio file
`object_id`	string	Yes	Object ID of the audio file
`format`	string	No	Output format: `json` (default) or `text`
`speaker_labels`	boolean	No	Identify different speakers
`language_code`	string	No	Language code (auto-detected if omitted)

Direct URL Mode

Parameter	Type	Required	Description
`audio_url`	string	Yes	URL to audio/video file (max 5GB, 10 hours)
`webhook_url`	string	No	URL for completion notification

Shared Options

Parameter	Type	Default	Description
`speaker_labels`	boolean	`false`	Identify different speakers
`speakers_expected`	number	—	Expected number of speakers
`language_code`	string	auto	Language code (en, es, fr, de, etc.)
`speech_models`	array	`["universal-3-pro", "universal-2"]`	Priority-ordered speech models to use
`punctuate`	boolean	`true`	Add punctuation
`format_text`	boolean	`true`	Format numbers, dates, etc.
`word_boost`	array	—	Boost specific words (e.g., legal terms)
`auto_highlights`	boolean	`false`	Detect key phrases
`content_safety_labels`	boolean	`false`	Flag sensitive content

Get Results (Direct URL Mode)

const result = await client.voice.transcription.retrieve(job.id);  // id returned by create()

if (result.status === 'completed') {
  console.log(result.text);  // Full transcript

  // With speaker labels
  for (const utterance of result.utterances) {
    console.log(`${utterance.speaker}: ${utterance.text}`);
  }
}

Response (completed)

{
  "id": "474e21cf-fd65-45d4-97fd-87558f7caf9b",
  "status": "completed",
  "audio_duration": 3847000,
  "confidence": 0.94,
  "text": "Q: Can you state your name for the record?\nA: My name is Dr. Sarah Johnson...",
  "utterances": [
    {
      "speaker": "A",
      "text": "Can you state your name for the record?",
      "start": 120,
      "end": 2450
    },
    {
      "speaker": "B",
      "text": "My name is Dr. Sarah Johnson.",
      "start": 2450,
      "end": 4820
    }
  ],
  "chapters": [
    {
      "headline": "Witness Introduction",
      "summary": "Introduction and witness identification",
      "start": 120,
      "end": 15000
    }
  ]
}
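
The `utterances` array can be turned into a timestamped, speaker-labeled transcript. The sketch below assumes `start` and `end` are offsets in milliseconds, which the sample values suggest but this page does not state. `formatTimestamp` and `formatTranscript` are illustrative names.

```typescript
interface Utterance {
  speaker: string;
  text: string;
  start: number; // assumed: milliseconds from the start of the audio
  end: number;   // assumed: milliseconds from the start of the audio
}

// Convert a millisecond offset to HH:MM:SS, truncating fractional seconds.
function formatTimestamp(ms: number): string {
  const totalSeconds = Math.floor(ms / 1000);
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  return [hours, minutes, seconds]
    .map(n => String(n).padStart(2, '0'))
    .join(':');
}

// Render one line per utterance, prefixed with its start time and speaker.
function formatTranscript(utterances: Utterance[]): string {
  return utterances
    .map(u => `[${formatTimestamp(u.start)}] Speaker ${u.speaker}: ${u.text}`)
    .join('\n');
}

const sample: Utterance[] = [
  { speaker: 'A', text: 'Can you state your name for the record?', start: 120, end: 2450 },
  { speaker: 'B', text: 'My name is Dr. Sarah Johnson.', start: 2450, end: 4820 },
];

console.log(formatTranscript(sample));
// [00:00:00] Speaker A: Can you state your name for the record?
// [00:00:02] Speaker B: My name is Dr. Sarah Johnson.
```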

Status Values

Status	Meaning
`queued`	Waiting to start
`processing`	Transcribing
`completed`	Done, results ready
`failed`	Error occurred
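
Because a job can be either `queued` or `processing` before it finishes, a polling loop should treat both as non-terminal and stop on `completed` or `failed`. A minimal sketch follows. `waitForTranscription`, its `retrieve` callback, and the interval and attempt defaults are assumptions for illustration, not SDK features.

```typescript
type TranscriptionStatus = 'queued' | 'processing' | 'completed' | 'failed';

interface JobLike {
  id: string;
  status: TranscriptionStatus;
}

// A job is finished once it is either completed or failed.
function isTerminal(status: TranscriptionStatus): boolean {
  return status === 'completed' || status === 'failed';
}

// Call retrieve() repeatedly until the job reaches a terminal status.
// Rejects if the job fails or if maxAttempts is exhausted.
async function waitForTranscription<T extends JobLike>(
  retrieve: () => Promise<T>,
  intervalMs = 5000,
  maxAttempts = 120
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await retrieve();
    if (isTerminal(job.status)) {
      if (job.status === 'failed') {
        throw new Error(`Transcription ${job.id} failed`);
      }
      return job;
    }
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  throw new Error('Timed out waiting for transcription');
}
```

With the client from the earlier examples, usage would look like `const result = await waitForTranscription(() => client.voice.transcription.retrieve(job.id));`.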

Processing Times

Audio Length	Time
1 minute	~15 seconds
10 minutes	~1-2 minutes
1 hour	~8-10 minutes
3 hours	~20-30 minutes
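
These figures can help pick a polling timeout. The sketch below encodes the upper bound of each row and returns the bound for the first row that covers a given audio length. The figures are approximate, the helper name is hypothetical, and nothing here is a guarantee from the API.

```typescript
// Upper bounds taken from the Processing Times table; all values are approximate.
const PROCESSING_ESTIMATES: Array<{ audioMinutes: number; maxSeconds: number }> = [
  { audioMinutes: 1, maxSeconds: 15 },
  { audioMinutes: 10, maxSeconds: 120 },
  { audioMinutes: 60, maxSeconds: 600 },
  { audioMinutes: 180, maxSeconds: 1800 },
];

// Return the table's upper bound for the first row that covers the audio length,
// or undefined when the audio is longer than any row in the table.
function estimateMaxProcessingSeconds(audioMinutes: number): number | undefined {
  const row = PROCESSING_ESTIMATES.find(r => audioMinutes <= r.audioMinutes);
  return row?.maxSeconds;
}

console.log(estimateMaxProcessingSeconds(45)); // 600
```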

Examples

Deposition with Speaker Labels (Vault Mode)

// Upload to vault first
const upload = await client.vault.upload('vault_depositions', {
  filename: 'smith-deposition-2024.mp3',
  contentType: 'audio/mpeg'
});
// PUT the file bytes to upload.uploadUrl before starting the job

const job = await client.voice.transcription.create({
  vault_id: 'vault_depositions',
  object_id: upload.objectId,
  speaker_labels: true,
  speakers_expected: 4,  // Attorney, witness, court reporter, judge
  word_boost: ['plaintiff', 'defendant', 'objection', 'sustained', 'overruled']
});

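// Poll until the job leaves the queued/processing states
let result = await client.voice.transcription.retrieve(job.id);
while (result.status === 'queued' || result.status === 'processing') {
  await new Promise(r => setTimeout(r, 5000));
  result = await client.voice.transcription.retrieve(job.id);
}

if (result.status === 'failed') {
  throw new Error(`Transcription ${job.id} failed`);
}
// Transcript is now in vault at result.result_object_id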

Court Recording (Direct URL with Webhook)

const job = await client.voice.transcription.create({
  audio_url: 'https://storage.example.com/3-hour-hearing.m4a',
  speaker_labels: true,
  webhook_url: 'https://your-app.com/webhooks/transcription'
});
// Results POSTed to your webhook when done

Supported Formats

Audio: MP3, M4A, WAV, FLAC, OGG, OPUS, WebM
Video: MP4, WebM, MOV, AVI, MKV (audio track extracted)
Languages: 100+ including English, Spanish, French, German, Chinese, Japanese

Pricing: $0.01/minute. A 2-hour deposition costs $1.20.
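
The arithmetic behind that figure: 2 hours is 120 minutes, and 120 × $0.01 = $1.20. A small sketch of the same calculation, assuming cost scales linearly with audio duration and is not rounded up to whole minutes (this page does not specify billing granularity); `estimateCostUsd` is an illustrative helper.

```typescript
const PRICE_PER_MINUTE_USD = 0.01;

// Assumption: billing is proportional to duration, with no per-minute rounding.
function estimateCostUsd(audioSeconds: number): string {
  const minutes = audioSeconds / 60;
  return (minutes * PRICE_PER_MINUTE_USD).toFixed(2);
}

console.log(estimateCostUsd(2 * 60 * 60)); // 1.20
```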
