Submit audio or video files for transcription. Supports 100+ languages with speaker diarization, PII redaction, and advanced features.
POST /voice/transcription
curl -X POST https://api.case.dev/voice/transcription \
-H "Authorization: Bearer sk_case_your_api_key_here" \
-H "Content-Type: application/json" \
-d '{
"audio_url": "https://your-storage.com/deposition-audio.m4a",
"speaker_labels": true,
"language_code": "en"
}'
{
  "id": "5f9c5c3e-1234-5678-9abc-def012345678",
  "status": "queued",
  "audio_url": "https://your-storage.com/deposition-audio.m4a",
  "language_code": "en",
  "speaker_labels": true
}
Required:
audio_url (string): Publicly accessible URL to your audio/video file
Supported formats: M4A, MP3, MP4, WAV, FLAC, OGG, WebM, and more
Max file size: 5GB
Max duration: 7 hours

Optional:
language_code (string): Language for transcription (default: auto-detect)
en - English
es - Spanish
fr - French
de - German
pt - Portuguese
zh - Chinese
ja - Japanese
... 100+ languages supported

speaker_labels (boolean): Enable speaker diarization (default: false)
Identifies different speakers as "Speaker A", "Speaker B", etc. Perfect for depositions, interviews, and meetings

auto_highlights (boolean): Automatically detect key phrases (default: false)
Identifies important moments in the audio

content_safety_labels (boolean): Detect sensitive content (default: false)
Flags potentially sensitive topics

redact_pii (boolean): Redact personally identifiable information (default: false)
Removes names, addresses, SSNs, credit card numbers, etc. Essential for HIPAA compliance

redact_pii_policies (array): Specific PII types to redact
Options: name, address, email, phone_number, ssn, credit_card, date_of_birth, medical, bank_account

webhook_url (string): URL to receive a completion notification
Called when the transcription completes; recommended for long audio files

language_detection (boolean): Detect the language automatically (default: false)
Useful for multilingual audio

Retrieve transcription status and completed transcript.
GET /voice/transcription/:id
curl https://api.case.dev/voice/transcription/5f9c5c3e-1234-5678-9abc-def012345678 \
-H "Authorization: Bearer sk_case_your_api_key_here"
{
  "id": "5f9c5c3e-1234-5678-9abc-def012345678",
  "status": "processing",
  "audio_url": "https://your-storage.com/deposition-audio.m4a",
  "audio_duration": 3847.2
}
{
  "id": "5f9c5c3e-1234-5678-9abc-def012345678",
  "status": "completed",
  "audio_url": "https://your-storage.com/deposition-audio.m4a",
  "text": "Speaker A: Please state your name for the record. Speaker B: My name is Dr. Sarah Johnson...",
  "words": [
    {
      "text": "Please",
      "start": 100,
      "end": 350,
      "confidence": 0.99,
      "speaker": "A"
    },
    {
      "text": "state",
      "start": 400,
      "end": 650,
      "confidence": 0.98,
      "speaker": "A"
    }
  ],
  "utterances": [
    {
      "text": "Please state your name for the record.",
      "start": 100,
      "end": 2450,
      "confidence": 0.97,
      "speaker": "A"
    },
    {
      "text": "My name is Dr. Sarah Johnson.",
      "start": 3100,
      "end": 5200,
      "confidence": 0.96,
      "speaker": "B"
    }
  ],
  "audio_duration": 3847.2,
  "confidence": 0.95,
  "language_code": "en"
}
Status values:
queued - Job accepted, waiting to start
processing - Transcription in progress
completed - Finished successfully
error - Failed (check the error message)

Response fields:
text (string): Full transcript with speaker labels
words (array): Word-level timing and confidence
text - The word
start - Start time in milliseconds
end - End time in milliseconds
confidence - Accuracy score (0-1)
speaker - Speaker label if diarization is enabled
utterances (array): Sentence-level speaker turns; groups words into complete sentences per speaker
audio_duration (number): Duration in seconds
confidence (number): Overall transcription confidence
language_code (string): Detected or specified language
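As an illustration, the utterances array from a completed transcript can be rendered as a speaker-labeled script. This is a sketch using the field names from the sample response; the formatting itself is just one possible presentation, not part of the API:

```javascript
// Sketch: render `utterances` as a speaker-labeled script.
// Field names (text, start, speaker) come from the sample response;
// `start` is in milliseconds.
function formatUtterances(utterances) {
  return utterances
    .map(u => `[${(u.start / 1000).toFixed(1)}s] Speaker ${u.speaker}: ${u.text}`)
    .join('\n');
}
```

Applied to the two sample utterances, this yields "[0.1s] Speaker A: Please state your name for the record." followed by "[3.1s] Speaker B: My name is Dr. Sarah Johnson."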
curl -X POST https://api.case.dev/voice/transcription \
-H "Authorization: Bearer sk_case_..." \
-H "Content-Type: application/json" \
-d '{
"audio_url": "https://vault.s3.amazonaws.com/deposition-2024-1234.m4a",
"speaker_labels": true,
"language_code": "en",
"redact_pii": true,
"redact_pii_policies": ["name", "address", "ssn", "medical"],
"webhook_url": "https://your-app.com/transcription-complete"
}'
Perfect for:
Depositions with multiple speakers
Witness interviews
Client consultations
Court proceedings
curl -X POST https://api.case.dev/voice/transcription \
-H "Authorization: Bearer sk_case_..." \
-H "Content-Type: application/json" \
-d '{
"audio_url": "https://storage.com/doctor-notes.mp3",
"redact_pii": true,
"redact_pii_policies": ["name", "medical", "date_of_birth"],
"content_safety_labels": true
}'
HIPAA Compliant:
Automatically redacts PHI
Flags sensitive medical topics
Maintains compliance logs
curl -X POST https://api.case.dev/voice/transcription \
-H "Authorization: Bearer sk_case_..." \
-H "Content-Type: application/json" \
-d '{
"audio_url": "https://storage.com/spanish-interview.m4a",
"language_detection": true,
"speaker_labels": true
}'
Supports:
Automatic language detection
100+ languages including Spanish, Mandarin, Arabic
Speaker labels work across languages

Audio Duration    Typical Processing Time
5 minutes         30-60 seconds
30 minutes        2-4 minutes
1 hour            5-8 minutes
3 hours           15-25 minutes
Processing speed:
~0.15x realtime (1 hour of audio = ~9 minutes of processing)
Higher accuracy than real-time transcription
Use webhooks for files over 30 minutes

For shorter audio files without webhooks:
#!/bin/bash
TRANSCRIPT_ID="5f9c5c3e-1234-5678-9abc-def012345678"

while true; do
  RESPONSE=$(curl -s https://api.case.dev/voice/transcription/$TRANSCRIPT_ID \
    -H "Authorization: Bearer sk_case_...")
  STATUS=$(echo "$RESPONSE" | jq -r '.status')
  echo "Status: $STATUS"

  if [ "$STATUS" = "completed" ]; then
    echo "Transcription complete!"
    echo "$RESPONSE" | jq -r '.text' > transcript.txt
    break
  elif [ "$STATUS" = "error" ]; then
    echo "Transcription failed"
    break
  fi

  sleep 5
done
When transcription completes, we POST to your webhook_url:
{
  "transcript_id": "5f9c5c3e-1234-5678-9abc-def012345678",
  "status": "completed",
  "text": "Full transcript text...",
  "audio_duration": 1847.3,
  "confidence": 0.96
}
Webhook verification:
Requests include an X-Signature header with an HMAC-SHA256 signature
Verify it to confirm requests are from CaseMark

Per-minute pricing:
Voice transcription: $0.30 per minute ($18.00 per hour)

Example costs:
1-hour deposition: $18.00
3-hour medical interview: $54.00
30-minute client call: $9.00

No additional charges for:
Language detection
Multiple languages
Webhook delivery
Word-level timestamps

Supported audio formats:
M4A (recommended for iOS recordings)
MP3 (universal compatibility)
MP4 (video files - audio extracted)
WAV (uncompressed, highest quality)
FLAC (lossless compression)
OGG/Opus (web optimized)
WebM (browser recordings)
AMR (phone recordings)

Video formats supported:
MP4, MOV, AVI, MKV (audio extracted automatically)

Industry-leading accuracy:
95%+ for clear audio
90%+ for phone/courtroom recordings
Works with background noise, accents, and technical jargon

Advanced features:
Speaker diarization: Identify who said what
PII redaction: HIPAA/GDPR compliant
100+ languages: Auto-detect or specify
Custom vocabulary: Coming soon for legal terms
Paragraph formatting: Natural text structure
Timestamps: Word and sentence level

Transcribe audio files stored in vaults without downloading them. The transcription API accepts S3 URLs directly for seamless integration.
# Get vault object with audio file
VAULT_ID="sytp1b5f5j1yuj7uffzzxgw6"
OBJECT_ID="audio123"

# Get presigned download URL (valid for 1 hour)
DOWNLOAD_URL=$(curl -s https://api.case.dev/vault/$VAULT_ID/objects/$OBJECT_ID \
  -H "Authorization: Bearer sk_case_..." \
  | jq -r '.downloadUrl')

# Submit for transcription
curl -X POST https://api.case.dev/voice/transcription \
  -H "Authorization: Bearer sk_case_..." \
  -H "Content-Type: application/json" \
  -d "{
    \"audio_url\": \"$DOWNLOAD_URL\",
    \"speaker_labels\": true,
    \"language_code\": \"en\"
  }"
For audio files that take longer to process, generate a presigned URL with extended expiry:
# Generate 24-hour presigned URL
PRESIGNED_RESPONSE=$(curl -s -X POST https://api.case.dev/vault/$VAULT_ID/objects/$OBJECT_ID/presigned-url \
  -H "Authorization: Bearer sk_case_..." \
  -H "Content-Type: application/json" \
  -d '{"operation": "GET", "expiresIn": 86400}')
AUDIO_URL=$(echo "$PRESIGNED_RESPONSE" | jq -r '.presignedUrl')

# Submit for transcription
curl -X POST https://api.case.dev/voice/transcription \
  -H "Authorization: Bearer sk_case_..." \
  -H "Content-Type: application/json" \
  -d "{\"audio_url\": \"$AUDIO_URL\", \"speaker_labels\": true}"
#!/bin/bash
API_KEY="sk_case_your_api_key_here"
VAULT_ID="sytp1b5f5j1yuj7uffzzxgw6"
AUDIO_FILE="deposition-recording.m4a"

# Step 1: Upload audio to vault
UPLOAD_RESPONSE=$(curl -s -X POST https://api.case.dev/vault/$VAULT_ID/upload \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"filename\": \"$AUDIO_FILE\",
    \"contentType\": \"audio/m4a\",
    \"metadata\": {
      \"case\": \"2024-CV-1234\",
      \"type\": \"deposition\",
      \"date\": \"2024-11-10\"
    }
  }")
OBJECT_ID=$(echo "$UPLOAD_RESPONSE" | jq -r '.objectId')
UPLOAD_URL=$(echo "$UPLOAD_RESPONSE" | jq -r '.uploadUrl')

# Upload the file
curl -X PUT "$UPLOAD_URL" \
  -H "Content-Type: audio/m4a" \
  --data-binary "@$AUDIO_FILE"
echo "✓ Audio uploaded to vault: $OBJECT_ID"

# Step 2: Get download URL
DOWNLOAD_URL=$(curl -s https://api.case.dev/vault/$VAULT_ID/objects/$OBJECT_ID \
  -H "Authorization: Bearer $API_KEY" \
  | jq -r '.downloadUrl')

# Step 3: Submit for transcription
TRANSCRIPT_RESPONSE=$(curl -s -X POST https://api.case.dev/voice/transcription \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"audio_url\": \"$DOWNLOAD_URL\",
    \"speaker_labels\": true,
    \"language_code\": \"en\",
    \"redact_pii\": true
  }")
TRANSCRIPT_ID=$(echo "$TRANSCRIPT_RESPONSE" | jq -r '.id')
echo "✓ Transcription started: $TRANSCRIPT_ID"

# Step 4: Poll for completion
while true; do
  STATUS_RESPONSE=$(curl -s https://api.case.dev/voice/transcription/$TRANSCRIPT_ID \
    -H "Authorization: Bearer $API_KEY")
  STATUS=$(echo "$STATUS_RESPONSE" | jq -r '.status')
  echo "Transcription status: $STATUS"

  if [ "$STATUS" = "completed" ]; then
    echo "✓ Transcription complete!"

    # Save transcript
    echo "$STATUS_RESPONSE" | jq -r '.text' > transcript.txt

    # Step 5: Upload transcript back to vault
    TRANSCRIPT_UPLOAD=$(curl -s -X POST https://api.case.dev/vault/$VAULT_ID/upload \
      -H "Authorization: Bearer $API_KEY" \
      -H "Content-Type: application/json" \
      -d "{
        \"filename\": \"${AUDIO_FILE%.m4a}-transcript.txt\",
        \"contentType\": \"text/plain\",
        \"metadata\": {
          \"source_audio_id\": \"$OBJECT_ID\",
          \"transcript_id\": \"$TRANSCRIPT_ID\"
        }
      }")
    TRANSCRIPT_OBJECT_ID=$(echo "$TRANSCRIPT_UPLOAD" | jq -r '.objectId')
    TRANSCRIPT_UPLOAD_URL=$(echo "$TRANSCRIPT_UPLOAD" | jq -r '.uploadUrl')

    curl -X PUT "$TRANSCRIPT_UPLOAD_URL" \
      -H "Content-Type: text/plain" \
      --data-binary "@transcript.txt"
    echo "✓ Transcript uploaded to vault: $TRANSCRIPT_OBJECT_ID"
    break
  elif [ "$STATUS" = "error" ]; then
    echo "✗ Transcription failed"
    exit 1
  fi

  sleep 10
done

echo ""
echo "=== Complete! ==="
echo "Audio in vault: $OBJECT_ID"
echo "Transcript in vault: $TRANSCRIPT_OBJECT_ID"
echo "Transcript ID: $TRANSCRIPT_ID"
No Downloads Required
The transcription service accesses audio directly from S3
Eliminates local file handling

Secure
Presigned URLs expire automatically
Audio files stay encrypted in the vault

Integrated Workflow
Upload → Transcribe → Store, all in one platform
Keep audio and transcripts together

Cost Effective
Avoid S3 egress charges
Pay only for transcription time

Real-time speech-to-text for live audio streams via WebSocket. Get transcripts as you speak with ultra-low latency.
wss://casemark-ai--websocket-stream-helper-fastapi-app.modal.run/ws?token=sk_case_your_api_key_here
Ultra-Fast Transcription
300ms P50 latency on word emission
91% word accuracy rate
Intelligent endpointing for turn detection

Pricing
$0.30 per minute ($18.00 per hour)
Same rate as async transcription
Unlimited concurrent streams
No setup fees or minimums

Use Cases
Live deposition transcription with real-time captions
Phone call transcription as conversations happen
Court proceeding transcription with live display
Voice agent applications with immediate feedback

Connect via WebSocket with your API key in the query string:
const token = 'sk_case_your_api_key_here';
const ws = new WebSocket(`wss://casemark-ai--websocket-stream-helper-fastapi-app.modal.run/ws?token=${token}`);

ws.onopen = () => {
  console.log('Connected to streaming transcription');
};

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log('Transcript:', data);
};

ws.onerror = (error) => {
  console.error('WebSocket error:', error);
};

ws.onclose = (event) => {
  console.log('Connection closed:', event.code, event.reason);
};
Pass your API key as a query parameter:
Parameter: token
Format: ?token=sk_case_your_api_key_here
Required: Yes

The WebSocket connection will be rejected if:
No token is provided
The token is invalid or expired
The API key doesn't have voice, transcription, or streaming permissions

Required format:
Encoding: PCM signed 16-bit little-endian
Sample rate: 16,000 Hz (16kHz)
Channels: Mono (1 channel)

Send raw audio bytes as binary WebSocket messages:
// Example: Send audio from the microphone
// Note: MediaRecorder typically emits compressed audio (e.g. WebM/Opus),
// so raw PCM must be produced separately -- see the AudioWorklet-based
// browser example below. convertToPCM16 here is a placeholder.
navigator.mediaDevices.getUserMedia({ audio: true })
  .then(stream => {
    const mediaRecorder = new MediaRecorder(stream);
    const audioContext = new AudioContext({ sampleRate: 16000 });

    mediaRecorder.ondataavailable = async (event) => {
      const audioData = await event.data.arrayBuffer();

      // Convert to PCM 16-bit if needed (placeholder)
      const pcmData = convertToPCM16(audioData);

      // Send to WebSocket
      if (ws.readyState === WebSocket.OPEN) {
        ws.send(pcmData);
      }
    };

    mediaRecorder.start(100); // Emit data every 100ms
  });
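The convertToPCM16 call above is left undefined; one plausible implementation, assuming you have Float32 samples in [-1, 1] (the native Web Audio sample format) rather than MediaRecorder's compressed output:

```javascript
// Sketch: convert Float32 samples in [-1, 1] to 16-bit signed
// little-endian PCM, the format the streaming endpoint requires.
function convertToPCM16(float32Samples) {
  const buffer = new ArrayBuffer(float32Samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < float32Samples.length; i++) {
    const s = Math.max(-1, Math.min(1, float32Samples[i])); // clamp
    // Scale negatives by 0x8000 and positives by 0x7FFF to use the
    // full asymmetric int16 range; write little-endian (final `true`)
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```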
Recommended: 100-250ms chunks
Minimum: 50ms
Maximum: 1000ms

Smaller chunks = lower latency, but more overhead.
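Chunk duration maps directly to bytes at the required format (16,000 samples/s × 2 bytes per sample, mono); a small helper (the name is ours, not part of the API):

```javascript
// Sketch: bytes per chunk at 16 kHz, 16-bit (2-byte) mono PCM.
function chunkBytes(durationMs) {
  return Math.round(16000 * 2 * (durationMs / 1000));
}
```

A 100 ms chunk is 3200 bytes, which matches the chunk size used in the file-streaming Node.js example.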
The WebSocket will send JSON messages with different types:
{
  "type": "session_begins",
  "session_id": "abc123",
  "message": "Streaming session started"
}
{
  "message_type": "PartialTranscript",
  "text": "Hello, this is a test",
  "audio_start": 0,
  "audio_end": 2000,
  "confidence": 0.95,
  "words": [
    {
      "text": "Hello",
      "start": 0,
      "end": 400,
      "confidence": 0.98
    },
    {
      "text": "this",
      "start": 400,
      "end": 600,
      "confidence": 0.97
    }
  ]
}
Partial transcripts are interim results that may change as more audio is processed.
{
  "message_type": "FinalTranscript",
  "text": "Hello, this is a test.",
  "audio_start": 0,
  "audio_end": 2500,
  "confidence": 0.96,
  "punctuated": true,
  "words": [
    {
      "text": "Hello",
      "start": 0,
      "end": 400,
      "confidence": 0.98
    },
    {
      "text": "this",
      "start": 400,
      "end": 600,
      "confidence": 0.97
    },
    {
      "text": "is",
      "start": 600,
      "end": 750,
      "confidence": 0.96
    },
    {
      "text": "a",
      "start": 750,
      "end": 850,
      "confidence": 0.95
    },
    {
      "text": "test",
      "start": 850,
      "end": 2500,
      "confidence": 0.97
    }
  ]
}
Final transcripts are immutable and won't change. Use these for official records.
{
  "message_type": "SessionTerminated"
}
Sent when the session ends (either by client or server).
Send a JSON message to end the session cleanly:
ws.send(JSON.stringify({ terminate: true }));

// Wait a moment for final transcripts
setTimeout(() => {
  ws.close();
}, 1000);
Sessions automatically end after:
5 minutes of silence (no audio data received)
Connection errors
Client disconnect
const WebSocket = require('ws');
const fs = require('fs');

const token = 'sk_case_your_api_key_here';
const ws = new WebSocket(`wss://casemark-ai--websocket-stream-helper-fastapi-app.modal.run/ws?token=${token}`);

ws.on('open', () => {
  console.log('✓ Connected to streaming transcription');

  // Read audio file (16kHz, PCM 16-bit, mono)
  const audioFile = fs.readFileSync('./audio.raw');

  // Send in chunks (100ms at 16kHz = 3200 bytes)
  const chunkSize = 3200;
  let offset = 0;

  const sendChunk = () => {
    if (offset < audioFile.length) {
      const chunk = audioFile.slice(offset, offset + chunkSize);
      ws.send(chunk);
      offset += chunkSize;
      setTimeout(sendChunk, 100); // Send every 100ms
    } else {
      // End of audio
      console.log('✓ Audio sent, waiting for final transcripts...');
      ws.send(JSON.stringify({ terminate: true }));
      setTimeout(() => ws.close(), 2000);
    }
  };

  sendChunk();
});

ws.on('message', (data) => {
  const message = JSON.parse(data.toString());

  if (message.message_type === 'FinalTranscript') {
    console.log('Final:', message.text);
  } else if (message.message_type === 'PartialTranscript') {
    console.log('Partial:', message.text);
  } else {
    console.log('Message:', message);
  }
});

ws.on('error', (error) => {
  console.error('WebSocket error:', error);
});

ws.on('close', (code, reason) => {
  console.log(`Connection closed: ${code} ${reason}`);
});
// Get microphone access
const stream = await navigator.mediaDevices.getUserMedia({
  audio: {
    sampleRate: 16000,
    channelCount: 1,
    echoCancellation: true,
    noiseSuppression: true,
  }
});

const token = 'sk_case_your_api_key_here';
const ws = new WebSocket(`wss://casemark-ai--websocket-stream-helper-fastapi-app.modal.run/ws?token=${token}`);

// Set up AudioWorklet for PCM processing
const audioContext = new AudioContext({ sampleRate: 16000 });
const source = audioContext.createMediaStreamSource(stream);

await audioContext.audioWorklet.addModule('/audio-processor.js');
const processor = new AudioWorkletNode(audioContext, 'pcm-processor');

processor.port.onmessage = (event) => {
  // Send PCM data to WebSocket
  if (ws.readyState === WebSocket.OPEN) {
    ws.send(event.data);
  }
};

source.connect(processor);
processor.connect(audioContext.destination);

// Handle transcripts
ws.onmessage = (event) => {
  const data = JSON.parse(event.data);

  if (data.message_type === 'FinalTranscript') {
    document.getElementById('transcript').textContent += data.text + ' ';
  } else if (data.message_type === 'PartialTranscript') {
    document.getElementById('interim').textContent = data.text;
  }
};

// Stop recording
document.getElementById('stop').onclick = () => {
  ws.send(JSON.stringify({ terminate: true }));
  stream.getTracks().forEach(track => track.stop());
  audioContext.close();
  ws.close();
};
The WebSocket may send error messages:
{
  "error": "Authentication failed",
  "code": "AUTH_INVALID"
}
Common error codes:
AUTH_INVALID - Invalid or missing API key
PERMISSION_DENIED - API key lacks streaming permissions
SERVICE_UNAVAILABLE - Streaming service temporarily down
UPSTREAM_ERROR - AssemblyAI service error
PROCESSING_ERROR - Failed to process audio data
SESSION_NOT_FOUND - No active session

WebSocket close codes indicate why the connection ended:

Code  Reason               Description
1000  Normal closure       Session ended normally
1008  Policy violation     Authentication failed or insufficient permissions
1011  Internal error       Server error (temporary)
4000  Bad audio format     Audio doesn't meet requirements
4001  Rate limit exceeded  Too many concurrent connections
Optimize for accuracy:
Use noise cancellation when capturing microphone input
Minimize background noise
Use high-quality microphones for depositions
Test with your specific audio setup

Minimize end-to-end latency:
Send smaller chunks (100ms) for real-time display
Use a wired internet connection (not WiFi when possible)
Host your application close to your users
Process partial transcripts for immediate feedback

Handle transient failures:
let reconnectAttempts = 0;
const maxReconnects = 3;

function connect() {
  const ws = new WebSocket(`wss://casemark-ai--websocket-stream-helper-fastapi-app.modal.run/ws?token=${token}`);

  ws.onclose = (event) => {
    if (event.code !== 1000 && reconnectAttempts < maxReconnects) {
      reconnectAttempts++;
      console.log(`Reconnecting... (${reconnectAttempts}/${maxReconnects})`);
      setTimeout(connect, 1000 * reconnectAttempts);
    }
  };

  ws.onopen = () => {
    reconnectAttempts = 0; // Reset on successful connection
  };

  return ws;
}
Monitor your usage:
Track connection duration to estimate costs
Implement automatic disconnection after inactivity
Set session time limits for budget control
Monitor concurrent connections
const startTime = Date.now();

ws.onclose = () => {
  const durationSeconds = (Date.now() - startTime) / 1000;
  const cost = (durationSeconds / 60) * 0.30; // $0.30 per minute
  console.log(`Session duration: ${durationSeconds}s, Est. cost: $${cost.toFixed(2)}`);
};
Feature          Async Transcription   Streaming Transcription
Latency          Minutes               300ms
Protocol         HTTP REST             WebSocket
Use Case         Pre-recorded files    Live audio
Pricing          $0.30/minute          $0.30/minute
Input            Audio URL             Raw audio stream
Output           Complete transcript   Progressive transcripts
Speaker Labels   ✓ Yes                 Coming soon
Auto Highlights  ✓ Yes                 ✗ No
Content Safety   ✓ Yes                 ✗ No
When to use async:
Transcribing pre-recorded depositions
Batch processing of multiple files
Need for speaker diarization or advanced features

When to use streaming:
Live courtroom transcription
Real-time phone call transcription
Voice assistant applications
Interactive voice agents