POST /voice/v1/speak/stream
Stream text-to-speech audio
curl --request POST \
  --url https://api.case.dev/voice/v1/speak/stream \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "text": "<string>",
  "voice_id": "EXAVITQu4vr4xnSDxMaL",
  "model_id": "eleven_multilingual_v2",
  "voice_settings": {
    "stability": 0.5,
    "similarity_boost": 0.5,
    "style": 0.5,
    "use_speaker_boost": true
  },
  "language_code": "<string>",
  "output_format": "mp3_44100_128",
  "optimize_streaming_latency": 2,
  "seed": 123,
  "previous_text": "<string>",
  "next_text": "<string>",
  "apply_text_normalization": true,
  "enable_logging": true
}
'
"<string>"

Authorizations

Authorization
string
header
required

API key starting with sk_case_, sent as a Bearer token in the Authorization header

Body

application/json
text
string
required

Text to convert to speech

Maximum string length: 5000
voice_id
string
default: EXAVITQu4vr4xnSDxMaL

ElevenLabs voice ID (defaults to Rachel for professional clarity)

model_id
enum<string>
default: eleven_multilingual_v2

TTS model to use

Available options: eleven_monolingual_v1, eleven_multilingual_v1, eleven_multilingual_v2, eleven_turbo_v2
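voice_settings
object

Voice tuning settings. The request example above sets stability, similarity_boost, style, and use_speaker_boost.
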
language_code
string

Language code (e.g., 'en', 'es', 'fr')

output_format
enum<string>
default: mp3_44100_128

Audio output format

Available options: mp3_44100_128, mp3_22050_32, pcm_16000, pcm_22050, pcm_24000, pcm_44100
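
The option names appear to encode a codec, a sample rate in Hz and, for MP3, a bitrate in kbps. That reading is an assumption based on the naming pattern, not something this page states. Under that assumption, a minimal sketch for splitting a format string into its parts (the `AudioFormat` type and `parse_output_format` helper are hypothetical names, not part of the API):

```python
from typing import NamedTuple, Optional

# Values listed under "Available options" for output_format.
SUPPORTED_FORMATS = (
    "mp3_44100_128",
    "mp3_22050_32",
    "pcm_16000",
    "pcm_22050",
    "pcm_24000",
    "pcm_44100",
)


class AudioFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None when the name carries no third field


def parse_output_format(value: str) -> AudioFormat:
    """Split an output_format value into codec, sample rate and bitrate.

    Assumes the pattern <codec>_<sample_rate>[_<bitrate>], inferred from
    the option names only.
    """
    if value not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output_format: {value!r}")
    parts = value.split("_")
    bitrate = int(parts[2]) if len(parts) > 2 else None
    return AudioFormat(parts[0], int(parts[1]), bitrate)
```

A player that consumes one of the pcm_* formats needs the sample rate up front, which is the main reason to parse the name rather than pass it through untouched.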
optimize_streaming_latency
integer

Optimize for streaming latency (0-4)

Required range: 0 <= x <= 4
seed
integer

Random seed for reproducible generation

previous_text
string

Text that comes immediately before `text`, supplied as context

next_text
string

Text that comes immediately after `text`, supplied as context
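
Because `text` is capped at 5000 characters, longer input has to be sent as several requests, and `previous_text` and `next_text` are the natural place to pass the neighbouring pieces. The sketch below shows one way to do that. The helper names are hypothetical, splitting on spaces is a simplification, and it assumes the context fields accept a full neighbouring chunk (this page gives no length limit for them).

```python
from typing import Dict, List

MAX_TEXT_LENGTH = 5000  # "Maximum string length" documented for text


def split_text(text: str, limit: int = MAX_TEXT_LENGTH) -> List[str]:
    """Split text into chunks of at most `limit` characters.

    Breaks at the last space that fits; falls back to a hard cut when a
    single word is longer than the limit.
    """
    chunks: List[str] = []
    remaining = text.strip()
    while len(remaining) > limit:
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks


def build_chunk_payloads(text: str, limit: int = MAX_TEXT_LENGTH) -> List[Dict[str, str]]:
    """Build one request body per chunk, with neighbouring chunks as context."""
    chunks = split_text(text, limit)
    payloads: List[Dict[str, str]] = []
    for i, chunk in enumerate(chunks):
        payload = {"text": chunk}
        if i > 0:
            payload["previous_text"] = chunks[i - 1]
        if i + 1 < len(chunks):
            payload["next_text"] = chunks[i + 1]
        payloads.append(payload)
    return payloads
```

Each dictionary can be merged with the other body fields and sent as its own request; the resulting audio segments then need to be concatenated in order by the caller.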

apply_text_normalization
boolean
default: true

Apply text normalization

enable_logging
boolean
default: true

Enable request logging
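
The constraints listed for the body fields can be checked on the client before a request is sent. The following is a minimal sketch of such a check; `build_payload` is a hypothetical helper, and it only enforces what this page documents (the required `text`, its 5000-character limit, the two enums, and the 0 to 4 latency range).

```python
from typing import Any, Dict

MAX_TEXT_LENGTH = 5000
MODEL_IDS = {
    "eleven_monolingual_v1",
    "eleven_multilingual_v1",
    "eleven_multilingual_v2",
    "eleven_turbo_v2",
}
OUTPUT_FORMATS = {
    "mp3_44100_128",
    "mp3_22050_32",
    "pcm_16000",
    "pcm_22050",
    "pcm_24000",
    "pcm_44100",
}


def build_payload(text: str, **options: Any) -> Dict[str, Any]:
    """Validate the documented constraints and return a JSON-ready body.

    Options left unset (or set to None) are omitted so that the server
    applies its documented defaults.
    """
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError(f"text exceeds {MAX_TEXT_LENGTH} characters")

    model_id = options.get("model_id")
    if model_id is not None and model_id not in MODEL_IDS:
        raise ValueError(f"unknown model_id: {model_id!r}")

    output_format = options.get("output_format")
    if output_format is not None and output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output_format: {output_format!r}")

    latency = options.get("optimize_streaming_latency")
    if latency is not None and not (isinstance(latency, int) and 0 <= latency <= 4):
        raise ValueError("optimize_streaming_latency must be an integer from 0 to 4")

    payload: Dict[str, Any] = {"text": text}
    payload.update({key: value for key, value in options.items() if value is not None})
    return payload
```

For example, `build_payload("Hello", output_format="pcm_16000", seed=123)` returns a dictionary with just those three keys, ready for `json.dumps`.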

Response

Audio stream successfully generated

Audio stream in the requested output_format (MP3 with the default mp3_44100_128)
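
Since the response body is raw audio rather than JSON, a client would typically write it to disk or a player as the bytes arrive. Below is a minimal sketch using only the Python standard library. The URL and headers are taken from the curl example; the function names, the 8192-byte chunk size, and the absence of error handling and retries are choices made for illustration.

```python
import json
import urllib.request
from typing import Any, BinaryIO, Dict

API_URL = "https://api.case.dev/voice/v1/speak/stream"


def build_request(api_key: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Prepare the POST request shown in the curl example, without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def copy_stream(source: BinaryIO, destination: BinaryIO, chunk_size: int = 8192) -> int:
    """Copy audio bytes chunk by chunk and return the number of bytes written."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        destination.write(chunk)
        total += len(chunk)
    return total


def stream_to_file(api_key: str, payload: Dict[str, Any], path: str) -> int:
    """Send the request and write the audio to `path` as it arrives."""
    request = build_request(api_key, payload)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        return copy_stream(response, out)
```

A call such as `stream_to_file(api_key, {"text": "Hello"}, "hello.mp3")` would save the default MP3 output; `urllib.request.urlopen` raises `urllib.error.HTTPError` on a non-2xx status, which a real client would want to catch.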