Get a complete list of all available AI models with pricing, capabilities, and specifications.
curl -X GET https://api.case.dev/llm/v1/models \
-H "Authorization: Bearer sk_case_your_api_key_here" \
-H "Content-Type: application/json"
{
  "object": "list",
  "data": [
    {
      "id": "anthropic/claude-sonnet-4.5",
      "object": "model",
      "created": 1755815280,
      "owned_by": "anthropic",
      "name": "Claude Sonnet 4.5",
      "description": "Claude Sonnet 4.5 is the newest model...",
      "context_window": 200000,
      "max_tokens": 64000,
      "type": "language",
      "tags": ["file-input", "reasoning", "tool-use", "vision"],
      "pricing": {
        "input": "0.000003",
        "output": "0.000015",
        "input_cache_read": "0.0000003",
        "input_cache_write": "0.00000375"
      }
    }
    // ... 130+ more models
  ]
}
Response fields:

- id: Model identifier to use in chat completions (e.g., anthropic/claude-sonnet-4.5)
- name: Human-readable model name
- description: What the model is good at
- context_window: Maximum input tokens the model can handle
- max_tokens: Maximum output tokens per request
- type: language for chat models, embedding for embedding models
- tags: Capabilities like vision, tool-use, reasoning, file-input
- pricing: Cost per token in USD

Pricing fields:

- input: Cost per input token
- output: Cost per output token
- input_cache_read: Cost for reading from cache (if supported)
- input_cache_write: Cost for writing to cache (if supported)

Common uses:

- Choosing the right model: Compare capabilities and pricing
- Cost estimation: Calculate expected costs for your use case
- Feature discovery: Find models with specific capabilities (vision, tool use, etc.)

Send messages to AI models and get intelligent responses. This is the main endpoint for conversational AI.
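As a sketch of feature discovery, the Python below filters a /llm/v1/models-shaped response for models carrying a given capability tag. The sample data is hand-written to match the documented response shape, not live API output, and `models_with_tag` is a hypothetical helper, not part of any SDK:

```python
# Filter a /llm/v1/models response for models with a given capability tag.
# `models_response` is a hand-written sample shaped like the documented
# response above, not a live API call.

models_response = {
    "object": "list",
    "data": [
        {"id": "anthropic/claude-sonnet-4.5", "type": "language",
         "tags": ["file-input", "reasoning", "tool-use", "vision"]},
        {"id": "openai/text-embedding-3-small", "type": "embedding",
         "tags": []},
    ],
}

def models_with_tag(response: dict, tag: str) -> list:
    """Return IDs of language models that advertise the given capability tag."""
    return [
        m["id"]
        for m in response["data"]
        if m.get("type") == "language" and tag in m.get("tags", [])
    ]

print(models_with_tag(models_response, "vision"))
# ['anthropic/claude-sonnet-4.5']
```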
POST /llm/v1/chat/completions
curl -X POST https://api.case.dev/llm/v1/chat/completions \
-H "Authorization: Bearer sk_case_your_api_key_here" \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-4.5-haiku",
"messages": [
{
"role": "user",
"content": "Summarize this deposition in 3 bullet points"
}
],
"max_tokens": 500
}'
{
  "id": "gen_01K972J7KV4Y0MJZ3SRTA6YYMH",
  "object": "chat.completion",
  "created": 1762247909,
  "model": "anthropic/claude-4.5-haiku",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here are the key points:\n\n• Witness testified about...\n• Documents reviewed include...\n• Timeline established from...",
        "provider_metadata": {
          "anthropic": {
            "usage": {
              "input_tokens": 245,
              "output_tokens": 87,
              "cache_creation_input_tokens": 0,
              "cache_read_input_tokens": 0
            }
          }
        }
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 245,
    "completion_tokens": 87,
    "total_tokens": 332,
    "cost": 0.000105,
    "market_cost": 0.000105,
    "is_byok": true
  }
}
Required:

- model (string): Model ID from /llm/v1/models (e.g., anthropic/claude-sonnet-4.5)
- messages (array): Conversation history
  - role: user, assistant, or system
  - content: Message text or multimodal content

Optional:

- max_tokens (number): Maximum tokens to generate (default: 4096)
- temperature (number): Randomness, 0-2 (default: 1)
- top_p (number): Nucleus sampling, 0-1 (default: 1)
- stream (boolean): Stream responses token-by-token (default: false)
- stop (array): Stop sequences to end generation
- presence_penalty (number): Penalize repeated topics, -2 to 2
- frequency_penalty (number): Penalize repeated tokens, -2 to 2

Build context by including previous messages:
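To make those parameters and defaults concrete, here is a small Python helper that assembles a request body for POST /llm/v1/chat/completions. `build_chat_request` is a hypothetical convenience function for illustration, not part of any SDK; it only constructs the JSON payload and performs no network call:

```python
# Build a request body for POST /llm/v1/chat/completions.
# `build_chat_request` is a hypothetical helper for illustration; the field
# names and defaults mirror the parameter list documented above.

def build_chat_request(model, messages, max_tokens=4096,
                       temperature=1.0, stream=False):
    allowed_roles = {"user", "assistant", "system"}
    for m in messages:
        if m["role"] not in allowed_roles:
            raise ValueError("role must be user, assistant, or system")
    return {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": stream,
    }

body = build_chat_request(
    "anthropic/claude-4.5-haiku",
    [{"role": "user", "content": "Summarize this deposition in 3 bullet points"}],
    max_tokens=500,
)
print(body["max_tokens"])  # 500
```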
{
  "model": "openai/gpt-5",
  "messages": [
    {"role": "user", "content": "What is a deposition?"},
    {"role": "assistant", "content": "A deposition is sworn testimony..."},
    {"role": "user", "content": "How long do they typically last?"}
  ],
  "max_tokens": 200
}
Set behavior and context with system messages:
{
  "model": "anthropic/claude-sonnet-4.5",
  "messages": [
    {
      "role": "system",
      "content": "You are a legal assistant specializing in medical malpractice cases. Be concise and cite relevant case law when possible."
    },
    {
      "role": "user",
      "content": "Review this medical record and identify potential issues of negligence."
    }
  ]
}
Send images along with text (works with models tagged with vision):
{
  "model": "anthropic/claude-sonnet-4.5",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What medical equipment is visible in this photo?"},
        {
          "type": "image_url",
          "image_url": {"url": "https://example.com/hospital-room.jpg"}
        }
      ]
    }
  ]
}
Get responses token-by-token as they're generated:
curl -X POST https://api.case.dev/llm/v1/chat/completions \
-H "Authorization: Bearer sk_case_your_api_key_here" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-5-mini",
"messages": [{"role": "user", "content": "Write a detailed case summary"}],
"stream": true
}' \
--no-buffer
Response format (Server-Sent Events):
data: {"id":"gen_123","choices":[{"delta":{"content":"The"}}]}
data: {"id":"gen_123","choices":[{"delta":{"content":" case"}}]}
data: {"id":"gen_123","choices":[{"delta":{"content":" involves"}}]}
data: [DONE]
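A minimal Python sketch of consuming that event stream: it parses `data:` lines already received (the sample lines are copied from the example above, not fetched live) and reassembles the full response text:

```python
import json

# Parse Server-Sent Events lines from a streaming chat completion and
# reassemble the full response text. The sample lines mirror the example
# stream shown above; a real client would read them from the HTTP response.

sse_lines = [
    'data: {"id":"gen_123","choices":[{"delta":{"content":"The"}}]}',
    'data: {"id":"gen_123","choices":[{"delta":{"content":" case"}}]}',
    'data: {"id":"gen_123","choices":[{"delta":{"content":" involves"}}]}',
    "data: [DONE]",
]

def assemble_stream(lines):
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip comments and blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

print(assemble_stream(sse_lines))  # The case involves
```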
Document Analysis:
{
  "model": "anthropic/claude-sonnet-4.5",
  "messages": [
    {
      "role": "system",
      "content": "Extract key facts, dates, and parties from legal documents."
    },
    {
      "role": "user",
      "content": "Document text here..."
    }
  ],
  "max_tokens": 2000
}
Deposition Summarization:
{
  "model": "openai/gpt-5",
  "messages": [
    {
      "role": "user",
      "content": "Summarize this 300-page deposition transcript, focusing on admissions and inconsistencies:\n\n[transcript text]"
    }
  ],
  "max_tokens": 4000
}
Medical Record Review:
{
  "model": "anthropic/claude-opus-4.1",
  "messages": [
    {
      "role": "system",
      "content": "You are a medical-legal expert. Identify standard-of-care deviations and temporal inconsistencies."
    },
    {
      "role": "user",
      "content": "[medical records text]"
    }
  ],
  "max_tokens": 5000
}
Monitor costs in real-time using the usage object in responses:
{
  "usage": {
    "prompt_tokens": 1245,
    "completion_tokens": 387,
    "total_tokens": 1632,
    "cost": 0.004896,        // Your actual cost
    "market_cost": 0.004896, // Market rate cost
    "is_byok": true          // Using your own API keys
  }
}
Cost calculation:

cost = (input_tokens × input_price) + (output_tokens × output_price)

Prices are per token (see /llm/v1/models for rates).

Convert text into numerical vectors for semantic search, clustering, and similarity comparisons.
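As a sketch, here is that formula applied in Python using the Claude Sonnet 4.5 per-token prices from the models listing earlier on this page (check the live /llm/v1/models endpoint for current rates):

```python
# Estimate request cost in USD from token counts and per-token prices.
# The prices below are the Claude Sonnet 4.5 rates from the /llm/v1/models
# example earlier on this page; fetch the live endpoint for current rates.

INPUT_PRICE = 0.000003   # USD per input token
OUTPUT_PRICE = 0.000015  # USD per output token

def estimate_cost(input_tokens, output_tokens,
                  input_price=INPUT_PRICE, output_price=OUTPUT_PRICE):
    return input_tokens * input_price + output_tokens * output_price

# 245 input + 87 output tokens, as in the chat completion example above:
print(round(estimate_cost(245, 87), 6))  # 0.00204
```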
curl -X POST https://api.case.dev/llm/v1/embeddings \
-H "Authorization: Bearer sk_case_your_api_key_here" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/text-embedding-3-small",
"input": "Plaintiff alleges negligence in post-operative care"
}'
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        -0.016882256,
        0.02250519,
        -0.011252595
        // ... 1536 dimensions total
      ]
    }
  ],
  "model": "openai/text-embedding-3-small",
  "usage": {
    "prompt_tokens": 12,
    "total_tokens": 12
  }
}
| Model | Dimensions | Use Case | Cost per 1K tokens |
|---|---|---|---|
| openai/text-embedding-3-small | 1536 | General purpose, fast | $0.00002 |
| openai/text-embedding-3-large | 3072 | Higher quality | $0.00013 |
| voyage/voyage-law-2 | 1024 | Legal documents (optimized) | $0.00012 |
| voyage/voyage-3.5 | 1536 | General purpose | $0.00006 |
| cohere/embed-v4.0 | 1024 | Multilingual | $0.00012 |
Embed multiple texts in one request (more efficient):
{
  "model": "openai/text-embedding-3-small",
  "input": [
    "Medical record from January 2024",
    "Deposition transcript page 45",
    "Expert witness report summary"
  ]
}
Response:
{
  "data": [
    {"index": 0, "embedding": [...]},
    {"index": 1, "embedding": [...]},
    {"index": 2, "embedding": [...]}
  ]
}
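The index field ties each embedding back to its position in the input array. A small Python sketch of pairing inputs with their vectors (the two-element vectors below are made-up placeholders standing in for real embedding output):

```python
# Pair batch inputs with their embeddings using the `index` field.
# The embedding vectors below are made-up placeholders, not real model
# output; they are deliberately listed out of order to show the sort.

inputs = [
    "Medical record from January 2024",
    "Deposition transcript page 45",
    "Expert witness report summary",
]
response_data = [
    {"index": 2, "embedding": [0.3, 0.3]},
    {"index": 0, "embedding": [0.1, 0.1]},
    {"index": 1, "embedding": [0.2, 0.2]},
]

def pair_embeddings(texts, data):
    """Return {input text: embedding}, robust to out-of-order response items."""
    return {texts[item["index"]]: item["embedding"]
            for item in sorted(data, key=lambda d: d["index"])}

paired = pair_embeddings(inputs, response_data)
print(paired["Deposition transcript page 45"])  # [0.2, 0.2]
```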
Semantic Document Search:

1. Embed all your case documents
2. Store embeddings in a vector database
3. When searching, embed the query
4. Find documents with similar embeddings

Case Clustering:

- Group similar cases by embedding case summaries
- Find patterns across depositions
- Identify related medical incidents

Document Similarity:
# Embed two documents
curl -X POST https://api.case.dev/llm/v1/embeddings \
-H "Authorization: Bearer sk_case_your_api_key_here" \
-H "Content-Type: application/json" \
-d '{
"model": "voyage/voyage-law-2",
"input": [
"Plaintiff expert testimony regarding standard of care",
"Defense expert rebuttal on treatment protocols"
]
}'
# Calculate cosine similarity between the two embeddings
# Similarity close to 1 = very similar, close to 0 = different
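A minimal Python sketch of that cosine-similarity step. The two short vectors are placeholders standing in for real 1024-dimensional voyage/voyage-law-2 embeddings:

```python
import math

# Cosine similarity between two embedding vectors: the dot product divided
# by the product of their magnitudes. Values near 1 mean very similar,
# near 0 mean unrelated. The vectors below are tiny placeholders; real
# voyage/voyage-law-2 embeddings have 1024 dimensions.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

plaintiff_vec = [0.2, 0.7, 0.1]   # placeholder for the testimony embedding
defense_vec = [0.25, 0.6, 0.15]   # placeholder for the rebuttal embedding

print(round(cosine_similarity(plaintiff_vec, defense_vec), 3))
```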
Smart Contract Review:

- Embed contract clauses
- Find similar precedents
- Identify unusual terms by comparing to standard embeddings