Model Selection

Choosing the right model for your use case

Choosing the Right Model

For Legal Document Analysis

Best: anthropic/claude-sonnet-4.5 - Excellent at long documents, nuanced reasoning
Fast: anthropic/claude-4.5-haiku - Quick summaries, fast turnaround
Deep Analysis: anthropic/claude-opus-4.1 - Most thorough, best for complex cases

For Coding/Structured Output

Best: openai/gpt-5-codex - Optimized for code generation
Fast: deepseek/deepseek-v3.1 - Great reasoning, very affordable

For Cost-Sensitive Tasks

Ultra-cheap: alibaba/qwen-3-32b - $0.0001/1M input tokens
Balanced: deepseek/deepseek-v3.1 - $0.0003/1M tokens (excellent quality)
Fast & Cheap: xai/grok-4-fast-non-reasoning - $0.0002/1M tokens

For Medical Records

Best: anthropic/claude-opus-4.1 - Understands medical terminology deeply
Vision: anthropic/claude-sonnet-4.5 - Can read charts, scans, handwriting
Budget: google/gemini-2.5-flash - Good reasoning at lower cost

For Reasoning Tasks

Best: openai/gpt-5-pro - Deepest thinking, multi-hour reasoning
Fast: openai/o4-mini - Quick reasoning for math/logic
Open: deepseek/deepseek-r1 - Transparent reasoning traces

Understanding Response Metadata

Provider Metadata

The API routes requests through multiple providers for reliability:

{
  "provider_metadata": {
    "gateway": {
      "routing": {
        "originalModelId": "anthropic/claude-4.5-haiku",
        "resolvedProvider": "anthropic",
        "fallbacksAvailable": ["vertexAnthropic", "bedrock"],
        "finalProvider": "anthropic",
        "attempts": [
          {
            "provider": "anthropic",
            "credentialType": "byok",
            "success": true,
            "statusCode": 200
          }
        ]
      },
      "cost": "0.0000336",
      "marketCost": "0.0000336"
    }
  }
}

originalModelId: The model you requested
resolvedProvider: Which provider fulfilled the request
fallbacksAvailable: Backup providers if primary fails
credentialType: byok (bring your own key) or system (CaseMark's keys)
attempts: Shows retry history if there were failures

Finish Reasons

stop: Model naturally completed the response
length: Hit max_tokens limit (response may be incomplete)
content_filter: Response blocked by safety filters
tool_calls: Model wants to call a function (for tool use)

Model Selection Cheat Sheet

Task	Recommended Model	Why
Quick Q&A	`anthropic/claude-4.5-haiku`	Fast, cheap, smart enough
Deep legal analysis	`anthropic/claude-opus-4.1`	Best reasoning for complex cases
Medical records	`anthropic/claude-sonnet-4.5` + vision	Can read charts, handwriting
Contract review	`openai/gpt-5`	Excellent at structured extraction
Deposition summary	`anthropic/claude-sonnet-4.5`	Great at long docs, nuance
Cost-sensitive	`deepseek/deepseek-v3.1`	10x cheaper, still very good
Coding/automation	`openai/gpt-5-codex`	Optimized for code
Multi-hour reasoning	`openai/gpt-5-pro`	Can think for hours on hard problems
Embeddings (general)	`openai/text-embedding-3-small`	Fast, cheap, good quality
Embeddings (legal)	`voyage/voyage-law-2`	Optimized for legal text

LLMs

Large language model integrations - Access top AI models through one unified API.

API Reference

On This Page

Choosing the Right Model
Understanding Response Metadata
- Provider Metadata
- Finish Reasons
Model Selection Cheat Sheet