Access 40+ AI models from leading providers through a single API. All models support the OpenAI-compatible chat completions format.
Live Model List: For the complete, up-to-date list of models with current pricing, call the List Models endpoint or visit the Console.

Anthropic Claude

Model                          Context   Max Output   Input ($/M)   Output ($/M)
anthropic/claude-opus-4.5      200K      32K          $15.00        $75.00
anthropic/claude-sonnet-4.5    200K      16K          $3.00         $15.00
anthropic/claude-haiku-3.5     200K      8K           $0.80         $4.00

OpenAI GPT

Model                 Context   Max Output   Input ($/M)   Output ($/M)
openai/gpt-4o         128K      16K          $2.50         $10.00
openai/gpt-4o-mini    128K      16K          $0.15         $0.60
openai/o1             200K      100K         $15.00        $60.00
openai/o1-mini        128K      65K          $3.00         $12.00

Google Gemini

Model                      Context   Max Output   Input ($/M)   Output ($/M)
google/gemini-2.0-flash    1M        8K           $0.10         $0.40
google/gemini-1.5-pro      2M        8K           $1.25         $5.00
google/gemini-1.5-flash    1M        8K           $0.075        $0.30

xAI Grok

Model                Context   Max Output   Input ($/M)   Output ($/M)
xai/grok-2           128K      8K           $2.00         $10.00
xai/grok-2-vision    32K       8K           $2.00         $10.00

DeepSeek

Model                         Context   Max Output   Input ($/M)   Output ($/M)
deepseek/deepseek-chat        64K       8K           $0.14         $0.28
deepseek/deepseek-reasoner    64K       8K           $0.55         $2.19

Meta Llama

Model                  Context   Max Output   Input ($/M)   Output ($/M)
meta/llama-3.3-70b     128K      8K           $0.60         $0.60
meta/llama-3.1-405b    128K      8K           $3.00         $3.00

Mistral

Model                    Context   Max Output   Input ($/M)   Output ($/M)
mistral/mistral-large    128K      8K           $2.00         $6.00
mistral/mistral-small    32K       8K           $0.20         $0.60
mistral/codestral        32K       8K           $0.30         $0.90

All Providers

Case.dev provides access to models from:

  • Anthropic: Claude family
  • OpenAI: GPT & o1 family
  • Google: Gemini family
  • xAI: Grok family
  • Meta: Llama family
  • Mistral: Mistral family
  • DeepSeek: DeepSeek family
  • Cohere: Command family
  • Perplexity: Sonar family
  • Groq: Fast inference
  • Together: Open models
  • Fireworks: Fast inference

Pricing

All prices shown are per million tokens ($/M). Cache pricing is available for supported models.
  • Input: Cost per million input tokens
  • Output: Cost per million output tokens
  • Cache Read: Cost to read from prompt cache (typically 90% discount)
  • Cache Write: Cost to write to prompt cache
Save with Prompt Caching: For repetitive prompts (like system prompts), enable caching to reduce costs by up to 90% on subsequent requests.
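
As a worked example, the short Python sketch below estimates the dollar cost of a single request from these rates. The prices and the 90% cache-read figure come from this page; the estimate_cost helper and its parameters are illustrative, not part of any Case.dev SDK.

PRICES = {
    # model id: (input $/M tokens, output $/M tokens), from the tables above
    "anthropic/claude-sonnet-4.5": (3.00, 15.00),
    "openai/gpt-4o-mini": (0.15, 0.60),
}

CACHE_READ_DISCOUNT = 0.90  # "typically 90% discount" on cached input tokens

def estimate_cost(model, input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the dollar cost of one request at the listed rates."""
    input_rate, output_rate = PRICES[model]
    uncached = input_tokens - cached_input_tokens
    cost = uncached * input_rate / 1_000_000
    cost += cached_input_tokens * input_rate * (1 - CACHE_READ_DISCOUNT) / 1_000_000
    cost += output_tokens * output_rate / 1_000_000
    return cost

# A 10K-token prompt with an 8K-token cached system prompt and 1K tokens out:
print(f"${estimate_cost('anthropic/claude-sonnet-4.5', 10_000, 1_000, 8_000):.4f}")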

Using a Model

To use any model, pass its ID to the chat completions endpoint:
curl https://api.case.dev/llm/v1/chat/completions \
  -H "Authorization: Bearer sk_case_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.5",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
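
Because the endpoint is OpenAI-compatible, the official OpenAI Python SDK should also work when its base_url points at the gateway, as sketched below. Treat the SDK usage as an assumption based on that compatibility claim; only the endpoint URL, key format, and model ID come from this page.

from openai import OpenAI

# Point the OpenAI SDK at the Case.dev gateway. Reusing the SDK this way
# relies on the OpenAI-compatible format described above.
client = OpenAI(
    base_url="https://api.case.dev/llm/v1",
    api_key="sk_case_YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.5",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

Swapping the model string is all it takes to route the same request to a different provider.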

Programmatic Access

Get the complete model list via API:
curl https://api.case.dev/llm/v1/models \
  -H "Authorization: Bearer sk_case_YOUR_API_KEY"
The response includes all models with their current pricing, context windows, and capabilities.
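
Assuming the models endpoint mirrors OpenAI's list format (this page does not document the response schema), the same SDK client can enumerate model IDs:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.case.dev/llm/v1",
    api_key="sk_case_YOUR_API_KEY",
)

# Print every model id available through the gateway. Pricing and context
# fields, if present, would be provider extensions not shown here.
for model in client.models.list():
    print(model.id)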