Rate Limits & Usage

Case.dev uses a tiered rate limiting system inspired by OpenAI. Your rate limits increase automatically as your lifetime spend and account age grow.

How Rate Limits Work

Requests Per Minute (RPM) - Each service has its own rate limit
Per-Organization - Limits apply to your entire organization, not individual API keys
Tier-Based - Your tier determines your limits (see below)
Automatic Upgrades - Tiers increase automatically based on spend and account age

Rate Limit Headers

Every API response includes rate limit information:

x-ratelimit-limit-requests: 600
x-ratelimit-remaining-requests: 599
x-ratelimit-reset-requests: 45s
x-ratelimit-tier: tier_1

Header	Description
`x-ratelimit-limit-requests`	Maximum requests per minute for this service
`x-ratelimit-remaining-requests`	Requests remaining in current window
`x-ratelimit-reset-requests`	Time until the limit resets (for example, `45s`)
`x-ratelimit-tier`	Your current tier
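
If you want to read these headers in code, a minimal sketch might look like the following. `parseDurationMs` and `parseRateLimitHeaders` are hypothetical helper names, not part of any Case.dev SDK. The parser assumes the reset value is a duration such as `45s`; accepting forms like `1m30s` or `500ms` is an assumption about the format, not something this page confirms.

```javascript
// Hypothetical helper: convert a duration string such as "45s" into milliseconds.
// Assumes units of h, m, s, or ms; returns null for anything it cannot parse.
function parseDurationMs(value) {
  if (typeof value !== 'string' || value.trim() === '') return null;
  const text = value.trim();
  const unitMs = { h: 3600000, m: 60000, s: 1000, ms: 1 };
  const pattern = /(\d+(?:\.\d+)?)(ms|h|m|s)/g;
  let total = 0;
  let matchedLength = 0;
  for (const match of text.matchAll(pattern)) {
    total += parseFloat(match[1]) * unitMs[match[2]];
    matchedLength += match[0].length;
  }
  // Reject strings with leftover characters the pattern did not consume.
  return matchedLength === text.length ? total : null;
}

// Hypothetical helper: collect the four documented headers into one object.
// `headers` is anything with a get(name) method, such as a fetch Headers object or a Map.
function parseRateLimitHeaders(headers) {
  const toInt = (v) => (v == null ? null : Number.parseInt(v, 10));
  return {
    limit: toInt(headers.get('x-ratelimit-limit-requests')),
    remaining: toInt(headers.get('x-ratelimit-remaining-requests')),
    resetMs: parseDurationMs(headers.get('x-ratelimit-reset-requests')),
    tier: headers.get('x-ratelimit-tier') ?? null,
  };
}
```

With the example headers above, this returns a limit of 600, 599 remaining, a reset of 45000 ms, and the tier `tier_1`.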

Tier System

Your tier is determined by lifetime spend and account age:

Tier	Spend Required	Account Age	Example Limits
Free	$0	Any	100 LLM RPM, 50 OCR RPM
Tier 1	$10+	Any	600 LLM RPM, 200 OCR RPM
Tier 2	$50+	7+ days	2,000 LLM RPM, 600 OCR RPM
Tier 3	$100+	14+ days	5,000 LLM RPM, 1,500 OCR RPM
Tier 4	$500+	30+ days	20,000 LLM RPM, 5,000 OCR RPM
Enterprise	Custom	Contract	100,000+ LLM RPM
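
To make the table concrete, here is a rough sketch of how a tier could be derived from the two inputs. `tierFor` is a hypothetical helper, not an API on the platform. It assumes that "$10+" means "at least $10" and that both the spend and the age requirement must be met, which is how the table reads but is not stated explicitly.

```javascript
// Thresholds copied from the tier table, highest tier first.
// Enterprise is contract-based, so it is not derived from spend or age here.
const TIERS = [
  { name: 'Tier 4', minSpend: 500, minAgeDays: 30 },
  { name: 'Tier 3', minSpend: 100, minAgeDays: 14 },
  { name: 'Tier 2', minSpend: 50, minAgeDays: 7 },
  { name: 'Tier 1', minSpend: 10, minAgeDays: 0 },
  { name: 'Free', minSpend: 0, minAgeDays: 0 },
];

// Hypothetical helper: return the highest tier whose spend AND age requirements are both met.
function tierFor(lifetimeSpendDollars, accountAgeDays) {
  const match = TIERS.find(
    (t) => lifetimeSpendDollars >= t.minSpend && accountAgeDays >= t.minAgeDays
  );
  return match ? match.name : 'Free';
}
```

Under these assumptions, an account with $600 of lifetime spend that is only 10 days old would sit in Tier 2 until it reaches the 14-day mark for Tier 3.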

Rate Limits by Service

Service	Free	Tier 1	Tier 2	Tier 3	Tier 4	Enterprise
LLMs	100	600	2,000	5,000	20,000	100,000
OCR	50	200	600	1,500	5,000	50,000
Voice	50	300	1,000	3,000	10,000	50,000
Search	200	1,000	3,000	10,000	50,000	200,000
Vaults	200	600	2,000	5,000	20,000	100,000
Convert	30	100	300	1,000	3,000	20,000
Orbit Compute	50	200	600	2,000	5,000	50,000
Workflows	50	200	600	2,000	5,000	50,000

All values are requests per minute (RPM)
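
One way to use the table above on the client side is to turn an RPM limit into a minimum spacing between requests. The sketch below is illustrative only: `rpmFor` and `minIntervalMs` are hypothetical helpers, the lowercase service keys are made up for this example, and only the `tier_1` identifier appears on this page (the other tier identifiers are assumed to follow the same pattern). Even pacing is a conservative client-side choice; this page does not say how the server's window is enforced.

```javascript
// Tier identifiers in column order. Only "tier_1" appears in the header example;
// the other identifiers are assumed to follow the same pattern.
const TIER_ORDER = ['free', 'tier_1', 'tier_2', 'tier_3', 'tier_4', 'enterprise'];

// RPM values copied from the table above. The lowercase service keys are illustrative.
const RPM_BY_SERVICE = {
  llm: [100, 600, 2000, 5000, 20000, 100000],
  ocr: [50, 200, 600, 1500, 5000, 50000],
  voice: [50, 300, 1000, 3000, 10000, 50000],
  search: [200, 1000, 3000, 10000, 50000, 200000],
  vaults: [200, 600, 2000, 5000, 20000, 100000],
  convert: [30, 100, 300, 1000, 3000, 20000],
  orbit_compute: [50, 200, 600, 2000, 5000, 50000],
  workflows: [50, 200, 600, 2000, 5000, 50000],
};

// Look up the requests-per-minute limit for a service at a given tier.
function rpmFor(service, tier) {
  const known = Object.prototype.hasOwnProperty.call(RPM_BY_SERVICE, service);
  const column = TIER_ORDER.indexOf(tier);
  if (!known || column === -1) {
    throw new RangeError(`Unknown service or tier: ${service}, ${tier}`);
  }
  return RPM_BY_SERVICE[service][column];
}

// Minimum spacing between requests (in ms) that keeps an evenly paced client under the limit.
function minIntervalMs(service, tier) {
  return 60000 / rpmFor(service, tier);
}
```

For example, 600 LLM requests per minute on Tier 1 works out to one request every 100 ms, while 30 Convert requests per minute on the Free tier works out to one every 2 seconds.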

Rate Limit Errors

When you exceed your rate limit, you’ll receive a 429 Too Many Requests response:

{
  "error": {
    "message": "Rate limit exceeded. You have made too many requests to the llm API. Please retry after 45 seconds.",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  },
  "tier": "tier_1",
  "limit": 600,
  "reset_at": "2024-01-15T12:00:45Z"
}
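
The `reset_at` timestamp in this body can be turned into a wait time. The following is a small sketch; `msUntilReset` is a hypothetical helper, and it assumes `reset_at` is always an ISO 8601 timestamp like the one in the example.

```javascript
// Hypothetical helper: milliseconds to wait before retrying, based on the
// `reset_at` timestamp in a 429 response body. `now` is injectable for testing.
function msUntilReset(errorBody, now = Date.now()) {
  const resetAt = Date.parse(errorBody?.reset_at);
  if (Number.isNaN(resetAt)) return null; // field missing or not a valid date
  // Never return a negative wait if the reset time has already passed.
  return Math.max(0, resetAt - now);
}
```

A `null` result means the body carried no usable timestamp, in which case falling back to the `x-ratelimit-reset-requests` header is a reasonable option.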

Handling Rate Limits

import Casedev from 'casedev';

const client = new Casedev({ apiKey: 'sk_case_...' });

try {
  const response = await client.llm.v1.chat.createCompletion({
    model: 'anthropic/claude-3-5-sonnet-20241022',
    messages: [{ role: 'user', content: 'Hello!' }]
  });
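} catch (error) {
  if (error.status === 429) {
    // Rate limited: the header reports how long until the window resets (e.g. "45s").
    // This example only logs; wait that long before retrying.
    const resetTime = error.headers['x-ratelimit-reset-requests'];
    console.log(`Rate limited. Retry after ${resetTime}`);
  } else {
    // Not a rate limit error: rethrow so it is not silently swallowed.
    throw error;
  }
}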

All our SDKs (TypeScript, Python, Go) and the CLI automatically handle rate limits with exponential backoff retries. See the SDKs page for details.

Prepaid Credits

Case.dev uses a prepaid credit system. You must have a positive credit balance to make API calls.

Credit Balance

Add Credits - Purchase credits from the billing dashboard
Auto Top-Up - Configure automatic top-up when balance drops below a threshold
Balance Check - Every API call checks your balance (cached, sub-millisecond)

Insufficient Balance

When your credit balance is zero, you’ll receive a 402 Payment Required response:

{
  "error": {
    "message": "Insufficient credit balance. Please add credits to continue using the API.",
    "type": "insufficient_balance",
    "code": "payment_required",
    "balance_cents": 0
  }
}
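
Because a 402 will not resolve on its own, it usually makes sense to treat it differently from a 429 in your error handling. The sketch below shows one way to separate the two; `planForError` and its return shape are hypothetical, and only the field names `reset_at` and `balance_cents` come from the example responses on this page.

```javascript
// Hypothetical helper: decide what to do with a failed request based on the
// HTTP status and the parsed JSON error body.
function planForError(status, body) {
  if (status === 429) {
    // Rate limited: retrying after the reset time is expected to succeed.
    return { action: 'retry_later', detail: body?.reset_at ?? null };
  }
  if (status === 402) {
    // Out of credits: retrying will keep failing until credits are added.
    const cents = body?.error?.balance_cents;
    const balance = typeof cents === 'number' ? `$${(cents / 100).toFixed(2)}` : 'unknown';
    return { action: 'add_credits', detail: `balance ${balance}` };
  }
  // Anything else is left to the caller.
  return { action: 'rethrow', detail: null };
}
```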

Auto Top-Up

Enable automatic credit purchases when your balance drops low:

Go to Billing → Prepaid Credits
Toggle Auto Top-Up on
Set your threshold (e.g., $10)
Set your top-up amount (e.g., $50)

When your balance drops below $10, we'll automatically charge your card for $50.
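
As a worked example of the arithmetic, using the $10 threshold and $50 top-up amount from the steps above: the sketch below models a single top-up check in cents. `applyAutoTopUp` is a hypothetical function for illustration only. It assumes one charge per check and a strict "below the threshold" comparison; how the platform actually schedules charges is not described here.

```javascript
// Hypothetical model of one auto top-up check, working in cents to avoid float rounding.
function applyAutoTopUp(balanceCents, thresholdCents, topUpCents) {
  if (balanceCents < thresholdCents) {
    // Balance fell below the threshold: charge the card and add the credits.
    return { balanceCents: balanceCents + topUpCents, chargedCents: topUpCents };
  }
  // At or above the threshold: nothing to do.
  return { balanceCents, chargedCents: 0 };
}
```

With a $10 threshold and a $50 top-up, a balance of $9.50 becomes $59.50 after the charge, while a balance of exactly $10.00 is left alone.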

Best Practices

1. Implement Retries with Backoff

async function withRetry(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000; // doubles each attempt: 1s, then 2s with the default maxRetries
        await new Promise(r => setTimeout(r, delay));
        continue;
      }
      throw error;
    }
  }
}
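
When many clients are rate limited at the same moment, identical backoff delays can make them all retry together. A common refinement is to randomize the delay ("full jitter") and cap it. The sketch below is one way to do that. `backoffDelayMs` and `withJitteredRetry` are hypothetical names, the 30-second cap is an arbitrary choice, and `sleep` and `random` are injectable only to make the function easy to test.

```javascript
// Full-jitter delay: a random value between 0 and min(capMs, baseMs * 2^attempt).
function backoffDelayMs(attempt, { baseMs = 1000, capMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Retry `fn` on 429 errors only, sleeping a jittered delay between attempts.
async function withJitteredRetry(fn, options = {}) {
  const {
    maxAttempts = 3,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
    ...delayOptions
  } = options;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const isLastAttempt = attempt === maxAttempts - 1;
      // Give up on the final attempt, and never retry non-rate-limit errors.
      if (error?.status !== 429 || isLastAttempt) throw error;
      await sleep(backoffDelayMs(attempt, delayOptions));
    }
  }
}
```

Usage is the same as `withRetry`: pass a function that performs the request, for example `withJitteredRetry(() => doRequest(), { maxAttempts: 5 })`, where `doRequest` stands in for your own call.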

2. Monitor Your Usage

Check rate limit headers on every response to monitor your usage:

const response = await fetch('https://api.case.dev/llm/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk_case_...',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ ... })
});

const remaining = response.headers.get('x-ratelimit-remaining-requests');
// If the header is missing, parseInt(null, 10) is NaN and the check below is simply false.
if (parseInt(remaining, 10) < 10) {
  console.warn('Approaching rate limit!');
}
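
A fixed threshold of 10 means very different things at 30 RPM and at 20,000 RPM, so you may prefer to compare the remaining count against the limit. The sketch below is one hedged way to do that; `isNearLimit` is a hypothetical helper and the 10% default is an arbitrary starting point.

```javascript
// Hypothetical helper: true when fewer than `fraction` of the window's requests remain.
// `headers` is anything with a get(name) method, such as a fetch Headers object or a Map.
function isNearLimit(headers, fraction = 0.1) {
  const limit = Number.parseInt(headers.get('x-ratelimit-limit-requests'), 10);
  const remaining = Number.parseInt(headers.get('x-ratelimit-remaining-requests'), 10);
  // Missing or malformed headers: report "not near the limit" rather than guess.
  if (Number.isNaN(limit) || Number.isNaN(remaining) || limit <= 0) return false;
  return remaining / limit < fraction;
}
```

With a fetch response, this could be called as `isNearLimit(response.headers)`.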

3. Use Batch Operations

When processing many items, use batch endpoints where available to reduce request count.
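
Splitting your work into fixed-size groups is usually the first step. The helper below is a generic sketch: `chunk` is a hypothetical utility, and the right batch size depends on the endpoint you are calling, which this page does not specify.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

For example, 250 items in batches of 100 become 3 requests instead of 250.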

4. Cache Results

Cache API responses when appropriate to reduce duplicate requests.
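
A small in-memory cache with a time-to-live is often enough to avoid repeating identical requests. The sketch below is a minimal version; `createTtlCache` is a hypothetical helper, and choosing a cache key and TTL that are safe for your data is up to you.

```javascript
// Create a cache whose entries expire `ttlMs` milliseconds after they are stored.
// `now` is injectable so tests can control the clock.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map();
  // Return the cached value for `key`, or call `fetcher` and store its result.
  // Concurrent misses for the same key each call `fetcher`; this sketch does not
  // deduplicate in-flight requests.
  return async function cached(key, fetcher) {
    const hit = entries.get(key);
    if (hit && hit.expiresAt > now()) return hit.value;
    const value = await fetcher();
    entries.set(key, { value, expiresAt: now() + ttlMs });
    return value;
  };
}
```

Usage might look like `const cached = createTtlCache(60000);` followed by `await cached(cacheKey, () => makeRequest())`, where `cacheKey` and `makeRequest` stand in for your own key and API call.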

Increasing Your Limits

Automatic Tier Upgrades

Your tier upgrades automatically as you:

Add more credits
Maintain account history

Enterprise Plans

Need higher limits? Contact us for:

Custom rate limits (100K+ RPM)
Dedicated infrastructure
SLA guarantees
Priority support

FAQ

How are rate limits counted?

Rate limits are counted per organization, per service. A request to /llm/v1/chat/completions counts against your LLM limit, while /ocr/v1/process counts against OCR.
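
To picture how this counting works, the sketch below derives a service name and a per-organization bucket key from a request path. It is an illustration only, not how the platform is implemented: `serviceForPath` and `rateLimitKey` are hypothetical, and the rule "the first path segment names the service" is an assumption drawn from the two example paths above.

```javascript
// Hypothetical helper: take the first path segment as the service name.
// The base URL is only needed so relative paths can be parsed.
function serviceForPath(pathOrUrl) {
  const { pathname } = new URL(pathOrUrl, 'https://api.case.dev');
  const [first] = pathname.split('/').filter(Boolean);
  return first ?? null;
}

// Hypothetical counter key: one bucket per organization per service.
function rateLimitKey(orgId, pathOrUrl) {
  return `${orgId}:${serviceForPath(pathOrUrl)}`;
}
```

Under this assumption, two API keys in the same organization calling `/llm/v1/chat/completions` would share one bucket, while a call to `/ocr/v1/process` would count against a separate one.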

Do rate limits apply to webhooks?

No, incoming webhooks to /billing/stripe/webhook and similar endpoints are not rate limited.

What happens if I hit a rate limit?

You’ll receive a 429 response with retry information. Our SDKs handle this automatically with exponential backoff.

Can I see my current tier?

Yes, check the x-ratelimit-tier header on any API response, or visit the billing dashboard.

How quickly do tier upgrades take effect?

Tier upgrades are calculated in real time. As soon as your lifetime spend crosses a threshold and your account meets that tier's age requirement, your new limits apply.

Next Steps

Services Catalog - Explore all available services
Billing Dashboard - Manage your credits and view usage
Cookbooks - See real-world integration examples
