
Rate Limits & Usage

Case.dev uses a tiered rate limiting system inspired by OpenAI. Your rate limits automatically increase as you use the platform and build trust.

How Rate Limits Work

  • Requests Per Minute (RPM) - Each service has its own rate limit
  • Per-Organization - Limits apply to your entire organization, not individual API keys
  • Tier-Based - Your tier determines your limits (see below)
  • Automatic Upgrades - Tiers increase automatically based on spend and account age

Rate Limit Headers

Every API response includes rate limit information:
x-ratelimit-limit-requests: 600
x-ratelimit-remaining-requests: 599
x-ratelimit-reset-requests: 45s
x-ratelimit-tier: tier_1
| Header | Description |
|---|---|
| x-ratelimit-limit-requests | Maximum requests per minute for this service |
| x-ratelimit-remaining-requests | Requests remaining in the current window |
| x-ratelimit-reset-requests | Time until the limit resets |
| x-ratelimit-tier | Your current tier |
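
The headers above can be read into a plain object before acting on them. The following is a minimal sketch, not part of the Case.dev SDK: `parseRateLimitHeaders` is a hypothetical helper, and it assumes the reset value is always a number of seconds with an `s` suffix, as in the sample (`45s`). Any other format yields `null`.

```javascript
// Hypothetical helper: reads the rate limit headers from any object that
// exposes get(name), such as a fetch Headers instance or a Map.
// Assumption: the reset value looks like "45s" (seconds with an "s" suffix).
function parseRateLimitHeaders(headers) {
  const toInt = (value) => (value == null ? null : Number.parseInt(value, 10));
  const reset = headers.get('x-ratelimit-reset-requests');
  const match = reset == null ? null : /^(\d+(?:\.\d+)?)s$/.exec(reset.trim());
  return {
    limit: toInt(headers.get('x-ratelimit-limit-requests')),
    remaining: toInt(headers.get('x-ratelimit-remaining-requests')),
    resetSeconds: match ? Number(match[1]) : null, // null when the format is unexpected
    tier: headers.get('x-ratelimit-tier') ?? null,
  };
}

// Sample values taken from the header listing above.
const sampleHeaders = new Map([
  ['x-ratelimit-limit-requests', '600'],
  ['x-ratelimit-remaining-requests', '599'],
  ['x-ratelimit-reset-requests', '45s'],
  ['x-ratelimit-tier', 'tier_1'],
]);
console.log(parseRateLimitHeaders(sampleHeaders));
```

Returning `null` for missing or unrecognized values keeps the caller from treating a parse failure as "zero requests remaining".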

Tier System

Your tier is determined by lifetime spend and account age:
| Tier | Spend Required | Account Age | Example Limits |
|---|---|---|---|
| Free | $0 | Any | 100 LLM RPM, 50 OCR RPM |
| Tier 1 | $10+ | Any | 600 LLM RPM, 200 OCR RPM |
| Tier 2 | $50+ | 7+ days | 2,000 LLM RPM, 600 OCR RPM |
| Tier 3 | $100+ | 14+ days | 5,000 LLM RPM, 1,500 OCR RPM |
| Tier 4 | $500+ | 30+ days | 20,000 LLM RPM, 5,000 OCR RPM |
| Enterprise | Custom | Contract | 100,000+ LLM RPM |
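
To estimate which tier an account would land in, the thresholds in the table can be checked from the top down. This is an illustrative sketch only. `estimateTier` is a hypothetical function, and it assumes that both the spend and the age requirement must be met, and that an account falling short on age drops to the highest tier where both hold. The page does not spell out that rule, and Enterprise is left out because it is set by contract.

```javascript
// Thresholds copied from the tier table, ordered highest first.
const TIERS = [
  { name: 'Tier 4', minSpend: 500, minAgeDays: 30 },
  { name: 'Tier 3', minSpend: 100, minAgeDays: 14 },
  { name: 'Tier 2', minSpend: 50, minAgeDays: 7 },
  { name: 'Tier 1', minSpend: 10, minAgeDays: 0 },
  { name: 'Free', minSpend: 0, minAgeDays: 0 },
];

// Assumption: a tier applies only when BOTH requirements are satisfied.
function estimateTier(lifetimeSpendDollars, accountAgeDays) {
  const tier = TIERS.find(
    (t) => lifetimeSpendDollars >= t.minSpend && accountAgeDays >= t.minAgeDays
  );
  return tier ? tier.name : 'Free';
}

console.log(estimateTier(10, 0));   // Tier 1
console.log(estimateTier(600, 10)); // Tier 2 under the assumption above
```

The billing dashboard and the `x-ratelimit-tier` header remain the authoritative source for the actual tier.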

Rate Limits by Service

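| Service | Free | Tier 1 | Tier 2 | Tier 3 | Tier 4 | Enterprise |
|---|---|---|---|---|---|---|
| LLMs | 100 | 600 | 2,000 | 5,000 | 20,000 | 100,000 |
| OCR | 50 | 200 | 600 | 1,500 | 5,000 | 50,000 |
| Voice | 50 | 300 | 1,000 | 3,000 | 10,000 | 50,000 |
| Search | 200 | 1,000 | 3,000 | 10,000 | 50,000 | 200,000 |
| Vaults | 200 | 600 | 2,000 | 5,000 | 20,000 | 100,000 |
| Convert | 30 | 100 | 300 | 1,000 | 3,000 | 20,000 |
| Compute | 50 | 200 | 600 | 2,000 | 5,000 | 50,000 |
| Workflows | 50 | 200 | 600 | 2,000 | 5,000 | 50,000 |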
All values are requests per minute (RPM)
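
An RPM figure translates directly into a minimum gap between requests when traffic is sent at a steady pace. The sketch below shows that arithmetic for the LLM row. `minIntervalMs` is a hypothetical helper, and the calculation assumes a single steady sender. Because limits are shared across the whole organization, several clients together need proportionally larger gaps.

```javascript
// LLM limits copied from the table above (requests per minute).
const LLM_RPM = {
  'Free': 100,
  'Tier 1': 600,
  'Tier 2': 2000,
  'Tier 3': 5000,
  'Tier 4': 20000,
};

// Smallest gap between requests, in milliseconds, that keeps one steady
// sender at or under the given RPM. Rounded up to stay on the safe side.
function minIntervalMs(rpm) {
  if (!(rpm > 0)) throw new RangeError('rpm must be a positive number');
  return Math.ceil(60000 / rpm);
}

console.log(minIntervalMs(LLM_RPM['Free']));   // 600
console.log(minIntervalMs(LLM_RPM['Tier 1'])); // 100
```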

Rate Limit Errors

When you exceed your rate limit, you’ll receive a 429 Too Many Requests response:
{
  "error": {
    "message": "Rate limit exceeded. You have made too many requests to the llm API. Please retry after 45 seconds.",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  },
  "tier": "tier_1",
  "limit": 600,
  "reset_at": "2024-01-15T12:00:45Z"
}
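
The `reset_at` timestamp in this body gives a machine-readable alternative to parsing the message text. A minimal sketch of turning it into a wait time follows. `msUntilReset` is a hypothetical helper, and it assumes `reset_at` is an ISO 8601 timestamp as in the sample above.

```javascript
// Hypothetical helper: milliseconds to wait before the limit resets.
// Assumption: reset_at is an ISO 8601 string, as in the sample 429 body.
// `now` is injectable so the calculation can be checked deterministically.
function msUntilReset(errorBody, now = Date.now()) {
  const resetAt = Date.parse(errorBody.reset_at);
  if (Number.isNaN(resetAt)) return null; // missing or unparseable timestamp
  return Math.max(0, resetAt - now);      // never return a negative wait
}

const sampleBody = { tier: 'tier_1', limit: 600, reset_at: '2024-01-15T12:00:45Z' };
const fortyFiveSecondsEarlier = Date.parse('2024-01-15T12:00:00Z');
console.log(msUntilReset(sampleBody, fortyFiveSecondsEarlier)); // 45000
```

Clamping at zero covers the case where the client clock is slightly ahead of the server or the response arrives after the window has already reset.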

Handling Rate Limits

import Casedev from 'casedev';

const client = new Casedev({ apiKey: 'sk_case_...' });

try {
  const response = await client.llm.v1.chat.createCompletion({
    model: 'anthropic/claude-3-5-sonnet-20241022',
    messages: [{ role: 'user', content: 'Hello!' }]
  });
} catch (error) {
  if (error.status === 429) {
    // The reset header says how long to wait before retrying
    const resetTime = error.headers['x-ratelimit-reset-requests'];
    console.log(`Rate limited. Retry after ${resetTime}`);
  }
}
Our SDKs automatically handle rate limits with exponential backoff retries.

Prepaid Credits

Case.dev uses a prepaid credit system. You must have a positive credit balance to make API calls.

Credit Balance

  • Add Credits - Purchase credits from the billing dashboard
  • Auto Top-Up - Configure automatic top-up when balance drops below a threshold
  • Balance Check - Every API call checks your balance (cached, sub-millisecond)

Insufficient Balance

When your credit balance is zero, you’ll receive a 402 Payment Required response:
{
  "error": {
    "message": "Insufficient credit balance. Please add credits to continue using the API.",
    "type": "insufficient_balance",
    "code": "payment_required",
    "balance_cents": 0
  }
}
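
A 402 needs different handling from a 429: waiting will not help until credits are added. The sketch below separates the two cases using the `type` fields shown in the sample error bodies on this page. `classifyFailure` and its return labels are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: decide what to do with a failed response, based on
// the status code and the error.type values shown in the sample bodies.
function classifyFailure(status, body) {
  const type = body?.error?.type;
  if (status === 429 && type === 'rate_limit_error') return 'retry_later';
  if (status === 402 && type === 'insufficient_balance') return 'add_credits';
  return 'other';
}

const insufficientBalanceBody = {
  error: {
    message: 'Insufficient credit balance. Please add credits to continue using the API.',
    type: 'insufficient_balance',
    code: 'payment_required',
    balance_cents: 0,
  },
};
console.log(classifyFailure(402, insufficientBalanceBody)); // add_credits
```

Retry loops can use a check like this to stop immediately on `add_credits` rather than burning attempts that cannot succeed.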

Auto Top-Up

Enable automatic credit purchases when your balance drops low:
  1. Go to Billing → Prepaid Credits
  2. Toggle Auto Top-Up on
  3. Set your threshold (e.g., $10)
  4. Set your top-up amount (e.g., $50)
When your balance drops below $10, we'll automatically charge your card for $50.
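
The threshold rule can be modeled in a few lines to see when a charge would occur. This is a sketch of the behavior as described in the steps above, using the example values of a $10 threshold and a $50 top-up. `applyAutoTopUp` is a hypothetical function, and it assumes a strict "below" comparison and a single charge per trigger. Amounts are in cents to avoid floating-point rounding.

```javascript
// Hypothetical model of the auto top-up rule.
// Assumptions: the charge fires only when the balance is strictly below
// the threshold, and one top-up is applied per trigger.
function applyAutoTopUp(balanceCents, thresholdCents, topUpCents) {
  if (balanceCents < thresholdCents) {
    return { balanceCents: balanceCents + topUpCents, chargedCents: topUpCents };
  }
  return { balanceCents, chargedCents: 0 };
}

// $9.99 balance, $10 threshold, $50 top-up
console.log(applyAutoTopUp(999, 1000, 5000)); // { balanceCents: 5999, chargedCents: 5000 }
```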

Best Practices

1. Implement Retries with Backoff

async function withRetry(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
        await new Promise(r => setTimeout(r, delay));
        continue;
      }
      throw error;
    }
  }
}

2. Monitor Your Usage

Check rate limit headers on every response to monitor your usage:
const response = await fetch('https://api.case.dev/llm/v1/chat/completions', {
  method: 'POST',
  headers: { 'Authorization': 'Bearer sk_case_...' },
  body: JSON.stringify({ ... })
});

const remaining = response.headers.get('x-ratelimit-remaining-requests');
// The header may be absent, so check for null before parsing
if (remaining !== null && parseInt(remaining, 10) < 10) {
  console.warn('Approaching rate limit!');
}

3. Use Batch Operations

When processing many items, use batch endpoints where available to reduce request count.
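
Batching starts with grouping items on the client. The sketch below splits a list into fixed-size groups so that each group can go out as one request. `chunk` is a hypothetical helper, and the batch size of 25 is an arbitrary illustration: this page does not state which endpoints accept batches or what sizes they allow.

```javascript
// Hypothetical helper: split items into groups of at most `size` elements.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// 60 items in groups of 25 become 3 requests instead of 60.
const documentIds = Array.from({ length: 60 }, (_, i) => `doc_${i}`);
console.log(chunk(documentIds, 25).map((batch) => batch.length)); // [ 25, 25, 10 ]
```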

4. Cache Results

Cache API responses when appropriate to reduce duplicate requests.
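
One simple way to do this is an in-memory cache with a time-to-live, so a repeated lookup within the window never reaches the API. The sketch below is a generic illustration and not an SDK feature. `createTtlCache` is a hypothetical helper, and the 60-second lifetime and the cache key are arbitrary choices.

```javascript
// Hypothetical in-memory cache whose entries expire after ttlMs.
// `now` is injectable so expiry can be exercised without real waiting.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map();
  return {
    get(key) {
      const entry = entries.get(key);
      if (!entry) return undefined;
      if (now() >= entry.expiresAt) {
        entries.delete(key); // drop stale entries on read
        return undefined;
      }
      return entry.value;
    },
    set(key, value) {
      entries.set(key, { value, expiresAt: now() + ttlMs });
    },
  };
}

const cache = createTtlCache(60000);
cache.set('ocr:doc_1', { text: 'cached result' });
console.log(cache.get('ocr:doc_1')); // served locally, no request counted
```

Whether a response is safe to cache depends on the endpoint. Deterministic results such as OCR output for an unchanged file are better candidates than chat completions.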

Increasing Your Limits

Automatic Tier Upgrades

Your tier upgrades automatically as you:
  • Add more credits
  • Maintain account history

Enterprise Plans

Need higher limits? Contact us for:
  • Custom rate limits (100K+ RPM)
  • Dedicated infrastructure
  • SLA guarantees
  • Priority support

FAQ

How are rate limits counted?

Rate limits are counted per organization, per service. A request to /llm/v1/chat/completions counts against your LLM limit, while /ocr/v1/process counts against OCR.
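
For local bookkeeping, requests can be tallied per service in the same way. The sketch below assumes the first path segment names the service, which is inferred from the two example paths above and not a documented rule. `serviceForPath` and `countByService` are hypothetical helpers.

```javascript
// Assumption: the first path segment identifies the service
// (inferred from /llm/... and /ocr/... in the examples above).
function serviceForPath(path) {
  const [first] = path.split('/').filter(Boolean);
  return first ?? null;
}

// Tally how many requests would count against each service's limit.
function countByService(paths) {
  const counts = {};
  for (const path of paths) {
    const service = serviceForPath(path);
    if (service) counts[service] = (counts[service] ?? 0) + 1;
  }
  return counts;
}

console.log(countByService([
  '/llm/v1/chat/completions',
  '/llm/v1/chat/completions',
  '/ocr/v1/process',
])); // { llm: 2, ocr: 1 }
```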

Do rate limits apply to webhooks?

No, incoming webhooks to /billing/stripe/webhook and similar endpoints are not rate limited.

What happens if I hit a rate limit?

You’ll receive a 429 response with retry information. Our SDKs handle this automatically with exponential backoff.

Can I see my current tier?

Yes, check the x-ratelimit-tier header on any API response, or visit the billing dashboard.

How quickly do tier upgrades take effect?

Tier upgrades are calculated in real time. As soon as your lifetime spend crosses a threshold and your account meets that tier's age requirement, your new limits apply.
