Rate Limits & Usage
Case.dev uses a tiered rate limiting system inspired by OpenAI. Your rate limits automatically increase as you use the platform and build trust.
How Rate Limits Work
- Requests Per Minute (RPM) - Each service has its own rate limit
- Per-Organization - Limits apply to your entire organization, not individual API keys
- Tier-Based - Your tier determines your limits (see below)
- Automatic Upgrades - Tiers increase automatically based on spend and account age
Rate Limit Headers
Every API response includes rate limit information:

| Header | Description |
|---|---|
| x-ratelimit-limit-requests | Maximum requests per minute for this service |
| x-ratelimit-remaining-requests | Requests remaining in current window |
| x-ratelimit-reset-requests | Time until limit resets |
| x-ratelimit-tier | Your current tier |
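If you call the API directly, you can read these headers off any response. The following is a minimal Python sketch of one way to do that. It assumes the limit and remaining values are plain integers, and it keeps the reset and tier values as raw strings because their exact format is not specified here. `RateLimitInfo` and `parse_rate_limit_headers` are illustrative names, not part of a Case.dev SDK.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: Optional[int]      # x-ratelimit-limit-requests
    remaining: Optional[int]  # x-ratelimit-remaining-requests
    reset: Optional[str]      # x-ratelimit-reset-requests (format not assumed)
    tier: Optional[str]       # x-ratelimit-tier (format not assumed)

    @property
    def used_fraction(self) -> Optional[float]:
        """Share of the current window already consumed, or None if unknown."""
        if self.limit is None or self.remaining is None or self.limit <= 0:
            return None
        return 1 - self.remaining / self.limit


def _to_int(value: Optional[str]) -> Optional[int]:
    # Assumption: numeric headers are plain integers; anything else becomes None.
    try:
        return int(value)  # type: ignore[arg-type]
    except (TypeError, ValueError):
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalise before lookup.
    lower = {name.lower(): value for name, value in headers.items()}
    return RateLimitInfo(
        limit=_to_int(lower.get("x-ratelimit-limit-requests")),
        remaining=_to_int(lower.get("x-ratelimit-remaining-requests")),
        reset=lower.get("x-ratelimit-reset-requests"),
        tier=lower.get("x-ratelimit-tier"),
    )
```

A caller might log a warning or slow down once `used_fraction` passes a threshold of its own choosing, such as 0.8.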
Tier System
Your tier is determined by lifetime spend and account age:

| Tier | Spend Required | Account Age | Example Limits |
|---|---|---|---|
| Free | $0 | Any | 100 LLM RPM, 50 OCR RPM |
| Tier 1 | $10+ | Any | 600 LLM RPM, 200 OCR RPM |
| Tier 2 | $50+ | 7+ days | 2,000 LLM RPM, 600 OCR RPM |
| Tier 3 | $100+ | 14+ days | 5,000 LLM RPM, 1,500 OCR RPM |
| Tier 4 | $500+ | 30+ days | 20,000 LLM RPM, 5,000 OCR RPM |
| Enterprise | Custom | Contract | 100,000+ LLM RPM |
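One way to read the table is that an account sits in the highest tier whose spend and age requirements are both met. The Python sketch below encodes that reading. The "both conditions must hold" rule is an assumption drawn from the table layout, not a documented algorithm, and `tier_for` is a hypothetical helper. Enterprise is left out because it is set by contract.

```python
from typing import List, Tuple

# (tier name, minimum lifetime spend in USD, minimum account age in days),
# ordered from highest to lowest. Values copied from the table above.
TIER_REQUIREMENTS: List[Tuple[str, float, int]] = [
    ("Tier 4", 500, 30),
    ("Tier 3", 100, 14),
    ("Tier 2", 50, 7),
    ("Tier 1", 10, 0),
    ("Free", 0, 0),
]


def tier_for(lifetime_spend_usd: float, account_age_days: int) -> str:
    """Return the highest tier whose spend AND age thresholds are both met.

    Assumption: both requirements must be satisfied at the same time.
    """
    for name, min_spend, min_age in TIER_REQUIREMENTS:
        if lifetime_spend_usd >= min_spend and account_age_days >= min_age:
            return name
    return "Free"
```

Under this reading, an account with $600 of lifetime spend that is only 10 days old would be in Tier 2 until it reaches the 14-day mark for Tier 3.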
Rate Limits by Service
| Service | Free | Tier 1 | Tier 2 | Tier 3 | Tier 4 | Enterprise |
|---|---|---|---|---|---|---|
| LLMs | 100 | 600 | 2,000 | 5,000 | 20,000 | 100,000 |
| OCR | 50 | 200 | 600 | 1,500 | 5,000 | 50,000 |
| Voice | 50 | 300 | 1,000 | 3,000 | 10,000 | 50,000 |
| Search | 200 | 1,000 | 3,000 | 10,000 | 50,000 | 200,000 |
| Vaults | 200 | 600 | 2,000 | 5,000 | 20,000 | 100,000 |
| Convert | 30 | 100 | 300 | 1,000 | 3,000 | 20,000 |
| Compute | 50 | 200 | 600 | 2,000 | 5,000 | 50,000 |
| Workflows | 50 | 200 | 600 | 2,000 | 5,000 | 50,000 |
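The table can be used to estimate how long a bulk job will take at a given tier. The sketch below copies the values above into a lookup and computes a lower bound on the time needed, assuming one request per item and sustained sending at the full limit. The function names are illustrative.

```python
import math
from typing import Dict, List

TIERS: List[str] = ["Free", "Tier 1", "Tier 2", "Tier 3", "Tier 4", "Enterprise"]

# Requests per minute by service, in the same column order as TIERS.
RPM: Dict[str, List[int]] = {
    "LLMs":      [100, 600, 2_000, 5_000, 20_000, 100_000],
    "OCR":       [50, 200, 600, 1_500, 5_000, 50_000],
    "Voice":     [50, 300, 1_000, 3_000, 10_000, 50_000],
    "Search":    [200, 1_000, 3_000, 10_000, 50_000, 200_000],
    "Vaults":    [200, 600, 2_000, 5_000, 20_000, 100_000],
    "Convert":   [30, 100, 300, 1_000, 3_000, 20_000],
    "Compute":   [50, 200, 600, 2_000, 5_000, 50_000],
    "Workflows": [50, 200, 600, 2_000, 5_000, 50_000],
}


def rpm_limit(service: str, tier: str) -> int:
    """Look up the requests-per-minute limit for a service at a tier."""
    return RPM[service][TIERS.index(tier)]


def min_minutes(service: str, tier: str, n_requests: int) -> int:
    """Lower bound on minutes needed to send n_requests without exceeding the limit."""
    return math.ceil(n_requests / rpm_limit(service, tier))
```

For example, 10,000 OCR requests at Tier 1 (200 RPM) need at least 50 minutes, while the same job at Tier 3 (1,500 RPM) needs at least 7.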
Rate Limit Errors
When you exceed your rate limit, you’ll receive a 429 Too Many Requests response.
Handling Rate Limits
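One way to avoid 429 responses in the first place is to pace requests on the client. The sketch below is a simple rolling-window limiter that allows at most `rpm` calls in any 60-second span. It is an illustrative, single-threaded helper rather than part of a Case.dev SDK, and it assumes the server-side window behaves like a rolling minute, which is not confirmed here.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteWindowLimiter:
    """Allow at most `rpm` acquisitions in any rolling 60-second window."""

    WINDOW_SECONDS = 60.0

    def __init__(
        self,
        rpm: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if rpm <= 0:
            raise ValueError("rpm must be positive")
        self.rpm = rpm
        self._clock = clock
        self._sleep = sleep
        self._sent: Deque[float] = deque()  # timestamps of recent requests

    def _evict(self, now: float) -> None:
        # Drop timestamps that have left the rolling window.
        while self._sent and now - self._sent[0] >= self.WINDOW_SECONDS:
            self._sent.popleft()

    def acquire(self) -> float:
        """Block until a request may be sent; return the seconds waited."""
        waited = 0.0
        now = self._clock()
        self._evict(now)
        while len(self._sent) >= self.rpm:
            # Wait until the oldest request in the window expires.
            delay = max(0.0, self._sent[0] + self.WINDOW_SECONDS - now)
            self._sleep(delay)
            waited += delay
            now = self._clock()
            self._evict(now)
        self._sent.append(now)
        return waited
```

Call `limiter.acquire()` immediately before each request to a given service, with `rpm` set at or slightly below that service's limit for your tier. Because limits apply per organization, several processes sharing one organization would each need a share of the budget.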
Prepaid Credits
Case.dev uses a prepaid credit system. You must have a positive credit balance to make API calls.
Credit Balance
- Add Credits - Purchase credits from the billing dashboard
- Auto Top-Up - Configure automatic top-up when balance drops below a threshold
- Balance Check - Every API call checks your balance (cached, sub-millisecond)
Insufficient Balance
When your credit balance is zero, you’ll receive a 402 Payment Required response.
Auto Top-Up
Enable automatic credit purchases when your balance drops low:
- Go to Billing → Prepaid Credits
- Toggle Auto Top-Up on
- Set your threshold (e.g., $10)
- Set your top-up amount (e.g., $50)
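The effect of the two settings can be illustrated with a small calculation. The sketch below assumes a single purchase of the top-up amount is triggered when the balance is strictly below the threshold. The exact trigger condition is an assumption, and `balance_after_top_up` is a hypothetical helper used only for illustration.

```python
from decimal import Decimal


def balance_after_top_up(
    balance: Decimal, threshold: Decimal, top_up_amount: Decimal
) -> Decimal:
    """Return the balance after auto top-up has had a chance to run once.

    Assumption: a top-up fires only when balance < threshold.
    """
    if balance < threshold:
        return balance + top_up_amount
    return balance
```

With the example settings above (threshold $10, top-up $50), a balance of $8.50 would become $58.50, while a balance of $12.00 would be left unchanged.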
Best Practices
1. Implement Retries with Backoff
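If you call the HTTP API without an SDK, a retry loop along these lines is one option. This is a minimal sketch of exponential backoff with jitter: `send` is a hypothetical callable that performs one request and returns a `(status, headers, body)` tuple, and the default delays are illustrative values, not documented recommendations. Only 429 responses are retried; anything else, including 402 Payment Required, is returned immediately because retrying will not change the outcome.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def call_with_backoff(
    send: Callable[[], Response],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> Response:
    """Call send(), retrying on 429 with exponentially growing, jittered delays."""
    response = send()
    for attempt in range(max_retries):
        if response[0] != 429:
            break  # success or a non-retryable error such as 402
        # Delay doubles each attempt: base, 2*base, 4*base, ... capped at max_delay.
        delay = min(max_delay, base_delay * (2 ** attempt))
        # Jitter: sleep between 50% and 100% of the delay to spread out clients.
        sleep(delay * (0.5 + 0.5 * rng()))
        response = send()
    # After max_retries the last response is returned, even if it is still a 429.
    return response
```

The `sleep` and `rng` parameters exist so the function can be tested without real waiting. In normal use they can be left at their defaults.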
2. Monitor Your Usage
Check rate limit headers on every response to monitor your usage.
3. Use Batch Operations
When processing many items, use batch endpoints where available to reduce request count.
4. Cache Results
Cache API responses when appropriate to reduce duplicate requests.
Increasing Your Limits
Automatic Tier Upgrades
Your tier upgrades automatically as you:
- Add more credits
- Maintain account history
Enterprise Plans
Need higher limits? Contact us for:
- Custom rate limits (100K+ RPM)
- Dedicated infrastructure
- SLA guarantees
- Priority support
FAQ
How are rate limits counted?
Rate limits are counted per organization, per service. A request to /llm/v1/chat/completions counts against your LLM limit, while /ocr/v1/process counts against your OCR limit.
Do rate limits apply to webhooks?
No, incoming webhooks to /billing/stripe/webhook and similar endpoints are not rate limited.
What happens if I hit a rate limit?
You’ll receive a 429 response with retry information. Our SDKs handle this automatically with exponential backoff.
Can I see my current tier?
Yes, check the x-ratelimit-tier header on any API response, or visit the billing dashboard.