Rate Limits
Status: Stable gateway guardrail.
NexoRouter can reject requests before they reach a provider when traffic exceeds request-per-minute or token-per-minute limits.
Error codes
| Code | Meaning | Fix |
|---|---|---|
| rate_limit_exceeded | Too many requests in the current window. | Wait for retry, lower concurrency, or queue work. |
| token_rate_limit_exceeded | Too many estimated input tokens in the current window. | Reduce prompt size or spread work over time. |
| request_too_large | One request is larger than the per-minute token budget. | Split or shorten the request; retrying unchanged will fail. |
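
The table implies a simple retry policy: the two window-based codes can succeed later, while request_too_large cannot succeed unchanged. A minimal sketch of that policy follows. The function name `should_retry` is hypothetical, and how the code string is extracted from an error response is not specified here, so the sketch takes it as a plain argument.

```python
# Error codes are taken from the table above. Grouping them into
# "retryable" and "not retryable" is this sketch's reading of the Fix column.
RETRYABLE_CODES = {"rate_limit_exceeded", "token_rate_limit_exceeded"}
NON_RETRYABLE_CODES = {"request_too_large"}


def should_retry(code: str) -> bool:
    """Return True if waiting and resending the same request can succeed.

    Unknown codes are treated as non-retryable so that a client never
    loops on an error it does not understand.
    """
    return code in RETRYABLE_CODES
```

A caller would typically check `should_retry(code)` before scheduling another attempt, and route request_too_large to logic that splits or shortens the request instead.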
Client behavior
When a response includes retry-after, wait at least that long before retrying. Never retry immediately in a tight loop.
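
One way to turn this guidance into a wait time is sketched below. It assumes retry-after arrives as a number of seconds (for example, as an HTTP header value); that format is an assumption, not something this page confirms. When the value is missing or unparseable, the sketch falls back to exponential backoff with jitter. The function name `retry_delay` and the `base` and `cap` defaults are illustrative.

```python
import math
import random


def retry_delay(retry_after, attempt, base=1.0, cap=60.0, rng=random.random):
    """Return the number of seconds to wait before the next attempt.

    retry_after: raw retry-after value, or None if absent (assumed to be seconds).
    attempt:     0-based count of attempts already made.
    """
    if retry_after is not None:
        try:
            seconds = float(retry_after)
            if seconds >= 0 and math.isfinite(seconds):
                return seconds  # the server's hint takes priority
        except (TypeError, ValueError):
            pass  # not a plain number; fall through to backoff

    # Fallback: exponential backoff with full jitter, capped at `cap` seconds.
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

The jitter spreads retries from many clients across the window so they do not all return at the same instant.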
Production checklist
- Add bounded retries with backoff.
- Limit per-user concurrency.
- Keep prompts compact.
- Use a cheaper or faster model for background jobs.
- Watch Usage Logs for repeated 429s.
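
The first two checklist items can be combined in a small client-side wrapper, sketched here under stated assumptions. `send` is a hypothetical caller-supplied function that performs one request and returns `(status, retry_after_seconds_or_None, body)`; `MAX_ATTEMPTS` and `PER_USER_CONCURRENCY` are illustrative values, not NexoRouter defaults.

```python
import threading
import time
from collections import defaultdict

MAX_ATTEMPTS = 4          # illustrative bound on total attempts
PER_USER_CONCURRENCY = 2  # illustrative cap on in-flight requests per user

# One semaphore per user; the lock guards creation of new entries.
_user_slots = defaultdict(lambda: threading.BoundedSemaphore(PER_USER_CONCURRENCY))
_slots_lock = threading.Lock()


def _slot_for(user_id):
    with _slots_lock:
        return _user_slots[user_id]


def send_with_limits(user_id, send, sleep=time.sleep):
    """Run send() with a per-user concurrency cap and bounded retries on 429."""
    with _slot_for(user_id):  # blocks while the user is at the cap
        for attempt in range(MAX_ATTEMPTS):
            status, retry_after, body = send()
            if status != 429:
                return status, body
            if attempt == MAX_ATTEMPTS - 1:
                break  # out of attempts: give up without sleeping again
            # Prefer the server's hint; otherwise back off exponentially, capped at 30 s.
            delay = retry_after if retry_after is not None else min(30.0, 2.0 ** attempt)
            sleep(delay)
        return status, body
```

The slot is held while the wrapper sleeps, so a user who is being rate limited also stops starting new requests, which lowers pressure on the limit. After the final attempt the last 429 is returned to the caller rather than retried again.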