Rate Limits

Status: Stable gateway guardrail.

NexoRouter can reject requests before they reach a provider when traffic exceeds request-per-minute or token-per-minute limits.

Error codes

| Code | Meaning | Fix |
| --- | --- | --- |
| rate_limit_exceeded | Too many requests in the current window. | Wait for retry, lower concurrency, or queue work. |
| token_rate_limit_exceeded | Too many estimated input tokens in the current window. | Reduce prompt size or spread work over time. |
| request_too_large | One request is larger than the per-minute token budget. | Split or shorten the request; retrying unchanged will fail. |
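
As a rough illustration, a client could map each error code to a coarse handling strategy. The sketch below uses only the three codes listed above; the `Action` enum and `action_for` helper are hypothetical names invented for this example and are not part of NexoRouter.

```python
from enum import Enum


class Action(Enum):
    """Illustrative client-side strategies (hypothetical, not a NexoRouter API)."""
    WAIT_AND_RETRY = "wait_and_retry"    # back off, lower concurrency, or queue
    SHRINK_OR_SPREAD = "shrink_or_spread"  # smaller prompts, or spread work out
    SPLIT_REQUEST = "split_request"      # an unchanged retry will fail again
    FAIL = "fail"                        # unknown code: surface the error


# Keys are the error codes from the table above.
ACTIONS = {
    "rate_limit_exceeded": Action.WAIT_AND_RETRY,
    "token_rate_limit_exceeded": Action.SHRINK_OR_SPREAD,
    "request_too_large": Action.SPLIT_REQUEST,
}


def action_for(code: str) -> Action:
    """Return the suggested strategy for an error code, or FAIL if unknown."""
    return ACTIONS.get(code, Action.FAIL)
```

Keeping the mapping in one place makes it harder for a caller to retry `request_too_large` by accident.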

Client behavior

When a response includes retry-after, wait at least that long before retrying. Never retry immediately in a tight loop.
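
A minimal sketch of this behavior follows. It assumes retry-after arrives as a number of seconds, and that the caller supplies a hypothetical `send` callable returning a `(status, retry_after, code)` tuple; neither detail is confirmed by this page. When retry-after is missing or unparseable, the sketch falls back to exponential backoff with full jitter, which is a common choice rather than a documented NexoRouter requirement.

```python
import random
import time

# Codes worth retrying; request_too_large is excluded because an
# unchanged retry will fail again.
RETRYABLE = {"rate_limit_exceeded", "token_rate_limit_exceeded"}


def retry_delay(attempt, retry_after=None, base=1.0, cap=30.0, rng=random.random):
    """Seconds to wait before the next try (attempt is 0-based).

    Assumption: retry_after, when present, is a number of seconds.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except (TypeError, ValueError):
            pass  # not a plain number; fall back to backoff
    # Exponential backoff with full jitter, capped at `cap` seconds.
    return rng() * min(cap, base * 2 ** attempt)


def call_with_retries(send, max_attempts=4, sleep=time.sleep):
    """Call `send` up to max_attempts times (max_attempts >= 1).

    `send` is a hypothetical callable returning (status, retry_after, code).
    Returns the last response tuple.
    """
    for attempt in range(max_attempts):
        response = send()
        _status, retry_after, code = response
        if code not in RETRYABLE or attempt == max_attempts - 1:
            return response
        sleep(retry_delay(attempt, retry_after))
```

The retry count is bounded, so a persistent limit surfaces to the caller instead of looping forever.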

Production checklist

  • Add bounded retries with backoff.
  • Limit per-user concurrency.
  • Keep prompts compact.
  • Use a cheaper or faster model for background jobs.
  • Watch Usage Logs for repeated 429s.
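
The "limit per-user concurrency" item can be sketched with one semaphore per user. This is a hedged example for a threaded client: the `PerUserLimiter` class, its `slot` method, and the default of two in-flight requests are all assumptions made for illustration, not NexoRouter settings.

```python
import threading
from contextlib import contextmanager


class PerUserLimiter:
    """Caps in-flight requests per user (hypothetical helper)."""

    def __init__(self, max_in_flight=2):
        self._max = max_in_flight
        self._lock = threading.Lock()
        self._sems = {}  # user_id -> BoundedSemaphore

    def _sem_for(self, user_id):
        # Create each user's semaphore lazily, under a lock.
        with self._lock:
            if user_id not in self._sems:
                self._sems[user_id] = threading.BoundedSemaphore(self._max)
            return self._sems[user_id]

    @contextmanager
    def slot(self, user_id, timeout=None):
        """Hold one of the user's slots for the duration of the block.

        Raises TimeoutError if no slot frees up within `timeout` seconds.
        """
        sem = self._sem_for(user_id)
        if not sem.acquire(timeout=timeout):
            raise TimeoutError(f"no free slot for user {user_id!r}")
        try:
            yield
        finally:
            sem.release()
```

A caller would wrap each outbound request in `with limiter.slot(user_id):` so that one busy user cannot use up the shared request budget.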