API Manual
NexoRouter exposes an OpenAI-compatible API. Existing OpenAI SDK code usually needs only three changes: API key, base URL, and model ID.
Core values
Base URL: https://api.nexorouter.com/v1
Authorization: Bearer YOUR_NEXOROUTER_API_KEY
Model ID: copy from Models
Authentication
Every public API request needs an Authorization header:
Authorization: Bearer YOUR_NEXOROUTER_API_KEY
If you see invalid_api_key, check:
- The header exists.
- Bearer is followed by one space.
- The key is a NexoRouter key, not a key from another provider.
- The key is enabled.
- The key has not expired.
- The copied value is complete.
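Some of the checks above can be run locally before a request is sent. The following is a minimal sketch, not part of NexoRouter: the helper names `build_auth_header` and `header_problems` and the placeholder key are assumptions made for illustration. Whether a key is enabled, unexpired, or issued by NexoRouter can only be confirmed by the API itself.

```python
def build_auth_header(api_key: str) -> dict[str, str]:
    """Return the Authorization header for a request (hypothetical helper)."""
    key = api_key.strip()  # drop stray whitespace or newlines from copy-paste
    if not key:
        raise ValueError("API key is empty")
    return {"Authorization": f"Bearer {key}"}


def header_problems(headers: dict[str, str]) -> list[str]:
    """List the header problems that can be detected without calling the API."""
    value = headers.get("Authorization")
    if value is None:
        return ["Authorization header is missing"]
    problems = []
    scheme, sep, key = value.partition(" ")
    if scheme != "Bearer" or sep != " ":
        problems.append("header must start with 'Bearer' followed by one space")
    elif not key or key != key.strip():
        problems.append("key is empty or has extra whitespace around it")
    return problems
```

An empty list from `header_problems` only means the header is well formed; an invalid_api_key response can still occur for a disabled, expired, or truncated key.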
Chat Completions
Endpoint:
POST /v1/chat/completions
Required fields:
- model
- messages
Common optional fields:
- temperature, number from 0 to 2
- top_p, number from 0 to 1
- max_tokens, positive integer
Example body:
{
  "model": "deepseek-v4-flash",
  "messages": [
    { "role": "system", "content": "You are a concise assistant." },
    { "role": "user", "content": "Summarize this product in one sentence." }
  ],
  "temperature": 0.7,
  "max_tokens": 256
}
The gateway rejects malformed chat bodies before forwarding them. Empty messages, invalid max_tokens, out-of-range temperature, and out-of-range top_p return invalid_request.
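A client can apply the same documented rules before sending, so that obvious mistakes never reach the gateway. This is a sketch of a client-side pre-check only; the function name `chat_body_problems` is hypothetical, and the gateway's own validation may be stricter or differ in detail.

```python
def _in_range(value, low, high) -> bool:
    """True if value is a real number (not a bool) within [low, high]."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return low <= value <= high


def chat_body_problems(body: dict) -> list[str]:
    """Check a chat body against the field rules listed in this section."""
    problems = []
    model = body.get("model")
    if not isinstance(model, str) or not model:
        problems.append("model is required")
    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("messages must be a non-empty list")
    if "temperature" in body and not _in_range(body["temperature"], 0, 2):
        problems.append("temperature must be a number from 0 to 2")
    if "top_p" in body and not _in_range(body["top_p"], 0, 1):
        problems.append("top_p must be a number from 0 to 1")
    if "max_tokens" in body:
        max_tokens = body["max_tokens"]
        if (isinstance(max_tokens, bool) or not isinstance(max_tokens, int)
                or max_tokens <= 0):
            problems.append("max_tokens must be a positive integer")
    return problems
```

An empty result means the body satisfies the rules listed here; it does not guarantee that the model ID exists or that the key is allowed to use it.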
Models
Endpoint:
GET /v1/models
Use it to:
- Confirm the API key is accepted.
- List publicly available model IDs.
- Check model ID spelling.
curl https://api.nexorouter.com/v1/models \
  -H "Authorization: Bearer $NEXOROUTER_API_KEY"
Only model IDs returned by the public models API should be used for new integrations.
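The same call can be made from Python with the standard library. The sketch below assumes the response follows the OpenAI-style list shape, an object with a `data` array whose items carry an `id` field; that shape is an assumption based on the API being OpenAI-compatible and is not confirmed on this page. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.nexorouter.com/v1"


def models_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET /v1/models request."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(payload: dict) -> list[str]:
    """Extract model IDs, assuming the shape {"data": [{"id": "..."}, ...]}."""
    return [item["id"] for item in payload.get("data", [])]


def fetch_model_ids(api_key: str, timeout: float = 60.0) -> list[str]:
    """Send the request and return the model IDs the key can see."""
    with urllib.request.urlopen(models_request(api_key), timeout=timeout) as resp:
        return model_ids(json.load(resp))
```

Checking `"deepseek-v4-flash" in fetch_model_ids(key)` before the first chat request is one way to catch a misspelled model ID early.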
Billing behavior
NexoRouter uses prepaid balance. Successful requests consume quota according to the selected model, input tokens, and output tokens.
Current public rules:
- Workspace balance is shared by API keys in the account.
- 1 USD = 500000 quota.
- A key can use workspace balance or a hard key budget.
- A key can be unrestricted, limited to chat models, or limited to specific model IDs.
- Usage Logs show model, key, tokens, cost, latency, status, request ID, and error content.
- When balance or key budget is insufficient, the API returns insufficient_quota.
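The conversion rule 1 USD = 500000 quota can be applied directly when reading balances or setting key budgets. The helpers below are a hypothetical sketch of that arithmetic only; per-model token prices are not listed on this page, so the cost of a given request is not computed here.

```python
from decimal import Decimal

QUOTA_PER_USD = 500_000  # public rule: 1 USD = 500000 quota


def usd_to_quota(usd) -> int:
    """Convert a USD amount to quota units, truncating any fraction of a unit."""
    return int(Decimal(str(usd)) * QUOTA_PER_USD)


def quota_to_usd(quota: int) -> Decimal:
    """Convert quota units back to USD as an exact decimal."""
    return Decimal(quota) / QUOTA_PER_USD
```

For example, a 2.50 USD top-up corresponds to 1250000 quota, and a key budget of 250000 quota corresponds to 0.50 USD.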
Rate limits and request size
The gateway has configurable RPM and TPM limits. Current defaults are:
| Limit | Default |
|---|---|
| Requests per minute | 120 |
| Estimated tokens per minute | 120000 |
Deployment configuration can change these values, so the limits on a given deployment may differ from the defaults. Confirm the actual limits before production capacity planning.
Large requests can fail in two ways:
- request_too_large: the single request is larger than the per-request token capacity. Reduce the input; retrying the same request will not work.
- token_rate_limit_exceeded: the current key or IP, optionally combined with the model, has exceeded the TPM window. Wait for the retry window or reduce traffic.
Timeouts
The gateway allows slow upstream models but still has a request ceiling. Public timeout-related codes include:
| Code | Meaning |
|---|---|
| upstream_request_timeout | The overall upstream request timed out. |
| upstream_headers_timeout | The provider did not start a response in time. |
| upstream_body_timeout | The provider response body stalled. |
| upstream_unreachable | The gateway could not reach the upstream provider. |
For standard chat models, start with a client timeout of at least 60 seconds. For slow reasoning-style models, use up to 180 seconds before adding retries.
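The timeout guidance above could be applied as follows with the Python standard library. This is a sketch under stated assumptions: the function names and the `slow_reasoning_model` flag are hypothetical, and the caller decides which models count as slow. Note that the `timeout` argument of `urllib.request.urlopen` applies to blocking socket operations rather than to the request as a whole.

```python
import json
import urllib.request

CHAT_URL = "https://api.nexorouter.com/v1/chat/completions"
STANDARD_TIMEOUT_S = 60    # starting point for standard chat models
REASONING_TIMEOUT_S = 180  # upper value suggested for slow reasoning-style models


def client_timeout(slow_reasoning_model: bool) -> int:
    """Pick a client timeout in seconds from the guidance in this section."""
    return REASONING_TIMEOUT_S if slow_reasoning_model else STANDARD_TIMEOUT_S


def chat_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/chat/completions request."""
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def post_chat(api_key: str, body: dict, slow_reasoning_model: bool = False) -> dict:
    """Send the chat request with the chosen timeout and return the parsed JSON."""
    timeout = client_timeout(slow_reasoning_model)
    with urllib.request.urlopen(chat_request(api_key, body), timeout=timeout) as resp:
        return json.load(resp)
```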
Error shape
Errors use an OpenAI-style JSON shape:
{
  "error": {
    "message": "Insufficient prepaid balance.",
    "type": "invalid_request_error",
    "code": "insufficient_quota"
  }
}
High-frequency codes:
| HTTP | Code | What to do |
|---|---|---|
| 400 | invalid_request | Fix JSON, messages, max_tokens, temperature, or top_p. |
| varies | invalid_api_key | Check the key and Authorization header. |
| varies | insufficient_quota | Add balance or create a replacement key with the right budget. |
| 404 | model_not_found | Copy the model ID from Models and check key model scope. |
| 413 | request_too_large | Reduce input size. |
| 429 | rate_limit_exceeded | Wait or reduce request rate. |
| 429 | token_rate_limit_exceeded | Wait or reduce estimated token volume. |
| 502 | upstream_unreachable | Retry later or try another model. |
| 504 | upstream_request_timeout | Increase client timeout or try another model. |
| varies | upstream_unavailable | Provider or channel is temporarily unavailable. Retry later or switch model. |
| varies | gateway_error | Check the key, balance, model ID, request format, and Status. |
| 5xx | upstream_error | Check Status, retry, or switch model. |
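A client can read the code field from the error body and decide on a next step from the table above. The grouping below is one possible reading of the "What to do" column, not an official retry policy, and the names `error_code` and `next_step` are hypothetical.

```python
import json

# Codes where the table suggests waiting, retrying later, or switching model.
# upstream_request_timeout is included because a retry with a longer client
# timeout or another model is the suggested action.
RETRY_LATER = {
    "rate_limit_exceeded",
    "token_rate_limit_exceeded",
    "upstream_unreachable",
    "upstream_request_timeout",
    "upstream_unavailable",
    "upstream_error",
}

# Codes where repeating the same request unchanged will not help.
FIX_FIRST = {
    "invalid_request",
    "invalid_api_key",
    "insufficient_quota",
    "model_not_found",
    "request_too_large",
}


def error_code(raw_body):
    """Return the error code from an OpenAI-style error body, or None."""
    try:
        payload = json.loads(raw_body)
    except (ValueError, TypeError):
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    return error.get("code") if isinstance(error, dict) else None


def next_step(code) -> str:
    """Map an error code to 'retry', 'fix', or 'investigate'."""
    if code in RETRY_LATER:
        return "retry"
    if code in FIX_FIRST:
        return "fix"
    return "investigate"  # gateway_error and any unknown code
```

Codes that fall through to "investigate", such as gateway_error, call for the manual checks listed in the table: key, balance, model ID, request format, and Status.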