API Manual

NexoRouter exposes an OpenAI-compatible API. Existing OpenAI SDK code usually needs only three changes: API key, base URL, and model ID.

Core values

Base URL: https://api.nexorouter.com/v1
Authorization: Bearer YOUR_NEXOROUTER_API_KEY
Model ID: copy from Models

Authentication

Every public API request needs an Authorization header:

Authorization: Bearer YOUR_NEXOROUTER_API_KEY

If you see invalid_api_key, check:

The header exists.
Bearer is followed by one space.
The key is a NexoRouter key, not a key from another provider.
The key is enabled.
The key has not expired.
The copied value is complete.
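
Several of the checks above can be caught before a request is sent. The following is a minimal sketch, not an official client: the function name build_auth_headers is hypothetical, and the whitespace check is only an assumption about how a damaged copy might look.

```python
def build_auth_headers(api_key: str) -> dict:
    """Return request headers that carry the key as a Bearer token."""
    # Drop leading/trailing whitespace or newlines picked up while copying.
    key = api_key.strip()
    if not key:
        raise ValueError("API key is empty")
    # Assumption: whitespace inside the key means the copied value is damaged.
    if any(ch.isspace() for ch in key):
        raise ValueError("API key contains whitespace")
    # "Bearer" is followed by exactly one space, then the key.
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

This only verifies the header format locally. Whether the key is enabled, unexpired, and a NexoRouter key can only be confirmed by the API itself.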

Chat Completions

Endpoint:

POST /v1/chat/completions

Required fields:

model
messages

Common optional fields:

temperature, number from 0 to 2
top_p, number from 0 to 1
max_tokens, positive integer

Example body:

{
  "model": "deepseek-v4-flash",
  "messages": [
    { "role": "system", "content": "You are a concise assistant." },
    { "role": "user", "content": "Summarize this product in one sentence." }
  ],
  "temperature": 0.7,
  "max_tokens": 256
}

The gateway rejects malformed chat bodies before forwarding them. Empty messages, invalid max_tokens, out-of-range temperature, and out-of-range top_p return invalid_request.
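
The same rules can be checked on the client before sending, which avoids a round trip for obvious mistakes. This is a sketch based only on the field ranges listed above; validate_chat_body is a hypothetical helper, and the gateway may apply additional checks that are not mirrored here.

```python
def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_chat_body(body: dict) -> list:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    if not isinstance(body.get("model"), str) or not body["model"]:
        problems.append("model: required, non-empty string")
    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("messages: required, non-empty list")
    if "temperature" in body:
        value = body["temperature"]
        if not _is_number(value) or not 0 <= value <= 2:
            problems.append("temperature: number from 0 to 2")
    if "top_p" in body:
        value = body["top_p"]
        if not _is_number(value) or not 0 <= value <= 1:
            problems.append("top_p: number from 0 to 1")
    if "max_tokens" in body:
        value = body["max_tokens"]
        if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
            problems.append("max_tokens: positive integer")
    return problems
```

Calling validate_chat_body on the example body above returns an empty list. A body with an empty messages array returns one problem describing the messages field.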

Models

Endpoint:

GET /v1/models

Use it to:

Confirm the API key is accepted.
List publicly available model IDs.
Check model ID spelling.

curl https://api.nexorouter.com/v1/models \
  -H "Authorization: Bearer $NEXOROUTER_API_KEY"

For new integrations, use only model IDs returned by the public models API.
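
The same call can be made from Python with the standard library. This sketch assumes the response follows the OpenAI list format, an object with a data array whose items each carry an id. That shape is an assumption based on the API being OpenAI-compatible and is not confirmed by this manual. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.nexorouter.com/v1"


def parse_model_ids(payload: dict) -> list:
    """Extract model IDs, assuming an OpenAI-style {"data": [{"id": ...}]} body."""
    return [item["id"] for item in payload.get("data", []) if "id" in item]


def fetch_model_ids(api_key: str, timeout: float = 60.0) -> list:
    """GET /v1/models and return the listed model IDs."""
    request = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_model_ids(json.load(response))
```

Checking a configured model ID against the result of fetch_model_ids at startup is one way to catch spelling mistakes early.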

Billing behavior

NexoRouter uses prepaid balance. Successful requests consume quota according to the selected model, input tokens, and output tokens.

Current public rules:

Workspace balance is shared by API keys in the account.
1 USD = 500000 quota.
A key can use workspace balance or a hard key budget.
A key can be unrestricted, limited to chat models, or limited to specific model IDs.
Usage Logs show model, key, tokens, cost, latency, status, request ID, and error content.
When balance or key budget is insufficient, the API returns insufficient_quota.
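
The conversion rule above is simple arithmetic: 1250 quota is 1250 / 500000 = 0.0025 USD. A small sketch for converting values in either direction follows. The helper names are hypothetical, and Decimal is used here only to avoid floating-point rounding in money values.

```python
from decimal import Decimal

# From the public rule above: 1 USD = 500000 quota.
QUOTA_PER_USD = 500_000


def quota_to_usd(quota: int) -> Decimal:
    """Convert a quota amount to USD."""
    return Decimal(quota) / QUOTA_PER_USD


def usd_to_quota(usd: str) -> int:
    """Convert a USD amount, given as a string such as "2.50", to whole quota units."""
    return int(Decimal(usd) * QUOTA_PER_USD)
```

Passing USD amounts as strings keeps values such as "0.10" exact, which a float would not.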

Rate limits and request size

The gateway enforces configurable requests-per-minute (RPM) and tokens-per-minute (TPM) limits. Current defaults are:

Limit	Default
Requests per minute	`120`
Estimated tokens per minute	`120000`

These values are set by deployment configuration, so production limits may differ from the defaults. Confirm the actual limits before capacity planning.
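
A client can pace itself to stay under the request limit instead of waiting for a 429. The sketch below is a client-side sliding-window pacer using the default of 120 requests per 60 seconds. It is not the gateway's own algorithm, which this manual does not describe, and the class name RequestPacer is hypothetical.

```python
import time
from collections import deque


class RequestPacer:
    """Client-side sliding window: at most `limit` requests per `window` seconds."""

    def __init__(self, limit: int = 120, window: float = 60.0):
        self.limit = limit
        self.window = window
        self._sent = deque()  # timestamps of recorded requests, oldest first

    def wait_time(self, now: float) -> float:
        """Return how many seconds to wait before another request fits."""
        # Forget requests that have left the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        # The window is full until the oldest request ages out.
        return self.window - (now - self._sent[0])

    def record(self, now: float) -> None:
        """Record that a request was sent at time `now`."""
        self._sent.append(now)

    def acquire(self) -> None:
        """Block until a slot is free, then claim it."""
        delay = self.wait_time(time.monotonic())
        if delay > 0:
            time.sleep(delay)
        self.record(time.monotonic())
```

Call acquire() before each request. This paces request count only; it does not estimate tokens, so the TPM limit can still be hit by large requests.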

Large requests can fail in two ways:

request_too_large: the single request is larger than the per-request token capacity. Reduce the input; retrying the same request will not work.
token_rate_limit_exceeded: the current key or IP, optionally combined with the model, has exceeded the TPM limit for the current window. Wait for the retry window or reduce traffic.

Timeouts

The gateway allows slow upstream models but still has a request ceiling. Public timeout-related codes include:

Code	Meaning
`upstream_request_timeout`	The overall upstream request timed out.
`upstream_headers_timeout`	The provider did not start a response in time.
`upstream_body_timeout`	The provider response body stalled.
`upstream_unreachable`	The gateway could not reach the upstream provider.

For standard chat models, start with a client timeout of at least 60 seconds. For slow reasoning-style models, allow up to 180 seconds before adding retries.
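
The timeout guidance above can be applied when sending a chat request. This is a sketch using only the Python standard library. The function names and the reasoning_model flag are hypothetical, and the caller decides which models count as slow reasoning-style models.

```python
import json
import urllib.request

BASE_URL = "https://api.nexorouter.com/v1"


def client_timeout(reasoning_model: bool) -> float:
    """Pick a timeout in seconds from the guidance above."""
    return 180.0 if reasoning_model else 60.0


def build_chat_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build a POST /v1/chat/completions request without sending it."""
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(api_key: str, body: dict, reasoning_model: bool = False) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_chat_request(api_key, body)
    # Note: urlopen's timeout applies to each blocking socket operation,
    # not to the total duration of the request.
    timeout = client_timeout(reasoning_model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Because the urllib timeout is per socket operation, a client that needs a hard ceiling on total request time would have to enforce it separately.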

Error shape

Errors use an OpenAI-style JSON shape:

{
  "error": {
    "message": "Insufficient prepaid balance.",
    "type": "invalid_request_error",
    "code": "insufficient_quota"
  }
}
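
A client can decode this shape into a small structure and branch on the code field. The sketch below assumes only the three fields shown in the example. ApiError and parse_error are hypothetical names, and a response body that does not match the shape is reported as None rather than guessed at.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    message: str
    type: str
    code: str


def parse_error(raw) -> Optional[ApiError]:
    """Decode an error body (str or bytes); return None if it has another shape."""
    try:
        payload = json.loads(raw)
    except ValueError:  # covers json.JSONDecodeError and bad byte encodings
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    return ApiError(
        message=str(error.get("message", "")),
        type=str(error.get("type", "")),
        code=str(error.get("code", "")),
    )
```

With urllib, the raw body of a failed request is available by calling read() on the raised urllib.error.HTTPError.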

High-frequency codes:

HTTP	Code	What to do
400	`invalid_request`	Fix JSON, `messages`, `max_tokens`, `temperature`, or `top_p`.
varies	`invalid_api_key`	Check the key and `Authorization` header.
varies	`insufficient_quota`	Add balance or create a replacement key with the right budget.
404	`model_not_found`	Copy the model ID from Models and check key model scope.
413	`request_too_large`	Reduce input size.
429	`rate_limit_exceeded`	Wait or reduce request rate.
429	`token_rate_limit_exceeded`	Wait or reduce estimated token volume.
502	`upstream_unreachable`	Retry later or try another model.
504	`upstream_request_timeout`	Increase client timeout or try another model.
varies	`upstream_unavailable`	Provider or channel is temporarily unavailable. Retry later or switch model.
varies	`gateway_error`	Check the key, balance, model ID, request format, and Status.
5xx	`upstream_error`	Check Status, retry, or switch model.
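
The table can be turned into a lookup that tells calling code whether a retry makes sense. The grouping below is one interpretation of the "What to do" column, not an official classification. The category names and helper functions are hypothetical, and codes not listed in the table fall back to "investigate".

```python
FIX_REQUEST = "fix_request"          # change the request before sending again
FIX_ACCOUNT = "fix_account"          # key or balance needs attention
WAIT = "wait"                        # back off, then retry the same request
RETRY_OR_SWITCH = "retry_or_switch"  # retry later or try another model
INVESTIGATE = "investigate"          # check key, balance, model, format, Status

ACTIONS = {
    "invalid_request": FIX_REQUEST,
    "model_not_found": FIX_REQUEST,
    "request_too_large": FIX_REQUEST,
    "invalid_api_key": FIX_ACCOUNT,
    "insufficient_quota": FIX_ACCOUNT,
    "rate_limit_exceeded": WAIT,
    "token_rate_limit_exceeded": WAIT,
    "upstream_unreachable": RETRY_OR_SWITCH,
    "upstream_request_timeout": RETRY_OR_SWITCH,
    "upstream_unavailable": RETRY_OR_SWITCH,
    "upstream_error": RETRY_OR_SWITCH,
    "gateway_error": INVESTIGATE,
}


def action_for(code: str) -> str:
    """Map an error code to a handling category."""
    return ACTIONS.get(code, INVESTIGATE)


def is_retryable(code: str) -> bool:
    """True when sending the same request again later may succeed."""
    return action_for(code) in (WAIT, RETRY_OR_SWITCH)
```

For example, is_retryable("request_too_large") is False, because the same oversized request will fail again, while is_retryable("rate_limit_exceeded") is True.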