API Manual

NexoRouter exposes an OpenAI-compatible API. Existing OpenAI SDK code usually needs only three changes: API key, base URL, and model ID.

Core values

Base URL: https://api.nexorouter.com/v1
Authorization: Bearer YOUR_NEXOROUTER_API_KEY
Model ID: copy from Models

Authentication

Every public API request needs an Authorization header:

Authorization: Bearer YOUR_NEXOROUTER_API_KEY

If you see invalid_api_key, check:

  • The header exists.
  • Bearer is followed by one space.
  • The key is a NexoRouter key, not a key from another provider.
  • The key is enabled.
  • The key has not expired.
  • The copied value is complete.
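The local items in this checklist (the header exists, Bearer is followed by exactly one space, the copied value is intact) can be enforced before a request is sent. The following is a minimal Python sketch; the helper name build_auth_header is hypothetical, and the whitespace check is an assumed heuristic for spotting a badly copied key, not a documented NexoRouter rule. Whether a key is enabled or expired can only be determined by the server.

```python
def build_auth_header(api_key: str) -> dict:
    """Return the Authorization header for a NexoRouter key (hypothetical helper)."""
    key = api_key.strip()  # drop stray spaces or newlines picked up while copying
    if not key:
        raise ValueError("API key is empty")
    if any(ch.isspace() for ch in key):
        # Assumption: keys contain no internal whitespace, so this signals a bad copy.
        raise ValueError("API key contains whitespace; re-copy the full value")
    # "Bearer" followed by exactly one space, then the key.
    return {"Authorization": f"Bearer {key}"}
```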

Chat Completions

Endpoint:

POST /v1/chat/completions

Required fields:

  • model
  • messages

Common optional fields:

  • temperature, number from 0 to 2
  • top_p, number from 0 to 1
  • max_tokens, positive integer

Example body:

{
  "model": "deepseek-v4-flash",
  "messages": [
    { "role": "system", "content": "You are a concise assistant." },
    { "role": "user", "content": "Summarize this product in one sentence." }
  ],
  "temperature": 0.7,
  "max_tokens": 256
}

The gateway rejects malformed chat bodies before forwarding them. Empty messages, invalid max_tokens, out-of-range temperature, and out-of-range top_p return invalid_request.
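The same checks can be run on the client to catch an invalid_request before it is sent. This Python sketch mirrors the field rules listed above; the function name validate_chat_body is hypothetical, and treating the range bounds as inclusive is an assumption. The gateway may apply further checks that are not described here.

```python
from numbers import Real

def validate_chat_body(body: dict) -> list:
    """Return a list of problems; an empty list means the documented checks pass."""
    problems = []
    model = body.get("model")
    if not isinstance(model, str) or not model:
        problems.append("model must be a non-empty string")
    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("messages must be a non-empty list")
    # Optional numeric fields with documented ranges (bounds assumed inclusive).
    for name, low, high in (("temperature", 0, 2), ("top_p", 0, 1)):
        if name in body:
            value = body[name]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, Real) or not low <= value <= high:
                problems.append(f"{name} must be a number from {low} to {high}")
    if "max_tokens" in body:
        value = body["max_tokens"]
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            problems.append("max_tokens must be a positive integer")
    return problems
```

Calling validate_chat_body on the example body above returns an empty list.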

Models

Endpoint:

GET /v1/models

Use it to:

  • Confirm the API key is accepted.
  • List publicly available model IDs.
  • Check model ID spelling.

Example request:

curl https://api.nexorouter.com/v1/models \
  -H "Authorization: Bearer $NEXOROUTER_API_KEY"

Only model IDs returned by the public models API should be used for new integrations.
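The same call can be made from Python with the standard library. This is a sketch under one stated assumption: because the API is OpenAI-compatible, the response is taken to follow the OpenAI list shape, with model objects under a "data" array that each carry an "id". This page does not show the response body, so verify the shape against a real response. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.nexorouter.com/v1"

def models_request(api_key: str) -> urllib.request.Request:
    """Build the GET /v1/models request without sending it."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )

def model_ids(payload: dict) -> list:
    """Extract IDs, assuming an OpenAI-style {"data": [{"id": ...}, ...]} payload."""
    return [item["id"] for item in payload.get("data", []) if "id" in item]

def fetch_model_ids(api_key: str, timeout: float = 60.0) -> list:
    """Send the request and return the model IDs (needs network access)."""
    with urllib.request.urlopen(models_request(api_key), timeout=timeout) as resp:
        return model_ids(json.load(resp))
```

Checking a candidate ID against the returned list before the first chat request catches spelling mistakes early.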

Billing behavior

NexoRouter uses prepaid balance. Successful requests consume quota according to the selected model, input tokens, and output tokens.

Current public rules:

  • Workspace balance is shared by API keys in the account.
  • 1 USD = 500000 quota.
  • A key can use workspace balance or a hard key budget.
  • A key can be unrestricted, limited to chat models, or limited to specific model IDs.
  • Usage Logs show model, key, tokens, cost, latency, status, request ID, and error content.
  • When balance or key budget is insufficient, the API returns insufficient_quota.
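The exchange rate above makes budget arithmetic straightforward. The sketch below converts between USD and quota using only the published rate of 1 USD = 500000 quota. Per-model prices are not listed on this page, so the sketch does not estimate the cost of a request; the function names are hypothetical.

```python
from decimal import Decimal

QUOTA_PER_USD = 500_000  # published rule: 1 USD = 500000 quota

def usd_to_quota(usd) -> int:
    """Convert a USD amount to quota, truncating any fractional quota unit."""
    # Going through str avoids binary floating-point artifacts for inputs like 0.1.
    return int(Decimal(str(usd)) * QUOTA_PER_USD)

def quota_to_usd(quota: int) -> Decimal:
    """Convert a quota amount back to USD."""
    return Decimal(quota) / QUOTA_PER_USD
```

For example, a 10 USD top-up corresponds to 5000000 quota, and 250000 quota corresponds to 0.5 USD.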

Rate limits and request size

The gateway has configurable RPM and TPM limits. Current defaults are:

Limit                         Default
Requests per minute           120
Estimated tokens per minute   120000

These values can be changed through deployment configuration, so the limits in production may differ from the defaults. Do not rely on the defaults for capacity planning.

Large requests can fail in two ways:

  • request_too_large: the single request is larger than the per-request token capacity. Reduce the input; retrying the same request will not work.
  • token_rate_limit_exceeded: the current key or IP, optionally combined with the model, has exceeded the TPM limit for the current window. Wait for the retry window or reduce traffic.
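The difference between the two failures matters most in retry logic: one can succeed after a wait, the other never will. The sketch below encodes that decision. The exponential backoff schedule (1 second base, 60 second cap) is an assumption chosen for illustration, since this page does not state the length of the retry window, and the function name retry_delay is hypothetical.

```python
from typing import Optional

def retry_delay(code: str, attempt: int, base: float = 1.0, cap: float = 60.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if retrying is not advised.

    attempt is zero-based: 0 for the first retry, 1 for the second, and so on.
    """
    if code == "request_too_large":
        # The same request will fail again; the caller must reduce the input.
        return None
    if code == "token_rate_limit_exceeded":
        # Assumed schedule: double the wait on each attempt, up to the cap.
        return min(cap, base * 2 ** attempt)
    # This sketch only covers the two large-request codes.
    return None
```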

Timeouts

The gateway allows slow upstream models but still has a request ceiling. Public timeout-related codes include:

Code                       Meaning
upstream_request_timeout   The overall upstream request timed out.
upstream_headers_timeout   The provider did not start a response in time.
upstream_body_timeout      The provider response body stalled.
upstream_unreachable       The gateway could not reach the upstream provider.

For standard chat models, start with a client timeout of at least 60 seconds. For slow reasoning-style models, use up to 180 seconds before adding retries.
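As an illustration, the sketch below applies these two timeout values to a chat request made with Python's standard library. How a model is classified as slow is left to the caller, because this page does not list which models qualify; the function names are hypothetical. Note that urllib's timeout applies to each blocking socket operation, not to the request as a whole, so it only approximates an overall client deadline.

```python
import json
import urllib.request

CHAT_URL = "https://api.nexorouter.com/v1/chat/completions"

def pick_timeout(slow_reasoning_model: bool) -> float:
    """Starting client timeouts from the guidance above, in seconds."""
    return 180.0 if slow_reasoning_model else 60.0

def chat_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build the POST /v1/chat/completions request without sending it."""
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def send_chat(api_key: str, body: dict, slow_reasoning_model: bool = False) -> dict:
    """Send the request with the chosen timeout (needs network access)."""
    timeout = pick_timeout(slow_reasoning_model)
    with urllib.request.urlopen(chat_request(api_key, body), timeout=timeout) as resp:
        return json.load(resp)
```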

Error shape

Errors use an OpenAI-style JSON shape:

{
  "error": {
    "message": "Insufficient prepaid balance.",
    "type": "invalid_request_error",
    "code": "insufficient_quota"
  }
}
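Client code usually branches on the code field rather than on the message text. The sketch below parses the shape shown above into a small record. The class and function names are hypothetical, and returning None for bodies that do not match the shape is a design choice of this sketch, not documented gateway behavior.

```python
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class ApiError:
    message: str
    type: str
    code: str

def parse_error(raw: str) -> Optional[ApiError]:
    """Parse an error response body; return None if it lacks the error shape."""
    try:
        payload = json.loads(raw)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    return ApiError(
        message=str(error.get("message", "")),
        type=str(error.get("type", "")),
        code=str(error.get("code", "")),
    )
```

For the example above, parse_error(...).code is "insufficient_quota", which is the value to match against the codes listed on this page.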

High-frequency codes:

HTTP     Code                        What to do
400      invalid_request             Fix JSON, messages, max_tokens, temperature, or top_p.
varies   invalid_api_key             Check the key and Authorization header.
varies   insufficient_quota          Add balance or create a replacement key with the right budget.
404      model_not_found             Copy the model ID from Models and check key model scope.
413      request_too_large           Reduce input size.
429      rate_limit_exceeded         Wait or reduce request rate.
429      token_rate_limit_exceeded   Wait or reduce estimated token volume.
502      upstream_unreachable        Retry later or try another model.
504      upstream_request_timeout    Increase client timeout or try another model.
varies   upstream_unavailable        Provider or channel is temporarily unavailable. Retry later or switch model.
varies   gateway_error               Check the key, balance, model ID, request format, and Status.
5xx      upstream_error              Check Status, retry, or switch model.