Gateway

Beta

Gateway provides OpenAI-compatible endpoints for chat completions, embeddings, and model listing. Point your existing OpenAI SDK—or Reactor’s client—at Gateway and route requests to OpenRouter, Amazon Bedrock, Azure Foundry, or any OpenAI-compatible API.

A built-in model registry ships sensible defaults. Define aliases like reasoning/cheapest to let Gateway pick the best model by cost or availability. Gateway emits usage events to Analytics so that token usage can be tracked per authenticated user.

Gateway is in beta: the core API is stable, but provider coverage, quota enforcement, and regional routing continue to expand.

  • OpenAI-compatible API: /ai/v1/chat/completions, /ai/v1/embeddings, /ai/v1/models.
  • Multi-provider dispatch — OpenRouter, Bedrock (SigV4), Azure Foundry, generic OpenAI-compatible APIs.
  • Model registry — Built-in defaults with per-project TOML overlays.
  • Routing aliases — Strategies: first, random, round_robin, cheapest, fastest.
  • Streaming — Server-Sent Events with data: {...} chunks and [DONE] terminator.
  • Usage metering — Token counts emitted per request for billing and analytics.
  • Extension seam — Cloud deployments can inject quota checks and regional routing without forking the OSS crate.

List models, send a chat completion, and stream the response.

reactor auth login user@example.com --password '...'
reactor ai models list
reactor ai test gpt-4o-mini --prompt "Explain Reactor.cloud in one sentence"
reactor ai chat claude-3-5-sonnet --prompt "Write a haiku about backends" --stream

Aliases resolve to one or more models using a strategy defined in the registry.

# ai-models.toml (project overlay)
# Alias names contain "/", so they are quoted to be valid TOML keys.
[aliases."reasoning/cheapest"]
strategy = "cheapest"
targets = ["gpt-4o-mini", "claude-3-haiku"]

[aliases."reasoning/best"]
strategy = "first"
targets = ["gpt-4", "claude-3-5-sonnet"]

List the configured aliases, then use one in place of a model ID:

reactor ai aliases list
reactor ai chat reasoning/cheapest --prompt "Summarize this API design"

Generate an embedding:

reactor ai embed text-embedding-3-small "Hello world"

Point the official OpenAI client at Gateway:

import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: `${process.env.REACTOR_URL}/ai/v1`,
  apiKey: process.env.REACTOR_TOKEN!, // JWT access token
});

const completion = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello!' }],
});

Provider and registry configuration:

[gateway]
# Optional registry overlay (merges with built-in defaults)
registry_overlay = "./ai-models.toml"
# Or fetch remotely:
# registry_url = "https://example.com/models.toml"

[gateway.providers.openrouter]
enabled = true
# api_key via OPENROUTER_API_KEY env

[gateway.providers.bedrock]
enabled = false
region = "us-east-1"
# AWS credentials via standard env vars

[gateway.providers.foundry]
enabled = false
endpoint = "https://my-resource.openai.azure.com"
# api_key via AZURE_FOUNDRY_API_KEY env

Environment variables:

| Variable | Required | Description |
| --- | --- | --- |
| REACTOR_AI_BIND | No | Default 0.0.0.0:8004 (may vary by deployment) |
| REACTOR_AI_AUTH_URL | Yes (microservices) | Identity service URL |
| OPENROUTER_API_KEY | If OpenRouter enabled | Provider API key |
| AWS_ACCESS_KEY_ID | If Bedrock enabled | AWS credentials |
| AWS_SECRET_ACCESS_KEY | If Bedrock enabled | AWS credentials |
| AWS_REGION | No | Default us-east-1 |
| AZURE_FOUNDRY_API_KEY | If Foundry enabled | Azure API key |
| AZURE_FOUNDRY_ENDPOINT | If Foundry enabled | Azure resource URL |
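
The "required if enabled" rules in the table can be checked before starting Gateway. The sketch below is only an illustration: `missingEnv` and `REQUIRED_ENV` are hypothetical names, not part of Reactor, and the provider keys simply mirror the `[gateway.providers.*]` table names.

```typescript
// Provider names as they appear in the [gateway.providers.*] config tables.
type Provider = "openrouter" | "bedrock" | "foundry";

// Variables the table marks as required when a provider is enabled.
const REQUIRED_ENV: Record<Provider, string[]> = {
  openrouter: ["OPENROUTER_API_KEY"],
  bedrock: ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"],
  foundry: ["AZURE_FOUNDRY_API_KEY", "AZURE_FOUNDRY_ENDPOINT"],
};

// Hypothetical helper: returns the required variables that are unset or
// empty for the enabled providers, in table order.
function missingEnv(
  enabled: Provider[],
  env: Record<string, string | undefined>,
): string[] {
  const missing: string[] = [];
  for (const provider of enabled) {
    for (const name of REQUIRED_ENV[provider]) {
      if (!env[name]) missing.push(name);
    }
  }
  return missing;
}
```

In a Node script this could be called as `missingEnv(["openrouter"], process.env)` and the result printed before launch.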

Section name used in the project's reactor.toml:

[ai]
registry_overlay = "./ai-models.toml"

Limits:

| Limit | Value | Notes |
| --- | --- | --- |
| Authentication | Required | No anonymous inference |
| Streaming | SSE | OpenAI-compatible chunk format |
| Provider errors | Passed through | Mapped to provider_error with upstream detail |
| Quota enforcement | Cloud only | OSS ships no-op extensions |
| Credit ledger / billing | Cloud only | Consumes Analytics usage events |
| Prompt caching | Not in v0 | Planned via reactor-cache integration |
| Function calling | Partial | Full tool support across providers in progress |

Built-in alias strategies:

| Strategy | Behavior |
| --- | --- |
| first | Use first available target |
| random | Random target per request |
| round_robin | Rotate across targets |
| cheapest | Lowest input_price_per_mtok + output_price_per_mtok |
| fastest | Provider latency heuristic (beta) |
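
To make the strategies concrete, here is a minimal sketch of how a resolver could pick a target. It is not Gateway's implementation: `ModelEntry`, `resolveAlias`, and the `available` flag are hypothetical, only the two price field names come from the table, and filtering by availability for every strategy is an assumption. `fastest` is left out because its heuristic is not described.

```typescript
// Hypothetical registry entry; only the price field names come from the table.
interface ModelEntry {
  id: string;
  available: boolean;
  input_price_per_mtok: number;
  output_price_per_mtok: number;
}

type Strategy = "first" | "random" | "round_robin" | "cheapest";

// Per-alias rotation counters for round_robin.
const counters = new Map<string, number>();

function resolveAlias(
  alias: string,
  strategy: Strategy,
  targets: ModelEntry[],
  rng: () => number = Math.random,
): ModelEntry {
  // Assumption: unavailable targets are skipped for every strategy.
  const candidates = targets.filter((t) => t.available);
  if (candidates.length === 0) {
    throw new Error(`no available target for alias ${alias}`);
  }
  switch (strategy) {
    case "first":
      return candidates[0];
    case "random":
      return candidates[Math.floor(rng() * candidates.length)];
    case "round_robin": {
      const n = counters.get(alias) ?? 0;
      counters.set(alias, n + 1);
      return candidates[n % candidates.length];
    }
    case "cheapest":
      // Lowest combined input + output price wins; ties keep the earlier target.
      return candidates.reduce((best, t) =>
        t.input_price_per_mtok + t.output_price_per_mtok <
        best.input_price_per_mtok + best.output_price_per_mtok
          ? t
          : best,
      );
  }
}
```

The `rng` parameter exists only so that the random strategy can be made deterministic in tests.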

Usage event shape (sent to Analytics):

{
  "event": "ai.usage",
  "properties": {
    "model_id": "gpt-4o-mini",
    "user_id": "user_01HZ...",
    "tokens_in": 42,
    "tokens_out": 128
  }
}
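
A consumer of these events might type them and aggregate token counts per user. The sketch below assumes the event carries exactly the fields shown above; `AiUsageEvent`, `TokenTotals`, and `totalsByUser` are hypothetical names for illustration.

```typescript
// Mirrors the event shape shown above.
interface AiUsageEvent {
  event: "ai.usage";
  properties: {
    model_id: string;
    user_id: string;
    tokens_in: number;
    tokens_out: number;
  };
}

interface TokenTotals {
  tokens_in: number;
  tokens_out: number;
}

// Hypothetical consumer-side helper: sums token counts per user_id.
function totalsByUser(events: AiUsageEvent[]): Map<string, TokenTotals> {
  const totals = new Map<string, TokenTotals>();
  for (const { properties: p } of events) {
    const t = totals.get(p.user_id) ?? { tokens_in: 0, tokens_out: 0 };
    t.tokens_in += p.tokens_in;
    t.tokens_out += p.tokens_out;
    totals.set(p.user_id, t);
  }
  return totals;
}
```

Grouping by `model_id` instead would follow the same pattern with a different map key.
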
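
HTTP endpoints:

| Method | Path | Description |
| --- | --- | --- |
| GET | /ai/v1/health | Service health |
| GET | /ai/v1/models | List available models |
| POST | /ai/v1/chat/completions | Chat completions (streaming or non-streaming) |
| POST | /ai/v1/embeddings | Generate embeddings |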
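
The endpoints can also be called without an SDK. This sketch lists model IDs through GET /ai/v1/models. It assumes the response follows the OpenAI list shape (`{ data: [{ id }] }`) and that the JWT is sent as a Bearer token, which is how the OpenAI SDK transmits `apiKey`. `listModelIds` and `FetchLike` are hypothetical names.

```typescript
// Minimal fetch-like signature so a test double can be injected.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Hypothetical helper: GET /ai/v1/models and return the model IDs.
// Assumes an OpenAI-style body: { data: [{ id: string }, ...] }.
async function listModelIds(
  baseUrl: string,
  token: string,
  fetchImpl: FetchLike,
): Promise<string[]> {
  const res = await fetchImpl(`${baseUrl}/ai/v1/models`, {
    method: "GET",
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) {
    throw new Error(`GET /ai/v1/models failed with status ${res.status}`);
  }
  const body = (await res.json()) as { data: { id: string }[] };
  return body.data.map((m) => m.id);
}
```

In Node 18+ or a browser, the global `fetch` should be usable as `fetchImpl`. `baseUrl` is expected without a trailing slash.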

If a request fails because the model ID or alias is not in the registry, list the available models with GET /ai/v1/models or add the model in an ai-models.toml overlay.

If the upstream provider rejects or fails the request, Gateway reports it as provider_error. Check provider credentials, rate limits, and model availability in the provider's dashboard. The error detail may include the upstream message.

reactor ai doctor
reactor ai test gpt-4o-mini --prompt "ping"

A missing or expired JWT causes authentication to fail. Refresh your access token via Identity before calling Gateway.
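
One way to handle an expired token in client code is to retry once after refreshing. The sketch below is an assumption-heavy illustration: it assumes the failure surfaces as an error object with `status === 401` (the OpenAI Node SDK exposes a `status` field on its API errors), and `withTokenRefresh` and the caller-supplied `refresh` function are hypothetical, since the Identity refresh call is not described here.

```typescript
// True when the thrown value looks like an HTTP 401 error.
function isUnauthorized(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { status?: unknown }).status === 401
  );
}

// Runs `call` with the current token. On a 401 it asks `refresh` for a new
// token and retries exactly once; any other error is rethrown unchanged.
async function withTokenRefresh<T>(
  token: string,
  call: (token: string) => Promise<T>,
  refresh: () => Promise<string>,
): Promise<T> {
  try {
    return await call(token);
  } catch (err) {
    if (isUnauthorized(err)) {
      return call(await refresh());
    }
    throw err;
  }
}
```

Retrying only once avoids looping when the refreshed token is also rejected.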

If a streamed response stalls or arrives all at once, a proxy may be buffering SSE. Connect directly to the API origin or disable buffering for /ai/v1/chat/completions when stream: true. Make sure your client reads until the data: [DONE] line.
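
For clients that read the stream by hand, the `data:` framing and the `[DONE]` terminator can be handled as in the sketch below. It parses text that has already been received in full lines; a real streaming reader would also need to buffer partial lines between network reads. `parseSseChunks` is a hypothetical helper name.

```typescript
// Hypothetical helper: extracts JSON payloads from SSE text made of
// "data: {...}" lines. Returns the parsed chunks and whether the
// [DONE] terminator was seen.
function parseSseChunks(text: string): { chunks: unknown[]; done: boolean } {
  const chunks: unknown[] = [];
  for (const line of text.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // skip blank lines and comments
    const payload = line.slice("data:".length).trim();
    if (payload === "") continue;
    if (payload === "[DONE]") return { chunks, done: true };
    chunks.push(JSON.parse(payload));
  }
  return { chunks, done: false };
}
```

A `done: false` result after the connection closes suggests the stream was cut off before the terminator arrived.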

If requests are rejected because your organization's credit balance or rate limit was hit, check usage in Analytics or your Cloud dashboard. Self-hosted OSS deployments do not enforce quotas by default.