Budget Controls

NemoRouter provides granular budget and rate controls to prevent runaway costs and enforce usage policies. Set spending limits and rate controls at the organization, team, or individual key level.

Budget Types

NemoRouter supports two types of controls:

Control Type	What It Limits	Units
Spending Budgets	Total dollar spend over a time period	USD ($)
Rate Limits	Request and token throughput	RPM (requests/min), TPM (tokens/min)

Both can be applied at multiple scopes and work together. A request is blocked if either the budget or rate limit is exceeded.

Spending Budgets

Spending budgets cap how much a key or team can spend over a defined period.

Creating a Budget

1. Navigate to the Budgets page in the dashboard.
2. Click Create Budget.
3. Configure the budget:

Field	Description	Example
Budget Name	Descriptive label	"Backend API — Monthly"
Max Spend	Maximum dollar amount	$500.00
Duration	Budget reset period	Monthly, Weekly, Daily, or Total

4. Assign the budget to a key or team.

Budget Durations

Duration	Behavior
Daily	Resets at midnight UTC each day
Weekly	Resets at midnight UTC each Monday
Monthly	Resets on the 1st of each month at midnight UTC
Total	Never resets — a hard lifetime cap
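
The reset rules in the table can be sketched as a small helper that computes when a budget would next reset. This is an illustration only: `next_reset` is a hypothetical function, not part of any NemoRouter SDK, and it assumes the duration names and UTC boundaries listed above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_reset(duration: str, now: datetime) -> Optional[datetime]:
    """Return the next reset instant in UTC, or None for a Total budget.

    `now` must be timezone-aware; it is converted to UTC first.
    """
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if duration == "Daily":
        return midnight + timedelta(days=1)
    if duration == "Weekly":
        # weekday() is 0 on Monday, so this is the number of days to the next Monday.
        return midnight + timedelta(days=7 - now.weekday())
    if duration == "Monthly":
        first = midnight.replace(day=1)
        # December rolls over to January of the following year.
        if first.month == 12:
            return first.replace(year=first.year + 1, month=1)
        return first.replace(month=first.month + 1)
    if duration == "Total":
        return None  # a lifetime cap never resets
    raise ValueError(f"unknown duration: {duration!r}")
```

A dashboard or alerting script could use a helper like this to show how long a blocked key would stay blocked.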

Budget Scope

Budgets can be scoped to different levels:

Scope	Description	Use Case
API Key	Limits spend for a single key	Per-environment or per-service limits
Team	Limits spend for all keys in a team	Department-level budgets

When a budget is exceeded, every request made with the associated key, or with any key in the associated team, returns a 402 Payment Required error:

{
  "error": {
    "message": "Budget exceeded. Current spend: $500.12 / $500.00 limit.",
    "type": "budget_error",
    "code": "budget_exceeded"
  }
}
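
Client code may want to recognize this response so that it stops sending requests instead of retrying. The sketch below assumes the status code and the `code` field shown above; `is_budget_exceeded` is a hypothetical helper name.

```python
import json


def is_budget_exceeded(status_code: int, body: str) -> bool:
    """Return True if a response looks like the budget error shown above."""
    if status_code == 402:
        return True
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        return False
    # Guard against bodies that are valid JSON but not the expected shape.
    error = parsed.get("error") if isinstance(parsed, dict) else None
    return isinstance(error, dict) and error.get("code") == "budget_exceeded"
```

Since an exceeded budget blocks all requests for that key or team, a caller that sees this error would typically alert an operator rather than retry.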

Budget Examples

Limit a staging key to $50/month:

Budget Name: "Staging Monthly"
Max Spend: $50.00
Duration: Monthly
Assign to: staging-api-key

Cap the data science team at $2,000/month:

Budget Name: "Data Science Monthly"
Max Spend: $2,000.00
Duration: Monthly
Assign to: Team "Data Science"

Hard cap for a proof-of-concept:

Budget Name: "POC Total Budget"
Max Spend: $100.00
Duration: Total (never resets)
Assign to: poc-demo-key

Rate Limits

Rate limits control the throughput of requests and tokens per minute.

Rate Limit Types

Limit	Description	Enforcement
RPM (Requests Per Minute)	Maximum number of API requests per minute	Returns `429` when exceeded
TPM (Tokens Per Minute)	Maximum number of tokens processed per minute	Returns `429` when exceeded

How Rate Limits Work

Rate limits use a sliding window. When a limit is exceeded, the API returns a 429 Too Many Requests response with a Retry-After header:

{
  "error": {
    "message": "Rate limit exceeded. Retry after 12 seconds.",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
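
To avoid hitting the limit in the first place, a client can throttle itself with its own sliding window. The sketch below uses the sliding-window log variant. It is a client-side illustration under that assumption, not a description of how NemoRouter implements its limiter, and `SlidingWindowLimiter` is a hypothetical class.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `limit` events in any `window`-second span."""

    def __init__(self, limit: int, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.events: Deque[float] = deque()

    def try_acquire(self) -> float:
        """Record a request and return 0.0, or return the seconds to wait if full."""
        now = self.clock()
        # Drop timestamps that have slid out of the window.
        while self.events and now - self.events[0] >= self.window:
            self.events.popleft()
        if len(self.events) < self.limit:
            self.events.append(now)
            return 0.0
        # The oldest recorded event leaves the window at events[0] + window.
        return self.events[0] + self.window - now
```

A caller would sleep for the returned number of seconds and then call `try_acquire()` again before sending the request.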

Rate Limit Scope

Rate limits can be set at the key level. Higher-tier plans include higher default rate limits:

Tier	Default RPM	Default TPM
Pay As You Go	Standard	Standard
Tier 2	Enhanced	Enhanced
Tier 3	Premium	Premium
Enterprise	Custom SLAs	Custom SLAs

Custom rate limits set on a specific key override the tier defaults.

Handling Rate Limits in Code

Implement exponential backoff when you receive a 429 response:

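import time
from openai import OpenAI, RateLimitError

client = OpenAI(
    api_key="sk-nemo-your-key-here",
    base_url="https://api.nemorouter.ai/v1",
    max_retries=0,  # disable the SDK's built-in retries so the loop below is the only retry layer
)

def call_with_retry(messages, max_retries=3):
    """Call chat completions, backing off exponentially on 429 responses."""
    # max_retries is the total number of attempts, including the first one.
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=messages,
            )
        except RateLimitError:
            # Out of attempts: surface the 429 to the caller.
            if attempt == max_retries - 1:
                raise
            wait_time = 2 ** attempt  # 1s, then 2s, doubling on each further attempt
            print(f"Rate limited. Retrying in {wait_time}s...")
            time.sleep(wait_time)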

In JavaScript, the SDK's built-in retries can be used instead:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.NEMOROUTER_API_KEY,
  baseURL: "https://api.nemorouter.ai/v1",
  maxRetries: 3, // The OpenAI SDK handles retries automatically
});
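
Because the 429 response carries a Retry-After header, a retry loop can prefer the server's hint over a fixed schedule. The helper below is a sketch: `retry_delay` is a hypothetical name, and it assumes the header holds a number of seconds, as the error message above suggests. When the header is missing or unparseable, it falls back to exponential backoff with full jitter.

```python
import random
from typing import Mapping, Optional


def retry_delay(attempt: int, headers: Mapping[str, str],
                base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to sleep before the next try; `attempt` is 0 for the first retry."""
    retry_after: Optional[str] = None
    for name, value in headers.items():
        if name.lower() == "retry-after":  # header names are case-insensitive
            retry_after = value
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number of seconds: fall back to backoff
    backoff = min(cap, base * 2 ** attempt)
    # Full jitter: a uniform pick in [0, backoff] keeps clients from retrying in lockstep.
    return random.uniform(0.0, backoff)
```

With the OpenAI Python SDK, the caught `RateLimitError` exposes the HTTP response, so `e.response.headers` can be passed as `headers`.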

Credits and Budgets

Credits and budgets are separate mechanisms:

Credits are your account's payment balance. When you buy $100 in credits, you have $100 to spend on API calls.
Budgets are spending limits that restrict how much of those credits a specific key or team can consume.

For example:

Your organization has $1,000 in credits
Team A has a $300/month budget
Team B has a $500/month budget
$200 in credits remains unbudgeted (available to keys without budgets)

Budgets do not reserve credits; they only cap spend. If Team A spends $300 and Team B spends $500, the remaining $200 is available to any key whose own budget is not exhausted.
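
The arithmetic behind this example can be written out as a short function. It is a sketch of the rule as described here: a key can spend the smaller of the remaining credits and its remaining budget. `spendable` is a hypothetical name, not a NemoRouter API.

```python
from typing import Optional


def spendable(credits_left: float, budget_limit: Optional[float] = None,
              budget_spent: float = 0.0) -> float:
    """Amount a key can still spend given the shared credits and its own budget."""
    if budget_limit is None:
        # No budget: only the shared credit balance limits spend.
        return max(0.0, credits_left)
    budget_left = max(0.0, budget_limit - budget_spent)
    return max(0.0, min(credits_left, budget_left))


# Scenario from the text: $1,000 in credits, Team A capped at $300, Team B at $500.
print(spendable(1000.0, 300.0))        # 300.0  Team A at the start of the month
print(spendable(200.0, 300.0, 300.0))  # 0.0    Team A after both teams hit their caps
print(spendable(200.0))                # 200.0  a key with no budget at that point
print(spendable(100.0, 300.0))         # 100.0  Team A if other keys already spent $900
```

The last line shows what "do not reserve" means in practice: a budget of $300 does not guarantee that $300 of credits will still be there.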

Monitoring Budgets

Track budget usage from the dashboard:

Budgets page — View all budgets with current spend vs. limit
Analytics page — Drill into spend by key, team, model, or time period
API Keys page — See per-key spend at a glance

Best Practices

Set budgets on all production keys — Even generous ones. They're a safety net against bugs that make runaway API calls.
Use daily budgets for development — Catch issues early with $10-20/day limits on dev keys.
Use total budgets for POCs — Hard-cap spend on proof-of-concept projects.
Monitor the Analytics page weekly — Look for unexpected spend spikes before they become expensive.
Separate high-volume and low-volume keys — Don't let a batch pipeline share a key with an interactive chat interface.

Next Steps

Organization Setup — Configure your organization
Team Management — Create teams and assign budgets
Authentication — Create and manage API keys
