SandBase Models

Access 1400+ AI models from 50+ providers through one unified API. No vendor lock-in, automatic failover, and pay-per-token pricing.

Try it now

Test any model instantly in the Playground — no code required.

How it works

Your App          SandBase           Providers
   │                 │                  │
   │  POST /v1/chat  │                  │
   │  model: gpt-4o  │                  │
   │────────────────>│                  │
   │                 │  Route request   │
   │                 │────────────────>│ A (503 ✗)
   │                 │  Auto-fallback   │
   │                 │────────────────>│ B (200 ✓)
   │                 │<────────────────│
   │  200 + response │                  │
   │<────────────────│                  │
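
The flow in the diagram can be sketched in a few lines of Python. This is an illustration only, not SandBase's actual implementation: `route_with_fallback`, the two provider functions, and the `(status, body)` return shape are hypothetical names chosen for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical provider signature: takes a request dict, returns (HTTP status, body).
Provider = Callable[[Dict], Tuple[int, str]]


def route_with_fallback(request: Dict, providers: List[Provider]) -> Tuple[int, str]:
    """Try each provider in order and return the first successful response."""
    last = (503, "no providers configured")
    for provider in providers:
        status, body = provider(request)
        if status == 200:
            return status, body
        last = (status, body)  # remember the failure, then try the next provider
    return last  # every provider failed, so the caller sees the last error


# Stand-ins for providers A and B in the diagram.
def provider_a(request: Dict) -> Tuple[int, str]:
    return 503, "service unavailable"


def provider_b(request: Dict) -> Tuple[int, str]:
    return 200, f"response for {request['model']}"


result = route_with_fallback({"model": "gpt-4o"}, [provider_a, provider_b])
print(result)  # (200, 'response for gpt-4o')
```

The caller only sees an error when the list is exhausted, which matches the diagram: provider A's 503 is absorbed and provider B's 200 is returned.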

Why use SandBase for models?

Feature	Direct Provider	SandBase
Models available	1 provider's models	1400+ from 50+ providers
Failover	Manual implementation	Automatic, zero-config
API format	Provider-specific	OpenAI + Anthropic compatible
Billing	Per-provider accounts	Single balance, unified billing
Cost tracking	Build yourself	Per-request cost breakdown
Rate limits	Per-provider	Unified, higher limits

Supported providers

Provider	Models	Highlights
OpenAI	GPT-4o, o3, o3-mini	Vision, reasoning, JSON mode
Anthropic	Claude Sonnet 4, Haiku 3.5	200K context, extended thinking
Google	Gemini 2.5 Pro/Flash	1M context window
DeepSeek	V3, Reasoner	Ultra-low cost, reasoning
Meta	Llama 4 Maverick	Open-weight, fast
Alibaba	Qwen3 32B	Multilingual, budget
xAI	Grok	Real-time knowledge
Mistral	Large, Medium	European, fast
+ 40 more	...	Browse all →

Quick start

Using OpenAI SDK

python

from openai import OpenAI

client = OpenAI(
    api_key="sk-sb-YOUR_KEY",
    base_url="https://api.sandbase.ai/v1"
)

# Use any of 1400+ models — just change the model name
response = client.chat.completions.create(
    model="gpt-4o",  # or "claude-sonnet-4", "deepseek-chat", "gemini-2.5-pro"...
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

typescript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-sb-YOUR_KEY',
  baseURL: 'https://api.sandbase.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);

bash

curl https://api.sandbase.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-sb-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'

Using Anthropic SDK

python

from anthropic import Anthropic

client = Anthropic(
    api_key="sk-sb-YOUR_KEY",
    base_url="https://api.sandbase.ai"
)

message = client.messages.create(
    model="claude-sonnet-4",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(message.content[0].text)

Model routing

SandBase automatically routes each request to the best available provider, where "best" depends on the routing strategy:

                    ┌─────────────────────────────────┐
                    │        Routing Strategy          │
                    ├─────────────────────────────────┤
                    │                                 │
  Request ────────> │  Cost ──────> Cheapest provider │ ────> Response
                    │  Latency ───> Fastest provider  │
                    │  Availability > Healthiest      │
                    │  Weight ────> Weighted random   │
                    │                                 │
                    └─────────────────────────────────┘
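
As a rough sketch of how the four strategies in the diagram differ, the function below picks a provider from a small pool. Everything here is assumed for illustration: the `ProviderStats` fields, the `pick_provider` function, the strategy names, and the sample numbers are not taken from the SandBase API.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProviderStats:
    # Hypothetical fields; the real routing inputs are not documented on this page.
    name: str
    cost_per_1k_tokens: float  # USD
    latency_ms: float
    success_rate: float  # 0.0 to 1.0
    weight: float = 1.0


def pick_provider(providers: List[ProviderStats], strategy: str,
                  rng: Optional[random.Random] = None) -> ProviderStats:
    """Select one provider according to the named strategy."""
    if not providers:
        raise ValueError("no providers available")
    if strategy == "cost":
        return min(providers, key=lambda p: p.cost_per_1k_tokens)
    if strategy == "latency":
        return min(providers, key=lambda p: p.latency_ms)
    if strategy == "availability":
        return max(providers, key=lambda p: p.success_rate)
    if strategy == "weight":
        rng = rng or random.Random()
        # Weighted random: a provider with weight 3 is picked about 3x as often as weight 1.
        return rng.choices(providers, weights=[p.weight for p in providers], k=1)[0]
    raise ValueError(f"unknown strategy: {strategy}")


pool = [
    ProviderStats("provider-a", cost_per_1k_tokens=0.50, latency_ms=300, success_rate=0.999, weight=1),
    ProviderStats("provider-b", cost_per_1k_tokens=0.30, latency_ms=450, success_rate=0.990, weight=3),
    ProviderStats("provider-c", cost_per_1k_tokens=0.40, latency_ms=200, success_rate=0.995, weight=1),
]

for strategy in ("cost", "latency", "availability"):
    print(strategy, "->", pick_provider(pool, strategy).name)
# cost -> provider-b
# latency -> provider-c
# availability -> provider-a
```

The first three strategies are deterministic for a given set of statistics, while the weighted strategy spreads traffic across providers in proportion to their weights.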

Smart Routing

If a provider is down, SandBase automatically retries the request on the next available provider. Your app sees a 503 only when every provider for that model is down.
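
Because a 503 can still reach your app when every provider for a model is down, you may want a small client-side retry with exponential backoff. The sketch below is one possible approach, not a SandBase feature: `call_with_backoff` and its `send` callback are hypothetical, and `sleep` is injectable so the example runs instantly.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(send: Callable[[], Tuple[int, str]],
                      max_attempts: int = 3,
                      base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send() until it returns a non-503 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    return status, body  # still 503 after the final attempt


# Simulated endpoint: fails twice with 503, then succeeds.
responses = iter([(503, "unavailable"), (503, "unavailable"), (200, "Hello!")])
delays = []
result = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(result, delays)  # (200, 'Hello!') [0.5, 1.0]
```

In a real app, `send` would wrap the SDK call from the quick start and map its outcome to a status code and body.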

Model capabilities

Not every model supports every feature. SandBase automatically validates each request against the capabilities of the model it targets:

Capability	Description	Example models
Chat	Basic conversation	All models
Streaming	Token-by-token delivery	All models
Tools	Function calling	GPT-4o, Claude, Gemini
Vision	Image understanding	GPT-4o, Claude, Gemini, Llama 4
Thinking	Extended reasoning	o3, Claude Sonnet 4, Gemini 2.5
JSON Mode	Guaranteed JSON output	GPT-4o, Gemini, DeepSeek
Caching	Prompt cache discount	GPT-4o, Claude, DeepSeek

→ Full capability matrix
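
To make the idea of capability validation concrete, here is a minimal sketch of such a check. It assumes OpenAI-style request fields (`stream`, `tools`, `response_format`, `image_url` content parts). The `CAPABILITIES` map is filled in only from the example-model column of the table above, so it is incomplete, and `missing_capabilities` is a hypothetical helper, not SandBase's validator.

```python
from typing import Dict, List, Set

# Illustrative and incomplete: derived only from the example-model column above.
CAPABILITIES: Dict[str, Set[str]] = {
    "gpt-4o": {"chat", "streaming", "tools", "vision", "json_mode", "caching"},
    "claude-sonnet-4": {"chat", "streaming", "tools", "vision", "thinking", "caching"},
}


def missing_capabilities(model: str, request: Dict) -> List[str]:
    """Return the capabilities the request needs that the model is not known to have."""
    needed = {"chat"}
    if request.get("stream"):
        needed.add("streaming")
    if request.get("tools"):
        needed.add("tools")
    if (request.get("response_format") or {}).get("type") == "json_object":
        needed.add("json_mode")
    for message in request.get("messages", []):
        content = message.get("content")
        # Multimodal messages carry a list of parts instead of a plain string.
        if isinstance(content, list) and any(
            part.get("type") == "image_url" for part in content
        ):
            needed.add("vision")
    # A model missing from the map is treated as supporting nothing.
    supported = CAPABILITIES.get(model, set())
    return sorted(needed - supported)


request = {
    "messages": [{"role": "user", "content": "List three colors as JSON."}],
    "tools": [{"type": "function", "function": {"name": "lookup"}}],
    "response_format": {"type": "json_object"},
}
print(missing_capabilities("gpt-4o", request))           # []
print(missing_capabilities("claude-sonnet-4", request))  # ['json_mode']
```

Running a check like this before sending a request lets you pick a different model up front instead of handling a validation error afterwards.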

Pricing

Pay only for tokens used. No subscriptions, no minimums.

Budget	Models	Cost per 1K requests
💰 Ultra-low	Qwen3 32B, GPT-4o-mini	$0.01–$0.05
💰💰 Low	DeepSeek V3, Gemini Flash	$0.05–$0.20
💰💰💰 Medium	Claude Haiku, GPT-4o	$0.20–$2.00
💰💰💰💰 High	Claude Sonnet 4, o3	$2.00–$20.00

Free tier

New accounts get $5 in free credits. No credit card required.
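
For a rough sense of scale, the arithmetic below turns the pricing table into a cost range and shows how far the $5 in free credits might go. The tier ranges are copied from the table. The helper names are made up for this example, and actual cost depends on how many tokens each request uses.

```python
from typing import Dict, Tuple

# (low, high) USD per 1K requests, copied from the pricing table above.
TIERS: Dict[str, Tuple[float, float]] = {
    "ultra-low": (0.01, 0.05),
    "low": (0.05, 0.20),
    "medium": (0.20, 2.00),
    "high": (2.00, 20.00),
}
FREE_CREDITS_USD = 5.00


def estimate_cost(tier: str, requests: int) -> Tuple[float, float]:
    """Return the (low, high) USD estimate for a number of requests."""
    low, high = TIERS[tier]
    return requests / 1000 * low, requests / 1000 * high


def requests_covered(tier: str, budget_usd: float = FREE_CREDITS_USD) -> Tuple[int, int]:
    """Return (worst case, best case) request counts that a budget covers."""
    low, high = TIERS[tier]
    return round(budget_usd / high * 1000), round(budget_usd / low * 1000)


low_usd, high_usd = estimate_cost("medium", 50_000)
print(f"50,000 medium-tier requests: ${low_usd:.2f} to ${high_usd:.2f}")
# 50,000 medium-tier requests: $10.00 to $100.00

worst, best = requests_covered("medium")
print(f"$5 covers roughly {worst:,} to {best:,} medium-tier requests")
# $5 covers roughly 2,500 to 25,000 medium-tier requests
```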

Next steps

Quickstart — Make your first API call in 5 minutes
Model Routing — Configure routing strategies
Streaming — Implement real-time streaming
Supported Models — Full model list with pricing
Python SDK — OpenAI + Anthropic SDK integration
JavaScript SDK — Node.js and browser usage
API Reference — Chat Completions endpoint docs
