SandBase Models

Access 1400+ AI models from 50+ providers through one unified API. No vendor lock-in, automatic failover, and pay-per-token pricing.

Try it now

Test any model instantly in the Playground — no code required.

How it works

Your App          SandBase           Providers
   │                 │                  │
   │  POST /v1/chat  │                  │
   │  model: gpt-4o  │                  │
   │────────────────>│                  │
   │                 │  Route request   │
   │                 │────────────────>│ A (503 ✗)
   │                 │  Auto-fallback   │
   │                 │────────────────>│ B (200 ✓)
   │                 │<────────────────│
   │  200 + response │                  │
   │<────────────────│                  │

Why use SandBase for models?

| Feature | Direct Provider | SandBase |
| --- | --- | --- |
| Models available | 1 provider's models | 1400+ from 50+ providers |
| Failover | Manual implementation | Automatic, zero-config |
| API format | Provider-specific | OpenAI + Anthropic compatible |
| Billing | Per-provider accounts | Single balance, unified billing |
| Cost tracking | Build yourself | Per-request cost breakdown |
| Rate limits | Per-provider | Unified, higher limits |

Supported providers

| Provider | Models | Highlights |
| --- | --- | --- |
| OpenAI | GPT-4o, o3, o3-mini | Vision, reasoning, JSON mode |
| Anthropic | Claude Sonnet 4, Haiku 3.5 | 200K context, extended thinking |
| Google | Gemini 2.5 Pro/Flash | 1M context window |
| DeepSeek | V3, Reasoner | Ultra-low cost, reasoning |
| Meta | Llama 4 Maverick | Open-weight, fast |
| Alibaba | Qwen3 32B | Multilingual, budget |
| xAI | Grok | Real-time knowledge |
| Mistral | Large, Medium | European, fast |
| + 40 more... | Browse all → | |

Quick start

Using OpenAI SDK

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-sb-YOUR_KEY",
    base_url="https://api.sandbase.ai/v1"
)

# Use any of 1400+ models: just change the model name
response = client.chat.completions.create(
    model="gpt-4o",  # or "claude-sonnet-4", "deepseek-chat", "gemini-2.5-pro"...
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-sb-YOUR_KEY',
  baseURL: 'https://api.sandbase.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);
```

```bash
curl https://api.sandbase.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-sb-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'
```
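
Because only the `model` string changes between providers, the same call can be looped over several models. A minimal sketch, assuming the model IDs mentioned in the Python example are enabled on your account; `ask_each_model` is a helper name invented for this example, and the client is passed in so the function works with any OpenAI-compatible client object.

```python
def ask_each_model(client, models, prompt):
    """Send the same prompt to each model and return {model: reply text}."""
    replies = {}
    for model in models:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        replies[model] = response.choices[0].message.content
    return replies


# Usage with the OpenAI client created in the quick start:
# replies = ask_each_model(client, ["gpt-4o", "claude-sonnet-4", "deepseek-chat"], "Hello!")
# for model, text in replies.items():
#     print(model, "->", text)
```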

Using Anthropic SDK

```python
from anthropic import Anthropic

client = Anthropic(
    api_key="sk-sb-YOUR_KEY",
    base_url="https://api.sandbase.ai"
)

message = client.messages.create(
    model="claude-sonnet-4",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(message.content[0].text)
```

Model routing

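SandBase automatically routes each request to the best available provider, where "best" depends on the routing strategy: lowest cost, lowest latency, highest availability, or weighted random: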

                    ┌─────────────────────────────────┐
                    │        Routing Strategy          │
                    ├─────────────────────────────────┤
                    │                                 │
  Request ────────> │  Cost ──────> Cheapest provider │ ────> Response
                    │  Latency ───> Fastest provider  │
                    │  Availability ─> Healthiest     │
                    │  Weight ────> Weighted random   │
                    │                                 │
                    └─────────────────────────────────┘
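
The four strategies in the diagram can be read as four different selection rules over the same list of providers. The sketch below is a hypothetical illustration, not SandBase's router: the `Candidate` fields, the `pick_provider` function, and the sample numbers are all assumptions.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical per-provider stats; field names are illustrative, not SandBase's schema."""
    name: str
    cost_per_1k_tokens: float  # USD
    latency_ms: float          # recent average latency
    success_rate: float        # share of recent requests that succeeded, 0.0 to 1.0
    weight: float = 1.0        # relative share for weighted-random routing


def pick_provider(candidates, strategy, rng=random):
    """Return the candidate chosen by one of the four strategies in the diagram."""
    if strategy == "cost":
        return min(candidates, key=lambda c: c.cost_per_1k_tokens)
    if strategy == "latency":
        return min(candidates, key=lambda c: c.latency_ms)
    if strategy == "availability":
        return max(candidates, key=lambda c: c.success_rate)
    if strategy == "weight":
        return rng.choices(candidates, weights=[c.weight for c in candidates], k=1)[0]
    raise ValueError(f"unknown strategy: {strategy}")


candidates = [
    Candidate("A", cost_per_1k_tokens=0.010, latency_ms=420, success_rate=0.97, weight=1),
    Candidate("B", cost_per_1k_tokens=0.012, latency_ms=180, success_rate=0.999, weight=3),
]
print(pick_provider(candidates, "cost").name)          # A
print(pick_provider(candidates, "latency").name)       # B
print(pick_provider(candidates, "availability").name)  # B
```

With the `weight` strategy, provider B would be picked about three times as often as A in this sample, since selection probability is proportional to weight.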

Smart Routing

If a provider is down, SandBase automatically retries the request on the next available provider. Your app sees a 503 only when every provider for that model is down.
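
Since a 503 can still reach your app when every provider for a model is down, a client-side retry with exponential backoff is a reasonable safety net. A sketch under assumptions: `call_with_backoff` and its parameters are invented for illustration, and the wrapped callable is assumed to raise an exception that exposes a `status_code` attribute (as the OpenAI SDK's `APIStatusError` does).

```python
import time


def call_with_backoff(fn, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on a 503 error wait base_delay * 2**attempt seconds and retry.

    Any other exception, or a 503 on the final attempt, is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            is_503 = getattr(exc, "status_code", None) == 503
            if not is_503 or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Usage with the OpenAI client from the quick start:
# response = call_with_backoff(lambda: client.chat.completions.create(
#     model="gpt-4o",
#     messages=[{"role": "user", "content": "Hello!"}],
# ))
```

The OpenAI SDK also retries some failed requests on its own (see its `max_retries` client option), so an outer wrapper like this mainly helps when you want longer waits or custom logging.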

Model capabilities

Not all models support all features. SandBase automatically validates your request:

| Capability | Description | Example models |
| --- | --- | --- |
| Chat | Basic conversation | All models |
| Streaming | Token-by-token delivery | All models |
| Tools | Function calling | GPT-4o, Claude, Gemini |
| Vision | Image understanding | GPT-4o, Claude, Gemini, Llama 4 |
| Thinking | Extended reasoning | o3, Claude Sonnet 4, Gemini 2.5 |
| JSON Mode | Guaranteed JSON output | GPT-4o, Gemini, DeepSeek |
| Caching | Prompt cache discount | GPT-4o, Claude, DeepSeek |
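
A client can mirror this kind of check before sending a request. The sketch below is not SandBase's validator: the `CAPABILITIES` map is a hand-copied subset of the table above, it treats the table's example lists as exhaustive (which they may not be), and `missing_capabilities` is a hypothetical helper.

```python
# Every model in the table supports chat and streaming.
BASE = {"chat", "streaming"}

# Assumed mapping from model IDs to capabilities, derived from the table above.
CAPABILITIES = {
    "gpt-4o": BASE | {"tools", "vision", "json_mode", "caching"},
    "claude-sonnet-4": BASE | {"tools", "vision", "thinking", "caching"},
    "gemini-2.5-pro": BASE | {"tools", "vision", "thinking", "json_mode"},
    "deepseek-chat": BASE | {"json_mode", "caching"},
}


def missing_capabilities(model, required):
    """Return the required capabilities the model lacks, sorted alphabetically.

    Raises KeyError for a model that is not in the local map.
    """
    return sorted(set(required) - CAPABILITIES[model])


print(missing_capabilities("gpt-4o", {"vision", "tools"}))             # []
print(missing_capabilities("deepseek-chat", {"vision", "json_mode"}))  # ['vision']
print(missing_capabilities("gpt-4o", ["thinking"]))                    # ['thinking']
```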

Full capability matrix

Pricing

Pay only for tokens used. No subscriptions, no minimums.

| Budget | Models | Cost per 1K requests |
| --- | --- | --- |
| 💰 Ultra-low | Qwen3 32B, GPT-4o-mini | $0.01–$0.05 |
| 💰💰 Low | DeepSeek V3, Gemini Flash | $0.05–$0.20 |
| 💰💰💰 Medium | Claude Haiku, GPT-4o | $0.20–$2.00 |
| 💰💰💰💰 High | Claude Sonnet 4, o3 | $2.00–$20.00 |
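
To turn the table into a rough budget, multiply your request volume (in thousands) by a tier's range. A worked sketch using the ranges above; `TIER_RANGES` and `estimate_cost` are illustrative names, and actual spend depends on how many tokens each request uses.

```python
# (low, high) USD per 1K requests, copied from the pricing table above.
TIER_RANGES = {
    "ultra-low": (0.01, 0.05),
    "low": (0.05, 0.20),
    "medium": (0.20, 2.00),
    "high": (2.00, 20.00),
}


def estimate_cost(tier, requests):
    """Return the (min, max) estimated USD cost for a number of requests in a tier."""
    low, high = TIER_RANGES[tier]
    thousands = requests / 1000
    return round(low * thousands, 2), round(high * thousands, 2)


print(estimate_cost("low", 50_000))     # (2.5, 10.0)
print(estimate_cost("medium", 50_000))  # (10.0, 100.0)
```

By the same arithmetic, $5 of credit covers roughly 2,500 requests at the top of the Medium range ($2.00 per 1K) and 25,000 at the bottom ($0.20 per 1K).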

Free tier

New accounts get $5 in free credits. No credit card required.

Next steps