Supported Models
SandBase provides access to 1400+ models from 50+ providers through a single API. All models are available via the OpenAI-compatible endpoint (/v1/chat/completions) and, where applicable, the Anthropic-compatible endpoint (/v1/messages).
TIP
Browse the full model catalog with live pricing at sandbase.ai/models.
Model Comparison Table
| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Key Capabilities |
|---|---|---|---|---|---|
openai/gpt-4o | OpenAI | $2.50 | $10.00 | 128K | Chat, Streaming, Tools, Vision, JSON Mode, Cache |
openai/gpt-4o-mini | OpenAI | $0.15 | $0.60 | 128K | Chat, Streaming, Tools, Vision, JSON Mode, Cache |
openai/o3 | OpenAI | $10.00 | $40.00 | 200K | Chat, Streaming, Tools, Vision, Thinking, JSON Mode |
openai/o3-mini | OpenAI | $1.10 | $4.40 | 200K | Chat, Streaming, Tools, Thinking, JSON Mode |
anthropic/claude-sonnet-4 | Anthropic | $3.00 | $15.00 | 200K | Chat, Streaming, Tools, Vision, Thinking, Cache |
anthropic/claude-3.5-haiku | Anthropic | $0.80 | $4.00 | 200K | Chat, Streaming, Tools, Vision, Cache |
deepseek/deepseek-v3 | DeepSeek | $0.27 | $1.10 | 128K | Chat, Streaming, Tools, JSON Mode |
deepseek/deepseek-chat | DeepSeek | $0.27 | $1.10 | 128K | Chat, Streaming, Tools, JSON Mode |
deepseek/deepseek-reasoner | DeepSeek | $0.55 | $2.19 | 128K | Chat, Streaming, Thinking |
google/gemini-2.5-pro | $1.25 | $10.00 | 1M | Chat, Streaming, Tools, Vision, Thinking, JSON Mode, Cache | |
google/gemini-2.5-flash | $0.15 | $0.60 | 1M | Chat, Streaming, Tools, Vision, Thinking, JSON Mode, Cache | |
meta/llama-4-maverick | Meta | $0.20 | $0.60 | 128K | Chat, Streaming, Tools, Vision |
alibaba/qwen3-32b | Alibaba | $0.10 | $0.30 | 32K | Chat, Streaming, Tools |
bytedance/seed-1.6 | ByteDance | $0.50 | $2.00 | 128K | Chat, Streaming, Tools, Vision |
TIP
Prices are in USD and reflect SandBase's pass-through pricing. Actual costs may vary slightly based on provider pricing changes.
Models by Provider
OpenAI
| Model | Best For | Context | Reasoning |
|---|---|---|---|
openai/gpt-4o | General-purpose, multimodal tasks | 128K | No |
openai/gpt-4o-mini | Simple tasks, high-volume workloads | 128K | No |
openai/o3 | Complex reasoning, math, coding | 200K | Yes |
openai/o3-mini | Budget reasoning tasks | 200K | Yes |
Notes:
- o3 and o3-mini support
reasoning_effortparameter (low,medium,high) - Reasoning tokens are billed at the output token rate
- All OpenAI models support automatic prompt caching (50% discount on cache hits)
Anthropic
| Model | Best For | Context | Reasoning |
|---|---|---|---|
anthropic/claude-sonnet-4 | Coding, analysis, complex instructions | 200K | Yes (extended thinking) |
anthropic/claude-3.5-haiku | Fast responses, simple tasks | 200K | No |
Notes:
- Claude Sonnet 4 supports extended thinking with configurable
budget_tokens - Anthropic models support explicit prompt caching with
cache_control(90% discount on reads) - Available via both
/v1/chat/completionsand/v1/messagesendpoints
Google
| Model | Best For | Context | Reasoning |
|---|---|---|---|
google/gemini-2.5-pro | Long-context tasks, multimodal | 1M | Yes |
google/gemini-2.5-flash | Fast, cost-effective general use | 1M | Yes |
Notes:
- 1 million token context window — largest available
- Both models support thinking/reasoning
- Gemini 2.5 Flash offers the best price-to-performance ratio for many tasks
DeepSeek
| Model | Best For | Context | Reasoning |
|---|---|---|---|
deepseek/deepseek-v3 | General-purpose, coding | 128K | No |
deepseek/deepseek-chat | Conversational AI | 128K | No |
deepseek/deepseek-reasoner | Step-by-step reasoning | 128K | Yes |
Notes:
- Extremely cost-effective — among the cheapest models available
- DeepSeek Reasoner uses chain-of-thought reasoning (similar to o3)
- Automatic prompt caching with 90% discount on cache hits
Meta (Open Source)
| Model | Best For | Context | Reasoning |
|---|---|---|---|
meta/llama-4-maverick | General-purpose, open-source alternative | 128K | No |
Notes:
- Open-source model hosted by multiple providers
- Good balance of quality and cost
- Supports vision (image understanding)
Alibaba
| Model | Best For | Context | Reasoning |
|---|---|---|---|
alibaba/qwen3-32b | Budget tasks, multilingual | 32K | No |
Notes:
- Lowest cost model available on SandBase
- Strong multilingual capabilities (especially Chinese)
- 32K context window (smaller than other models)
ByteDance
| Model | Best For | Context | Reasoning |
|---|---|---|---|
bytedance/seed-1.6 | Multimodal, creative tasks | 128K | No |
Notes:
- Strong multimodal capabilities
- Good for creative writing and image understanding
Choosing a Model
By Use Case
| Use Case | Recommended | Why |
|---|---|---|
| Chat interface | openai/gpt-4o-mini or anthropic/claude-3.5-haiku | Fast, cheap, good quality |
| Code generation | anthropic/claude-sonnet-4 | Best coding performance |
| Complex reasoning | openai/o3 or google/gemini-2.5-pro | Deep thinking capabilities |
| Budget reasoning | openai/o3-mini or deepseek/deepseek-reasoner | Reasoning at lower cost |
| Long documents | google/gemini-2.5-pro or google/gemini-2.5-flash | 1M context window |
| High volume / classification | openai/gpt-4o-mini or alibaba/qwen3-32b | Lowest cost per token |
| Multimodal (images) | openai/gpt-4o or google/gemini-2.5-pro | Best vision capabilities |
By Budget
| Budget Level | Models | Typical Cost per 1K requests |
|---|---|---|
| Ultra-low | alibaba/qwen3-32b, openai/gpt-4o-mini | $0.01–$0.05 |
| Low | deepseek/deepseek-v3, google/gemini-2.5-flash | $0.05–$0.20 |
| Medium | anthropic/claude-3.5-haiku, openai/gpt-4o | $0.20–$2.00 |
| High | anthropic/claude-sonnet-4, openai/o3 | $2.00–$20.00 |
Model Availability
All models are available 24/7 through SandBase. If a specific provider experiences downtime, SandBase's routing system automatically falls back to alternative providers when available.
Check real-time model status on the SandBase Status Page.

