Skip to content

Supported Models

SandBase provides access to 1400+ models from 50+ providers through a single API. All models are available via the OpenAI-compatible endpoint (/v1/chat/completions) and, where applicable, the Anthropic-compatible endpoint (/v1/messages).

TIP

Browse the full model catalog with live pricing at sandbase.ai/models.

Model Comparison Table

ModelProviderInput (per 1M tokens)Output (per 1M tokens)Context WindowKey Capabilities
openai/gpt-4oOpenAI$2.50$10.00128KChat, Streaming, Tools, Vision, JSON Mode, Cache
openai/gpt-4o-miniOpenAI$0.15$0.60128KChat, Streaming, Tools, Vision, JSON Mode, Cache
openai/o3OpenAI$10.00$40.00200KChat, Streaming, Tools, Vision, Thinking, JSON Mode
openai/o3-miniOpenAI$1.10$4.40200KChat, Streaming, Tools, Thinking, JSON Mode
anthropic/claude-sonnet-4Anthropic$3.00$15.00200KChat, Streaming, Tools, Vision, Thinking, Cache
anthropic/claude-3.5-haikuAnthropic$0.80$4.00200KChat, Streaming, Tools, Vision, Cache
deepseek/deepseek-v3DeepSeek$0.27$1.10128KChat, Streaming, Tools, JSON Mode
deepseek/deepseek-chatDeepSeek$0.27$1.10128KChat, Streaming, Tools, JSON Mode
deepseek/deepseek-reasonerDeepSeek$0.55$2.19128KChat, Streaming, Thinking
google/gemini-2.5-proGoogle$1.25$10.001MChat, Streaming, Tools, Vision, Thinking, JSON Mode, Cache
google/gemini-2.5-flashGoogle$0.15$0.601MChat, Streaming, Tools, Vision, Thinking, JSON Mode, Cache
meta/llama-4-maverickMeta$0.20$0.60128KChat, Streaming, Tools, Vision
alibaba/qwen3-32bAlibaba$0.10$0.3032KChat, Streaming, Tools
bytedance/seed-1.6ByteDance$0.50$2.00128KChat, Streaming, Tools, Vision

TIP

Prices are in USD and reflect SandBase's pass-through pricing. Actual costs may vary slightly based on provider pricing changes.

Models by Provider

OpenAI

ModelBest ForContextReasoning
openai/gpt-4oGeneral-purpose, multimodal tasks128KNo
openai/gpt-4o-miniSimple tasks, high-volume workloads128KNo
openai/o3Complex reasoning, math, coding200KYes
openai/o3-miniBudget reasoning tasks200KYes

Notes:

  • o3 and o3-mini support reasoning_effort parameter (low, medium, high)
  • Reasoning tokens are billed at the output token rate
  • All OpenAI models support automatic prompt caching (50% discount on cache hits)

Anthropic

ModelBest ForContextReasoning
anthropic/claude-sonnet-4Coding, analysis, complex instructions200KYes (extended thinking)
anthropic/claude-3.5-haikuFast responses, simple tasks200KNo

Notes:

  • Claude Sonnet 4 supports extended thinking with configurable budget_tokens
  • Anthropic models support explicit prompt caching with cache_control (90% discount on reads)
  • Available via both /v1/chat/completions and /v1/messages endpoints

Google

ModelBest ForContextReasoning
google/gemini-2.5-proLong-context tasks, multimodal1MYes
google/gemini-2.5-flashFast, cost-effective general use1MYes

Notes:

  • 1 million token context window — largest available
  • Both models support thinking/reasoning
  • Gemini 2.5 Flash offers the best price-to-performance ratio for many tasks

DeepSeek

ModelBest ForContextReasoning
deepseek/deepseek-v3General-purpose, coding128KNo
deepseek/deepseek-chatConversational AI128KNo
deepseek/deepseek-reasonerStep-by-step reasoning128KYes

Notes:

  • Extremely cost-effective — among the cheapest models available
  • DeepSeek Reasoner uses chain-of-thought reasoning (similar to o3)
  • Automatic prompt caching with 90% discount on cache hits

Meta (Open Source)

ModelBest ForContextReasoning
meta/llama-4-maverickGeneral-purpose, open-source alternative128KNo

Notes:

  • Open-source model hosted by multiple providers
  • Good balance of quality and cost
  • Supports vision (image understanding)

Alibaba

ModelBest ForContextReasoning
alibaba/qwen3-32bBudget tasks, multilingual32KNo

Notes:

  • Lowest cost model available on SandBase
  • Strong multilingual capabilities (especially Chinese)
  • 32K context window (smaller than other models)

ByteDance

ModelBest ForContextReasoning
bytedance/seed-1.6Multimodal, creative tasks128KNo

Notes:

  • Strong multimodal capabilities
  • Good for creative writing and image understanding

Choosing a Model

By Use Case

Use CaseRecommendedWhy
Chat interfaceopenai/gpt-4o-mini or anthropic/claude-3.5-haikuFast, cheap, good quality
Code generationanthropic/claude-sonnet-4Best coding performance
Complex reasoningopenai/o3 or google/gemini-2.5-proDeep thinking capabilities
Budget reasoningopenai/o3-mini or deepseek/deepseek-reasonerReasoning at lower cost
Long documentsgoogle/gemini-2.5-pro or google/gemini-2.5-flash1M context window
High volume / classificationopenai/gpt-4o-mini or alibaba/qwen3-32bLowest cost per token
Multimodal (images)openai/gpt-4o or google/gemini-2.5-proBest vision capabilities

By Budget

Budget LevelModelsTypical Cost per 1K requests
Ultra-lowalibaba/qwen3-32b, openai/gpt-4o-mini$0.01–$0.05
Lowdeepseek/deepseek-v3, google/gemini-2.5-flash$0.05–$0.20
Mediumanthropic/claude-3.5-haiku, openai/gpt-4o$0.20–$2.00
Highanthropic/claude-sonnet-4, openai/o3$2.00–$20.00

Model Availability

All models are available 24/7 through SandBase. If a specific provider experiences downtime, SandBase's routing system automatically falls back to alternative providers when available.

Check real-time model status on the SandBase Status Page.