Your First API Call

This guide walks through the anatomy of a SandBase API request and response in detail. By the end, you'll understand every field in the request body, how to read the response, and how to use streaming.

Request Anatomy

Every request to SandBase's LLM Gateway follows this structure:

bash

POST https://api.sandbase.ai/v1/chat/completions

Required Headers

Header	Value	Description
`Authorization`	`Bearer sk-sb-YOUR_API_KEY`	Your SandBase API key
`Content-Type`	`application/json`	Request body format

Request Body

json

{
  "model": "deepseek/deepseek-v3",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "max_tokens": 256
}
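The endpoint, headers, and body above can be assembled with Python's standard library alone. The following is a minimal sketch that only builds the request object and does not send it; `build_request` is a hypothetical helper name, and the API key is a placeholder.

```python
import json
import urllib.request

API_URL = "https://api.sandbase.ai/v1/chat/completions"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Assemble (but do not send) a POST request with the two required headers."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "sk-sb-YOUR_API_KEY",  # placeholder key
    {
        "model": "deepseek/deepseek-v3",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is the capital of France?"},
        ],
        "temperature": 0.7,
        "max_tokens": 256,
    },
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would actually send it; the SDK examples later in this guide do the same work with less code.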

Body Parameters

Parameter	Type	Required	Description
`model`	string	Yes	The model to use (e.g., `deepseek/deepseek-v3`, `openai/gpt-4o`, `anthropic/claude-sonnet-4-20250514`)
`messages`	array	Yes	Conversation history as an array of message objects
`temperature`	number	No	Sampling temperature (0–2). Lower = more deterministic. Default varies by model.
`max_tokens`	integer	No	Maximum tokens to generate in the response
`top_p`	number	No	Nucleus sampling parameter (0–1)
`stream`	boolean	No	Whether to stream the response. Default: `false`
`stop`	string or array	No	Stop sequences — generation stops when these are encountered
`frequency_penalty`	number	No	Penalize tokens in proportion to how often they have already appeared (-2 to 2)
`presence_penalty`	number	No	Penalize any token that has already appeared, regardless of count (-2 to 2)
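The required fields and numeric ranges in this table can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, not part of any SandBase SDK; `check_body`, `RANGES`, and `REQUIRED` are hypothetical names, and the ranges are copied from the table.

```python
# Ranges copied from the parameter table; the helper itself is hypothetical.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
}
REQUIRED = ("model", "messages")


def check_body(body: dict) -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = [f"missing required field: {name}" for name in REQUIRED if name not in body]
    for name, (low, high) in RANGES.items():
        if name in body and not (low <= body[name] <= high):
            problems.append(f"{name}={body[name]} is outside {low} to {high}")
    return problems


problems = check_body({"model": "deepseek/deepseek-v3", "temperature": 2.5})
print(problems)
```

The server remains the authority on what is valid; a check like this only catches obvious mistakes earlier.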

Message Roles

Role	Purpose
`system`	Sets the assistant's behavior and personality
`user`	The human's input
`assistant`	Previous assistant responses (for multi-turn conversations)
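For a multi-turn conversation, each earlier assistant reply is sent back under the `assistant` role, followed by the next `user` message. A minimal sketch of that bookkeeping, where `add_turn` is a hypothetical helper and the follow-up question is made up for illustration:

```python
def add_turn(messages: list, assistant_reply: str, next_user_input: str) -> list:
    """Return a new history with the assistant's reply and the next user message appended."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_input},
    ]


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]
history = add_turn(history, "The capital of France is Paris.", "What is its population?")
print([m["role"] for m in history])
```

The resulting list is what would be passed as `messages` on the next request.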

Full Request Example

bash

curl https://api.sandbase.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-sb-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7,
    "max_tokens": 256
  }'

python

from openai import OpenAI

client = OpenAI(
    api_key="sk-sb-YOUR_API_KEY",
    base_url="https://api.sandbase.ai/v1"
)

response = client.chat.completions.create(
    model="deepseek/deepseek-v3",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    temperature=0.7,
    max_tokens=256
)

print(response.choices[0].message.content)

javascript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-sb-YOUR_API_KEY',
  baseURL: 'https://api.sandbase.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'deepseek/deepseek-v3',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of France?' },
  ],
  temperature: 0.7,
  max_tokens: 256,
});

console.log(response.choices[0].message.content);

Response Structure

Non-Streaming Response

json

{
  "id": "chatcmpl-abc123def456",
  "object": "chat.completion",
  "created": 1719000000,
  "model": "deepseek/deepseek-v3",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 8,
    "total_tokens": 32
  }
}

Response Fields Explained

Field	Description
`id`	Unique identifier for this completion
`object`	Always `"chat.completion"` for non-streaming responses
`created`	Unix timestamp of when the response was generated
`model`	The model that generated the response
`choices`	Array of completion choices (typically one)
`choices[].index`	Index of this choice in the array
`choices[].message.role`	Always `"assistant"`
`choices[].message.content`	The generated text
`choices[].finish_reason`	Why generation stopped (see below)
`usage.prompt_tokens`	Tokens in your input
`usage.completion_tokens`	Tokens in the generated output
`usage.total_tokens`	Sum of prompt + completion tokens
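When working with the raw JSON instead of an SDK object, the same fields are reached by key. The sketch below uses a trimmed copy of the sample response above as a plain dictionary; `first_text` is a hypothetical helper name.

```python
# Trimmed copy of the sample non-streaming response, as a plain dict.
response = {
    "id": "chatcmpl-abc123def456",
    "object": "chat.completion",
    "model": "deepseek/deepseek-v3",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "The capital of France is Paris."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 24, "completion_tokens": 8, "total_tokens": 32},
}


def first_text(resp: dict) -> str:
    """Return the generated text of the first choice."""
    return resp["choices"][0]["message"]["content"]


usage = response["usage"]
text = first_text(response)
# total_tokens is documented as the sum of prompt and completion tokens.
tokens_add_up = usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]
print(text)
print(tokens_add_up)
```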

Finish Reasons

Value	Meaning
`stop`	Natural end of response or hit a stop sequence
`length`	Hit `max_tokens` limit — response was truncated
`content_filter`	Content was filtered by safety systems
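Checking `finish_reason` before using the text helps catch truncated or filtered output. The following sketch shows one way to branch on the three values in the table; `describe_finish` and the wording of its messages are illustrative choices, not SandBase behavior.

```python
def describe_finish(choice: dict) -> str:
    """Map a choice's finish_reason to a short status string."""
    reason = choice.get("finish_reason")
    if reason == "stop":
        return "complete"
    if reason == "length":
        return "truncated: raise max_tokens or ask the model to continue"
    if reason == "content_filter":
        return "filtered: content was withheld by safety systems"
    # Anything else is not covered by the table above.
    return f"unrecognized finish_reason: {reason!r}"


status = describe_finish({"index": 0, "finish_reason": "length"})
print(status)
```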

Streaming Responses

For real-time output (like a chatbot typing), set `stream` to `true`. The response then arrives as Server-Sent Events (SSE):

python

from openai import OpenAI

client = OpenAI(
    api_key="sk-sb-YOUR_API_KEY",
    base_url="https://api.sandbase.ai/v1"
)

stream = client.chat.completions.create(
    model="deepseek/deepseek-v3",
    messages=[{"role": "user", "content": "Write a haiku about coding."}],
    stream=True
)

for chunk in stream:
    # Skip chunks that carry no choices or no text (e.g. the role-only first chunk).
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # newline at the end

javascript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-sb-YOUR_API_KEY',
  baseURL: 'https://api.sandbase.ai/v1',
});

const stream = await client.chat.completions.create({
  model: 'deepseek/deepseek-v3',
  messages: [{ role: 'user', content: 'Write a haiku about coding.' }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();

bash

curl https://api.sandbase.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-sb-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "deepseek/deepseek-v3",
    "messages": [{"role": "user", "content": "Write a haiku about coding."}],
    "stream": true
  }'

Streaming SSE Format

Each chunk arrives as a Server-Sent Event:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1719000000,"model":"deepseek/deepseek-v3","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1719000000,"model":"deepseek/deepseek-v3","choices":[{"index":0,"delta":{"content":"The"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1719000000,"model":"deepseek/deepseek-v3","choices":[{"index":0,"delta":{"content":" capital"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1719000000,"model":"deepseek/deepseek-v3","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]

Key points:

The first chunk contains `delta.role`, indicating that the assistant is responding
Subsequent chunks contain `delta.content` with text fragments
The final chunk has `finish_reason` set and an empty `delta`
The stream ends with `data: [DONE]`
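The points above can be turned into a small parser for cases where the stream is read without an SDK. This is a sketch that assumes each event is a single `data:` line, as in the sample; `collect_sse_text` is a hypothetical helper, and the sample lines are shortened versions of the chunks shown earlier.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_sse_text(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Join delta.content fragments until [DONE]; also return the last finish_reason seen."""
    parts = []
    finish_reason = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and anything that is not a data line
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason


sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"The"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":" capital"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
text, reason = collect_sse_text(sample)
print(repr(text), reason)
```

In a real client the lines would come from the HTTP response body rather than a list; the parsing logic stays the same.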

Using the Anthropic SDK

SandBase also exposes an Anthropic-compatible endpoint at `POST /v1/messages`. To use the Anthropic SDK, set its base URL to `https://api.sandbase.ai` (without the `/v1` suffix, since the SDK appends the path itself):

python

import anthropic

client = anthropic.Anthropic(
    api_key="sk-sb-YOUR_API_KEY",
    base_url="https://api.sandbase.ai"
)

message = client.messages.create(
    model="anthropic/claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
print(message.content[0].text)

javascript

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: 'sk-sb-YOUR_API_KEY',
  baseURL: 'https://api.sandbase.ai',
});

const message = await client.messages.create({
  model: 'anthropic/claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'What is the capital of France?' }],
});
console.log(message.content[0].text);

Anthropic Response Structure

The Anthropic-compatible endpoint returns responses in Anthropic's format:

json

{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "The capital of France is Paris."
    }
  ],
  "model": "anthropic/claude-sonnet-4-20250514",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 14,
    "output_tokens": 8
  }
}

Field	Description
`id`	Message ID (prefixed with `msg_`)
`type`	Always `"message"`
`role`	Always `"assistant"`
`content`	Array of content blocks (text blocks)
`model`	The model that generated the response
`stop_reason`	`"end_turn"` (natural stop), `"max_tokens"` (hit limit), or `"stop_sequence"`
`usage.input_tokens`	Tokens in your input
`usage.output_tokens`	Tokens in the generated output
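Because `content` is an array of blocks rather than a single string, reading the text from raw JSON means joining the text blocks. A minimal sketch over the sample response above, with `message_text` as a hypothetical helper name:

```python
# Trimmed copy of the sample Anthropic-format response, as a plain dict.
message = {
    "id": "msg_abc123",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "The capital of France is Paris."}],
    "model": "anthropic/claude-sonnet-4-20250514",
    "stop_reason": "end_turn",
    "stop_sequence": None,
    "usage": {"input_tokens": 14, "output_tokens": 8},
}


def message_text(msg: dict) -> str:
    """Concatenate the text of every text block in the message."""
    return "".join(block["text"] for block in msg["content"] if block.get("type") == "text")


answer = message_text(message)
was_truncated = message["stop_reason"] == "max_tokens"
print(answer)
print(was_truncated)
```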

Anthropic Streaming

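Streaming with the Anthropic SDK uses the `client.messages.stream()` helper as a context manager and iterates over its `text_stream`. Alternatively, pass `stream=True` to `messages.create` to iterate over raw events: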

python

import anthropic

client = anthropic.Anthropic(
    api_key="sk-sb-YOUR_API_KEY",
    base_url="https://api.sandbase.ai"
)

with client.messages.stream(
    model="anthropic/claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a haiku about coding."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
print()

Choosing a Model

When selecting a model, consider:

Speed: DeepSeek V3 and GPT-4o Mini are fast and cheap for simple tasks
Quality: Claude Sonnet and GPT-4o excel at complex reasoning
Cost: Check the Models page for per-token pricing
Context window: Some models support up to 200K tokens of context

You can switch models by changing the `model` parameter; no other code changes are needed.
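As an illustration of that point, the sketch below builds the same request body for three of the model IDs mentioned in this guide; only the `model` field differs. `make_body` is a hypothetical helper, and whether a given model is enabled for an account is not something this guide confirms.

```python
def make_body(model: str, prompt: str) -> dict:
    """Build a chat completion body for the given model ID."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


MODELS = (
    "deepseek/deepseek-v3",
    "openai/gpt-4o",
    "anthropic/claude-sonnet-4-20250514",
)
bodies = [make_body(model, "Write a haiku about coding.") for model in MODELS]
for body in bodies:
    print(body["model"])
```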

Next Steps

API Reference — Full endpoint documentation with all parameters
Streaming Guide — Advanced streaming patterns and error handling
Models — Complete list of available models with pricing
Error Handling — How to handle errors and implement retries
