Audio — Text-to-Speech & Speech-to-Text

POST /v1/generations

Generate speech from text (TTS) or transcribe audio to text (STT).

Text-to-Speech

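python
# Assumes `client` is an HTTP client that resolves relative paths against the
# API base URL and sends your API key (for example, an httpx.Client created
# with base_url and an Authorization header).
response = client.post("/v1/generations", json={
    "model": "openai/tts-1",
    "input": "Hello, welcome to SandBase!",
    "voice": "alloy"
})
# The first output holds a URL pointing to the generated audio
print(response.json()["outputs"][0]["url"])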
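
Because the response carries a URL rather than the audio bytes, a client usually needs a second step to fetch the file. The following is a minimal sketch of that step using only the standard library. The helper names `first_output_url` and `save_audio` are hypothetical, and the check for a missing URL is an assumption about how a failed or empty response might look; only the `outputs[0]["url"]` shape comes from the example above.

```python
import shutil
import urllib.request
from typing import Any


def first_output_url(body: dict[str, Any]) -> str:
    """Return the URL of the first output, or raise if the body has none."""
    outputs = body.get("outputs") or []
    if not outputs or not outputs[0].get("url"):
        # Assumption: a body without outputs[0]["url"] means no audio was produced.
        raise ValueError("response has no audio URL in outputs[0]")
    return outputs[0]["url"]


def save_audio(url: str, path: str) -> None:
    """Stream the audio at `url` to a local file at `path`."""
    with urllib.request.urlopen(url) as resp, open(path, "wb") as fh:
        shutil.copyfileobj(resp, fh)
```

A call such as `save_audio(first_output_url(response.json()), "hello.mp3")` would then write the audio to disk. The `.mp3` extension is a guess; the audio format returned by the API is not stated here.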

Speech-to-Text

python
response = client.post("/v1/generations", json={
    "model": "openai/whisper-1",
    "audio_url": "https://example.com/audio.mp3"
})
# The first output holds the transcribed text
print(response.json()["outputs"][0]["text"])
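
The transcription call can be wrapped in a small function so that HTTP errors and empty results surface as exceptions instead of a `KeyError`. This is a sketch under assumptions: `transcribe` and `SupportsPost` are hypothetical names, the client is assumed to return an httpx- or requests-style response object with `raise_for_status()` and `json()`, and the error handling is illustrative rather than documented API behavior.

```python
from typing import Any, Protocol


class SupportsPost(Protocol):
    """Any client exposing post(url, json=...), such as an httpx.Client."""

    def post(self, url: str, *, json: dict[str, Any]) -> Any: ...


def transcribe(
    client: SupportsPost,
    audio_url: str,
    model: str = "openai/whisper-1",
) -> str:
    """Send an STT request and return the text of the first output."""
    response = client.post(
        "/v1/generations",
        json={"model": model, "audio_url": audio_url},
    )
    # Assumption: the response object follows the httpx/requests interface.
    response.raise_for_status()
    outputs = response.json().get("outputs") or []
    if not outputs or "text" not in outputs[0]:
        raise ValueError("response has no transcript in outputs[0]")
    return outputs[0]["text"]
```

Typing the client as a protocol keeps the function independent of any one HTTP library and makes it easy to exercise with a stub in tests.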

Available Models

| Model | Type | Provider |
| --- | --- | --- |
| openai/tts-1 | TTS | OpenAI |
| openai/tts-1-hd | TTS (HD) | OpenAI |
| openai/whisper-1 | STT | OpenAI |
| fish-audio/speech-1 | TTS | Fish Audio |
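
Since TTS and STT share one endpoint but take different fields, the model type can decide which request body to build. The sketch below encodes the table as a lookup; `MODEL_TYPES` and `build_payload` are hypothetical names. It assumes every TTS model accepts `input` and an optional `voice` as in the `openai/tts-1` example, which is not confirmed here for `fish-audio/speech-1`.

```python
from typing import Any, Optional

# Model types taken from the table above; the HD variant is treated as plain TTS.
MODEL_TYPES = {
    "openai/tts-1": "TTS",
    "openai/tts-1-hd": "TTS",
    "openai/whisper-1": "STT",
    "fish-audio/speech-1": "TTS",
}


def build_payload(
    model: str, source: str, voice: Optional[str] = None
) -> dict[str, Any]:
    """Build the JSON body for /v1/generations.

    `source` is the text to speak for TTS models, or the audio URL for STT models.
    """
    kind = MODEL_TYPES.get(model)
    if kind is None:
        raise ValueError(f"unknown audio model: {model}")
    if kind == "STT":
        return {"model": model, "audio_url": source}
    payload: dict[str, Any] = {"model": model, "input": source}
    if voice is not None:
        # Assumption: `voice` is optional and shared across TTS models.
        payload["voice"] = voice
    return payload
```

For example, `build_payload("openai/tts-1", "Hello", voice="alloy")` produces the same body shape as the Text-to-Speech request shown earlier.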

Browse all audio models