Speech-to-Text

Endpoint

POST /v1/audio/transcriptions

Transcribes audio to text with automatic provider fallback. The API is OpenAI-compatible: any client that works with OpenAI’s audio transcriptions API can use this gateway by changing its base URL.

Provider Fallback Chain

The gateway tries each provider in order until one succeeds:

1. Groq Whisper (primary): whisper-large-v3-turbo, whisper-large-v3. The fastest option, with the best quality available on the free tier.
2. Cloudflare Workers AI (fallback): @cf/openai/whisper. No external key required.
3. Gemini audio understanding (last resort): gemini-2.5-flash with audio input. Useful when Groq is rate-limited and you still need transcription.

With model: "auto" (or omitted), the gateway picks the healthiest provider based on live success rate and remaining daily headroom. You can also pin a specific model.
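The in-order fallback described in this section can be sketched as a simple loop. This is an illustration under stated assumptions, not the gateway's actual implementation: `transcribeWithFallback` and the `{ name, transcribe }` provider shape are hypothetical names, and the health-aware ordering used by `auto` is not modelled (the caller is assumed to pass providers already in the order they should be tried).

```javascript
// Hypothetical sketch of sequential provider fallback.
// `providers` is an ordered array of { name, transcribe } objects; this
// shape is an assumption for illustration, not part of the gateway's API.
async function transcribeWithFallback(providers, audio) {
  const errors = [];
  for (const provider of providers) {
    try {
      // The first provider that resolves wins; later ones are never called.
      const text = await provider.transcribe(audio);
      return { provider: provider.name, text };
    } catch (err) {
      // Record the failure and move on to the next provider in the chain.
      errors.push(`${provider.name}: ${err.message}`);
    }
  }
  // Every provider failed: surface all collected reasons in one error.
  throw new Error(`All providers failed: ${errors.join('; ')}`);
}
```

Pinning a specific model would correspond to passing a single-element list, in which case a failure is reported immediately instead of falling through to another provider.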

Request

Send the request body as multipart/form-data with the following fields.

| Field | Type | Default | Description |
|---|---|---|---|
| `file` | file | required | Audio file to transcribe. Supported formats: mp3, mp4, wav, webm, m4a. |
| `model` | string | `whisper-large-v3-turbo` | Whisper / audio model. Use `auto` for health-aware fallback across Groq → Workers AI → Gemini. |
| `language` | string | none | ISO-639-1 language code (e.g. `en`, `es`, `fr`). Improves accuracy when specified. |
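A client can check these fields before uploading, which avoids spending a request on input the gateway is likely to reject. The sketch below is a minimal example: the field names and supported formats come from the list above, while the helper names `checkTranscriptionFields` and `buildTranscriptionForm` and the exact validation rules are assumptions made for illustration. The language check only tests the two-letter shape of an ISO-639-1 code, not whether the code is actually assigned.

```javascript
// Field names and supported formats come from the request fields above; the
// helper names and validation rules are assumptions for illustration.
const SUPPORTED_FORMATS = ['mp3', 'mp4', 'wav', 'webm', 'm4a'];

function checkTranscriptionFields(filename, { model, language } = {}) {
  // Take the text after the last dot as the extension ('' if there is none).
  const dot = filename.lastIndexOf('.');
  const extension = dot === -1 ? '' : filename.slice(dot + 1).toLowerCase();
  if (!SUPPORTED_FORMATS.includes(extension)) {
    throw new Error(`Unsupported audio format: "${extension || 'none'}"`);
  }
  // Shape check only: two lowercase letters, as in ISO-639-1 codes.
  if (language !== undefined && !/^[a-z]{2}$/.test(language)) {
    throw new Error(`Expected a two-letter language code, got "${language}"`);
  }
  // Only include optional fields the caller actually set, so the gateway's
  // own defaults apply to everything else.
  const fields = {};
  if (model !== undefined) fields.model = model;
  if (language !== undefined) fields.language = language;
  return fields;
}

// Build the multipart body from a Blob or File plus the validated text fields.
function buildTranscriptionForm(file, filename, options) {
  const form = new FormData();
  form.append('file', file, filename);
  const fields = checkTranscriptionFields(filename, options);
  for (const [name, value] of Object.entries(fields)) {
    form.append(name, value);
  }
  return form;
}
```

The resulting form can be passed directly as the `body` of a `fetch` POST request, for example `buildTranscriptionForm(blob, 'recording.mp3', { language: 'en' })`.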

Response

{
  "text": "The transcribed text appears here."
}

Free Limits

2,000 requests/day (Groq free tier)
8 hours of audio/day (Groq free tier)
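Both limits apply at once, so whichever runs out first determines when Groq stops accepting requests for the day. Eight hours is 28,800 seconds, and 28,800 / 2,000 = 14.4, so clips averaging longer than 14.4 seconds exhaust the audio allowance before the request count. The sketch below shows one way a client might track this locally; the usage counters and function names are hypothetical, and nothing here implies that the gateway reports usage back to the caller.

```javascript
// Limits taken from the Groq free tier figures above.
const DAILY_REQUEST_LIMIT = 2000;
const DAILY_AUDIO_SECONDS = 8 * 60 * 60; // 8 hours = 28,800 seconds

// `usage` is a hypothetical client-side counter: { requestsUsed, audioSecondsUsed }.
function remainingQuota({ requestsUsed, audioSecondsUsed }) {
  return {
    requests: Math.max(0, DAILY_REQUEST_LIMIT - requestsUsed),
    audioSeconds: Math.max(0, DAILY_AUDIO_SECONDS - audioSecondsUsed),
  };
}

// True only if one more request of `clipSeconds` fits under both limits.
function fitsInQuota(usage, clipSeconds) {
  const left = remainingQuota(usage);
  return left.requests >= 1 && left.audioSeconds >= clipSeconds;
}
```

For example, after 500 requests totalling one hour of audio, `remainingQuota` reports 1,500 requests and 25,200 seconds left.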

Examples

# curl: -F sends the fields as multipart/form-data and implies POST.
curl https://your-gateway.workers.dev/v1/audio/transcriptions \
  -F file=@recording.mp3 \
  -F model=whisper-large-v3-turbo \
  -F language=en

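// fetch: audioFile is a File, for example from an <input type="file"> element.
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', 'whisper-large-v3-turbo');

// Do not set Content-Type manually; fetch adds the multipart boundary itself.
const response = await fetch('https://your-gateway.workers.dev/v1/audio/transcriptions', {
  method: 'POST',
  body: formData,
});

// Fail loudly on HTTP errors instead of trying to read `text` from an error body.
if (!response.ok) {
  throw new Error(`Transcription failed with HTTP ${response.status}`);
}

const { text } = await response.json();
console.log(text);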

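// Node.js: use the official OpenAI SDK with the gateway as its base URL.
import OpenAI from 'openai';
import fs from 'node:fs';

const client = new OpenAI({
  baseURL: 'https://your-gateway.workers.dev/v1',
  apiKey: 'anything', // placeholder; the SDK requires a non-empty key
});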

const transcription = await client.audio.transcriptions.create({
  file: fs.createReadStream('recording.mp3'),
  model: 'whisper-large-v3-turbo',
});

console.log(transcription.text);