API Reference

Complete endpoint documentation

Generate Speech

POST /api/v1/tts 🔒 Auth required

Submit text for speech synthesis. Returns immediately with a job_id for polling. Supports idempotency keys for safe retries.

Request Body

Field	Type	Description
text*	string	Text to convert (up to 500,000 characters)
voice*	string	Voice ID in provider:voiceName format (e.g., google:en-US-Chirp3-HD-Charon). Non-Google providers return PROVIDER_DISABLED (400).
model_type*	string	"premium" or "ultra"
model	string	Provider-specific ultra model ID (for model_type="ultra"). Example: gemini-2.5-flash-preview-tts.
prompt	string	Optional style/context prompt. In the current deployment, prompt is enabled only for short Google requests with model_type="ultra"; on long requests it returns PROMPT_NOT_SUPPORTED (400). Limits: prompt ≤ 4000 bytes, combined text + prompt ≤ 8000 bytes. Billing counts text + prompt.
speed	number	Speech rate 0.25-4.0 (default: 1.0)
format	string	"text", "ssml", or "markup" (default: "text"). "markup" is available only for Premium Chirp voices.
output_format	string	"wav", "mp3", or "ogg_opus" (default: wav)
sample_rate_hertz	integer	Output sample rate in Hz. See Format Rules below.
output_bitrate_kbps	integer	Output bitrate in kbps. Only for mp3/ogg_opus. Invalid for wav. Default: mp3=128, ogg_opus=64 if omitted.
speaker_type	string	"single" or "multi" (ultra only)
voice_speaker_1	string	First speaker voice (multi only)
voice_speaker_2	string	Second speaker voice (multi only)
webhook_url	string	URL for completion callback (enterprise)
metadata	object	Custom metadata returned in webhook
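
The field rules in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name validate_tts_request and its messages are hypothetical, and it covers only the required fields, the text length, the voice prefix, model_type, and the speed range. The server remains the authority on what is valid.

```python
from typing import Any

REQUIRED_FIELDS = ("text", "voice", "model_type")
MODEL_TYPES = ("premium", "ultra")
MAX_TEXT_CHARS = 500_000


def validate_tts_request(body: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a request body (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not body.get(field):
            problems.append(f"missing required field: {field}")

    text = body.get("text") or ""
    if len(text) > MAX_TEXT_CHARS:
        problems.append("text exceeds 500,000 characters")

    voice = body.get("voice") or ""
    if voice and ":" not in voice:
        problems.append("voice must use provider:voiceName format")

    model_type = body.get("model_type")
    if model_type and model_type not in MODEL_TYPES:
        problems.append('model_type must be "premium" or "ultra"')

    # speed is optional; the documented default is 1.0
    speed = body.get("speed", 1.0)
    is_number = isinstance(speed, (int, float)) and not isinstance(speed, bool)
    if not is_number or not 0.25 <= speed <= 4.0:
        problems.append("speed must be between 0.25 and 4.0")
    return problems
```

An empty list means the body passed these local checks; it does not guarantee that the API will accept the request.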

Example Request

curl -X POST https://aitts.theproductivepixel.com/api/v1/tts \
  -H "Authorization: Bearer tts_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: unique-request-id" \
  -d '{
    "text": "Hello world!",
    "voice": "google:en-US-Chirp3-HD-Charon",
    "model_type": "premium"
  }'

Response

{
  "success": true,
  "data": {
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "pending",
    "poll_url": "/api/v1/tts/550e8400-e29b-41d4-a716-446655440000",
    "chars_charged": 12
  }
}

💡 Tip: Use Idempotency-Key header for safe retries. Same key + same request = same response.
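
The curl example can also be expressed in Python using only the standard library. This is a sketch under assumptions: build_tts_request is a hypothetical helper, and generating the idempotency key as a UUID is one reasonable choice, not a requirement stated by this reference. The helper returns the key so that a retry can reuse it.

```python
import json
import uuid
import urllib.request

BASE_URL = "https://aitts.theproductivepixel.com"


def build_tts_request(api_key, body, idempotency_key=None):
    """Build (but do not send) a POST /api/v1/tts request.

    Returns the request and the idempotency key, so the caller can
    retry with the same key after a network failure.
    """
    key = idempotency_key or str(uuid.uuid4())  # assumption: any unique string works
    request = urllib.request.Request(
        BASE_URL + "/api/v1/tts",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Idempotency-Key": key,
        },
    )
    return request, key
```

To send it, pass the request to urllib.request.urlopen and parse the JSON response. On a timeout, rebuild the request with the same body and the returned key rather than generating a new one.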

Format Rules (Google)

wav — Sample rates: 8000, 12000, 16000, 22050, 24000, 44100, 48000 Hz. Bitrate not configurable.
mp3 — Sample rates: 8000, 12000, 16000, 22050, 24000, 44100, 48000 Hz. Bitrates: 32, 64, 96, 128, 160, 192, 256, 320 kbps. Default bitrate: 128 kbps if omitted.
ogg_opus — Sample rates: 8000, 12000, 16000, 24000, 48000 Hz. Bitrates: 6, 8, 12, 16, 24, 32, 48, 64, 96, 128, 160, 192, 256, 320 kbps. Default bitrate: 64 kbps if omitted.
sample_rate_hertz — Defaults to max allowed for the format if omitted.

Allowed values are capability-driven and can change per provider/model.
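
The format rules above can be mirrored in a small client-side lookup. The sketch below copies the values listed in this section; because allowed values can change per provider and model, treat the tables as a snapshot. The function name resolve_output_settings is hypothetical, and raising ValueError with the matching error code is an illustrative choice.

```python
COMMON_RATES = [8000, 12000, 16000, 22050, 24000, 44100, 48000]
SAMPLE_RATES = {
    "wav": COMMON_RATES,
    "mp3": COMMON_RATES,
    "ogg_opus": [8000, 12000, 16000, 24000, 48000],
}
BITRATES = {
    "mp3": [32, 64, 96, 128, 160, 192, 256, 320],
    "ogg_opus": [6, 8, 12, 16, 24, 32, 48, 64, 96, 128, 160, 192, 256, 320],
}
DEFAULT_BITRATE = {"mp3": 128, "ogg_opus": 64}


def resolve_output_settings(output_format="wav", sample_rate_hertz=None,
                            output_bitrate_kbps=None):
    """Apply the documented defaults and reject values outside the lists."""
    if output_format not in SAMPLE_RATES:
        raise ValueError("FORMAT_NOT_SUPPORTED")

    rates = SAMPLE_RATES[output_format]
    # Omitted sample rate defaults to the maximum allowed for the format.
    rate = max(rates) if sample_rate_hertz is None else sample_rate_hertz
    if rate not in rates:
        raise ValueError("INVALID_SAMPLE_RATE")

    if output_format == "wav":
        # Bitrate is not configurable for wav.
        if output_bitrate_kbps is not None:
            raise ValueError("INVALID_BITRATE")
        bitrate = None
    else:
        bitrate = (DEFAULT_BITRATE[output_format]
                   if output_bitrate_kbps is None else output_bitrate_kbps)
        if bitrate not in BITRATES[output_format]:
            raise ValueError("INVALID_BITRATE")

    return {
        "output_format": output_format,
        "sample_rate_hertz": rate,
        "output_bitrate_kbps": bitrate,
    }
```

For example, requesting ogg_opus at 22050 Hz is rejected locally, since 22050 is not in the ogg_opus list.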

Provider Note

Voice IDs must be provider-prefixed: google:en-US-Chirp3-HD-Charon
Non-Google providers return PROVIDER_DISABLED (400).

Long Audio Constraint

Google long audio synthesis uses LINEAR16 internally. Non-WAV outputs (mp3, ogg_opus) are converted server-side.

Get Job Status

GET /api/v1/tts/:job_id 🔒 Auth required

Check the status of a TTS job. Poll until status is 'completed' or 'failed'.

Example Request

curl https://aitts.theproductivepixel.com/api/v1/tts/550e8400-e29b-41d4-a716-446655440000 \
  -H "Authorization: Bearer tts_YOUR_KEY"

Response

{
  "success": true,
  "data": {
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "completed",
    "created_at": "2026-01-07T10:00:00.000Z",
    "audio_url": "https://storage.googleapis.com/...",
    "audio_url_expires_at": "2026-01-08T10:00:00.000Z",
    "chars_charged": 12
  }
}

Status values:

pending - Job queued
processing - Audio being generated
completed - Audio ready (check audio_url)
failed - Generation failed (check error)
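
Polling until a terminal status can be sketched as a small loop. The interval and attempt cap below are assumptions, since this reference does not recommend specific values, and poll_job is a hypothetical helper. The HTTP call is passed in as fetch_status so the loop itself stays independent of any HTTP library.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def poll_job(fetch_status: Callable[[], dict],
             interval_seconds: float = 2.0,   # assumed, not documented
             max_attempts: int = 30,          # assumed, not documented
             sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status until the job is completed or failed.

    fetch_status must return the parsed JSON body of
    GET /api/v1/tts/:job_id. Returns the "data" object.
    """
    for attempt in range(max_attempts):
        response = fetch_status()
        if not response.get("success"):
            raise RuntimeError(f"status request failed: {response.get('error')}")
        data = response["data"]
        if data["status"] in TERMINAL_STATUSES:
            return data
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError("job did not reach a terminal status in time")
```

The caller then checks data["status"]: for completed, download audio_url before audio_url_expires_at; for failed, inspect the error information.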

Status Response Fields

Field	Type	Description
provider	string|null	Provider used (e.g., google). Null for old jobs.
model_requested	string|null	Model requested before access control. Null for premium or old jobs.
model_effective	string|null	Model used after access control (may differ from requested).
output_format	string|null	Output audio format (wav, mp3, ogg_opus). Null for old jobs.
sample_rate_hertz	integer|null	Output sample rate in Hz. Null if not specified.
output_bitrate_kbps	integer|null	Effective output bitrate in kbps (includes defaults: mp3=128, ogg_opus=64). Null for wav or old jobs.
prompt_bytes	integer	Prompt size in bytes. 0 if no prompt.
chars_charged	integer	Characters billed (text + prompt for short requests).

Prompt Constraints

Current deployment: prompt is enabled for Google short requests with model_type="ultra".
Long requests: prompt currently returns PROMPT_NOT_SUPPORTED (400).
Limits: text ≤ 4000 bytes, prompt ≤ 4000 bytes, combined ≤ 8000 bytes.
Billing: chars_charged includes text + prompt bytes for short requests.
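
Because the prompt limits are expressed in bytes rather than characters, non-ASCII text can hit them sooner than expected. The sketch below measures sizes as UTF-8, which is an assumption (this reference does not name the encoding), and check_prompt_limits is a hypothetical helper. How the API classifies a request as short or long is not specified here, so only the prompt and combined limits are applied.

```python
from typing import Optional

MAX_PROMPT_BYTES = 4000
MAX_COMBINED_BYTES = 8000


def check_prompt_limits(text: str, prompt: str = "") -> Optional[str]:
    """Return the error code the limits would trigger, or None if within them."""
    text_bytes = len(text.encode("utf-8"))      # assumption: UTF-8 byte count
    prompt_bytes = len(prompt.encode("utf-8"))
    if prompt_bytes > MAX_PROMPT_BYTES:
        return "PROMPT_TOO_LONG"
    if prompt and text_bytes + prompt_bytes > MAX_COMBINED_BYTES:
        return "COMBINED_TOO_LONG"
    return None
```

For instance, a prompt of 2,001 "é" characters is 4,002 bytes in UTF-8 and would exceed the prompt limit even though it is well under 4,000 characters.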

List Voices

GET /api/v1/voices 🔒 Auth required

Get available voices. Supports filtering and pagination. Use ETag for efficient caching.

Query Parameters

Field	Type	Description
language	query	Filter by language (e.g., en-US)
model_type	query	Filter by "premium" or "ultra"
provider	query	Filter by provider (e.g., google)
limit	query	Max results (default: all, max: 500)
offset	query	Pagination offset

Example Request

curl "https://aitts.theproductivepixel.com/api/v1/voices?language=en-US&model_type=premium" \
  -H "Authorization: Bearer tts_YOUR_KEY"

Response

{
  "success": true,
  "data": {
    "voices": [
      {
        "voice_id": "google:en-US-Chirp3-HD-Charon",
        "provider": "google",
        "language": "en-US",
        "name": "en-US-Chirp3-HD-Charon",
        "model_type": "premium",
        "gender": "male",
        "sample_url": null,
        "supports_ssml": false,
        "supports_markup": true,
        "supports_multispeaker": false
      }
    ],
    "total_voices": 150,
    "voices_version": "a1b2c3d4e5f6"
  }
}
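
Building the filtered voices URL and a cache-aware header set can be sketched as follows. Both helper names are hypothetical. If-None-Match is the standard HTTP header for revalidating a cached ETag; that this endpoint answers it with 304 Not Modified is an assumption based on the ETag note above.

```python
from urllib.parse import urlencode

BASE_URL = "https://aitts.theproductivepixel.com"
MAX_LIMIT = 500


def build_voices_url(language=None, model_type=None, provider=None,
                     limit=None, offset=None):
    """Build the GET /api/v1/voices URL, omitting filters that are not set."""
    if limit is not None and limit > MAX_LIMIT:
        raise ValueError("limit must not exceed 500")
    candidates = {
        "language": language,
        "model_type": model_type,
        "provider": provider,
        "limit": limit,
        "offset": offset,
    }
    params = {name: value for name, value in candidates.items() if value is not None}
    query = urlencode(params)
    return f"{BASE_URL}/api/v1/voices" + (f"?{query}" if query else "")


def voices_headers(api_key, etag=None):
    """Auth header, plus If-None-Match when a cached ETag is available."""
    headers = {"Authorization": f"Bearer {api_key}"}
    if etag:
        headers["If-None-Match"] = etag
    return headers
```

To page through results, increase offset by limit on each call until the number of voices collected reaches total_voices.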

Get Usage

GET /api/v1/usage 🔒 Auth required

Get your API usage statistics. Response varies by account type.

Example Request

curl https://aitts.theproductivepixel.com/api/v1/usage \
  -H "Authorization: Bearer tts_YOUR_KEY"

Response

// Standard account
{
  "success": true,
  "data": {
    "account_type": "standard",
    "credits_balance": 50.00
  }
}

// Enterprise account
{
  "success": true,
  "data": {
    "account_type": "enterprise",
    "period_start": "2026-01-01T00:00:00.000Z",
    "period_end": "2026-01-31T23:59:59.999Z",
    "characters_used": 5000000,
    "quota_limit": 30000000,
    "quota_remaining": 25000000,
    "request_count": 1250,
    "overage_rate_per_1m": 38
  }
}
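
Since the response shape depends on account_type, client code needs to branch on it before reading fields. A minimal sketch, using only the fields shown in the two examples above; summarize_usage is a hypothetical helper, and the output wording is illustrative.

```python
def summarize_usage(response: dict) -> str:
    """Turn a parsed GET /api/v1/usage response into a one-line summary."""
    data = response["data"]
    if data["account_type"] == "enterprise":
        used_fraction = data["characters_used"] / data["quota_limit"]
        return (f"enterprise: {data['quota_remaining']} characters remaining "
                f"({used_fraction:.1%} of quota used)")
    # Standard accounts report a credit balance instead of a quota.
    return f"{data['account_type']}: {data['credits_balance']:.2f} credits"
```

Branching on account_type first avoids a KeyError when a standard account response lacks the quota fields.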

Error Responses

All errors follow this format:

{
  "success": false,
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable description"
  }
}

Error	HTTP	Description
UNAUTHORIZED	401	Invalid or missing API key
FORBIDDEN	403	Key lacks required permissions
VALIDATION_ERROR	400	Invalid request parameters
INVALID_VOICE	400	Voice ID invalid or not found
INVALID_MODEL	400	Model not found or not supported
PROVIDER_DISABLED	400	Provider not currently enabled in this deployment
FORMAT_NOT_SUPPORTED	400	Output format not supported by provider
INVALID_SAMPLE_RATE	400	Sample rate not supported for format
INVALID_BITRATE	400	Bitrate not supported or invalid for format
PROMPT_NOT_SUPPORTED	400	Prompt not enabled for this request path/provider/model in the current deployment.
PROMPT_TOO_LONG	400	Prompt exceeds 4,000 byte limit
COMBINED_TOO_LONG	400	Text + prompt exceeds 8,000 byte combined limit
INVALID_CONTENT_TYPE	415	Content-Type must be application/json
REQUEST_IN_PROGRESS	409	Request with this idempotency key is still processing
IDEMPOTENCY_KEY_REUSE	409	Idempotency key already used with different request
INSUFFICIENT_CREDITS	402	Not enough credits for request
QUOTA_EXCEEDED	402	Enterprise quota exceeded
RATE_LIMIT_EXCEEDED	429	Too many requests
JOB_NOT_FOUND	404	Job ID not found or not owned by you
GENERATION_FAILED	500	Audio generation failed
MAINTENANCE	503	API temporarily disabled. Retry after 300s.
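
One way to act on these codes is to separate errors worth retrying from errors that need a changed request. The sketch below is a client-side policy, not documented behavior: only the 300-second delay for MAINTENANCE comes from the table, while the other delays and the choice of which codes to retry are assumptions. retry_delay_seconds is a hypothetical helper.

```python
from typing import Optional

RETRY_DELAYS = {
    "MAINTENANCE": 300.0,         # from the table: "Retry after 300s"
    "REQUEST_IN_PROGRESS": 2.0,   # assumed delay
    "RATE_LIMIT_EXCEEDED": 60.0,  # assumed delay; prefer X-RateLimit-Reset if usable
}


def retry_delay_seconds(response: dict) -> Optional[float]:
    """Return how long to wait before retrying, or None if a retry is pointless.

    Expects the parsed JSON body in the documented error format.
    """
    if response.get("success"):
        return None
    code = response.get("error", {}).get("code")
    return RETRY_DELAYS.get(code)
```

Validation errors such as INVALID_VOICE or PROMPT_TOO_LONG return None here, since repeating the same request would fail the same way.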

Rate Limits

Tier	Requests	Window
Free	10	15 minutes
Premium	100	15 minutes
Enterprise	1,000	15 minutes

Rate limit headers are included in all responses: X-RateLimit-Remaining, X-RateLimit-Reset
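
The table translates into an average request spacing per tier, which a client can use for simple pacing. This sketch only does the arithmetic; whether the server uses a fixed or sliding window is not stated here, so spacing requests evenly is a conservative assumption rather than a guarantee. The names below are hypothetical.

```python
WINDOW_SECONDS = 15 * 60  # every tier uses a 15-minute window

TIER_LIMITS = {
    "free": 10,
    "premium": 100,
    "enterprise": 1000,
}


def average_spacing_seconds(tier: str) -> float:
    """Seconds between requests if the tier's allowance is spread evenly."""
    return WINDOW_SECONDS / TIER_LIMITS[tier]
```

A free-tier client would wait about 90 seconds between requests, a premium client about 9 seconds, and an enterprise client just under 1 second. Reading X-RateLimit-Remaining from each response gives a more accurate picture than pacing alone.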
