API Reference

Complete endpoint documentation

Generate Speech

POST /api/v1/tts 🔒 Auth required

Submit text for speech synthesis. Returns immediately with a job_id for polling. Supports idempotency keys for safe retries.

Request Body

| Field | Type | Description |
| --- | --- | --- |
| text* | string | Text to convert (up to 500,000 characters) |
| voice* | string | Voice ID in provider:voiceName format (e.g., google:en-US-Chirp3-HD-Charon). Non-Google providers return PROVIDER_DISABLED (400). |
| model_type* | string | "premium" or "ultra" |
| model | string | Provider-specific ultra model ID (for model_type="ultra"). Example: gemini-2.5-flash-preview-tts. |
| prompt | string | Optional style/context prompt. Current deployment: enabled for Google + model_type="ultra" short requests. For long requests, prompt currently returns PROMPT_NOT_SUPPORTED (400). Limits: prompt ≤ 4000 bytes, combined text+prompt ≤ 8000 bytes. Billing uses text + prompt. |
| speed | number | Speech rate 0.25-4.0 (default: 1.0) |
| format | string | "text" or "ssml" (default: text); "markup" only for Premium Chirp voices. |
| output_format | string | "wav", "mp3", or "ogg_opus" (default: wav) |
| sample_rate_hertz | integer | Output sample rate in Hz. See Format Rules below. |
| output_bitrate_kbps | integer | Output bitrate in kbps. Only for mp3/ogg_opus. Invalid for wav. Default: mp3=128, ogg_opus=64 if omitted. |
| speaker_type | string | "single" or "multi" (ultra only) |
| voice_speaker_1 | string | First speaker voice (multi only) |
| voice_speaker_2 | string | Second speaker voice (multi only) |
| webhook_url | string | URL for completion callback (enterprise) |
| metadata | object | Custom metadata returned in webhook |

Example Request
curl -X POST https://aitts.theproductivepixel.com/api/v1/tts \
  -H "Authorization: Bearer tts_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: unique-request-id" \
  -d '{
    "text": "Hello world!",
    "voice": "google:en-US-Chirp3-HD-Charon",
    "model_type": "premium"
  }'
Response
{
  "success": true,
  "data": {
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "pending",
    "poll_url": "/api/v1/tts/550e8400-e29b-41d4-a716-446655440000",
    "chars_charged": 12
  }
}
💡 Tip: Use the Idempotency-Key header for safe retries. Same key + same request = same response.
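
As a rough illustration of the retry pattern in the tip above, the following Python sketch builds the same request as the curl example using only the standard library. The helper name `build_tts_request` and the choice of a UUID as the key are assumptions made for this example; the documentation does not prescribe a key format.

```python
import json
import urllib.request
import uuid

# Host taken from the curl examples in this reference.
API_BASE = "https://aitts.theproductivepixel.com"


def build_tts_request(api_key, text, voice, model_type, idempotency_key=None):
    """Build (but do not send) a POST /api/v1/tts request.

    Hypothetical helper: returns the request and the key that was used,
    so the caller can pass the same key again when retrying.
    """
    # Assumption: any unique string is acceptable as a key; a UUID is one option.
    key = idempotency_key or str(uuid.uuid4())
    payload = {"text": text, "voice": voice, "model_type": model_type}
    request = urllib.request.Request(
        url=f"{API_BASE}/api/v1/tts",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Idempotency-Key": key,
        },
    )
    return request, key
```

The request can then be sent with `urllib.request.urlopen(request)`. On a retry, pass the returned key back in as `idempotency_key` together with an identical payload.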

Format Rules (Google)

  • wav: Sample rates: 8000, 12000, 16000, 22050, 24000, 44100, 48000 Hz. Bitrate not configurable.
  • mp3: Sample rates: 8000, 12000, 16000, 22050, 24000, 44100, 48000 Hz. Bitrates: 32, 64, 96, 128, 160, 192, 256, 320 kbps. Default bitrate: 128 kbps if omitted.
  • ogg_opus: Sample rates: 8000, 12000, 16000, 24000, 48000 Hz. Bitrates: 6, 8, 12, 16, 24, 32, 48, 64, 96, 128, 160, 192, 256, 320 kbps. Default bitrate: 64 kbps if omitted.
  • sample_rate_hertz: Defaults to the maximum allowed for the format if omitted.

Allowed values are capability-driven and can change per provider/model.
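
A client can mirror these rules locally to catch bad combinations before submitting a job. The sketch below hard-codes the Google values listed above; because allowed values can change per provider and model, treat the tables as a snapshot rather than a source of truth. The function name `resolve_audio_options` is hypothetical, and raising `ValueError` with the API's error code as the message is just a convention chosen for this example.

```python
# Snapshot of the Google format rules listed in this section.
SAMPLE_RATES = {
    "wav": [8000, 12000, 16000, 22050, 24000, 44100, 48000],
    "mp3": [8000, 12000, 16000, 22050, 24000, 44100, 48000],
    "ogg_opus": [8000, 12000, 16000, 24000, 48000],
}
BITRATES = {
    "mp3": [32, 64, 96, 128, 160, 192, 256, 320],
    "ogg_opus": [6, 8, 12, 16, 24, 32, 48, 64, 96, 128, 160, 192, 256, 320],
}
DEFAULT_BITRATE = {"mp3": 128, "ogg_opus": 64}


def resolve_audio_options(output_format="wav", sample_rate_hertz=None,
                          output_bitrate_kbps=None):
    """Apply the documented defaults and reject unsupported combinations."""
    if output_format not in SAMPLE_RATES:
        raise ValueError("FORMAT_NOT_SUPPORTED")

    # An omitted sample rate defaults to the maximum allowed for the format.
    rates = SAMPLE_RATES[output_format]
    rate = max(rates) if sample_rate_hertz is None else sample_rate_hertz
    if rate not in rates:
        raise ValueError("INVALID_SAMPLE_RATE")

    if output_format == "wav":
        # Bitrate is not configurable for wav, so any explicit value is invalid.
        if output_bitrate_kbps is not None:
            raise ValueError("INVALID_BITRATE")
        bitrate = None
    else:
        bitrate = (DEFAULT_BITRATE[output_format]
                   if output_bitrate_kbps is None else output_bitrate_kbps)
        if bitrate not in BITRATES[output_format]:
            raise ValueError("INVALID_BITRATE")

    return {
        "output_format": output_format,
        "sample_rate_hertz": rate,
        "output_bitrate_kbps": bitrate,
    }
```

For example, `resolve_audio_options("ogg_opus", 22050)` raises `ValueError("INVALID_SAMPLE_RATE")`, since 22050 Hz is listed for wav and mp3 but not for ogg_opus.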

Provider Note

  • Voice IDs must be provider-prefixed: google:en-US-Chirp3-HD-Charon
  • Non-Google providers return PROVIDER_DISABLED (400).

Long Audio Constraint

  • Google long audio synthesis uses LINEAR16 internally. Non-WAV outputs (mp3, ogg_opus) are converted server-side.

Get Job Status

GET /api/v1/tts/:job_id 🔒 Auth required

Check the status of a TTS job. Poll until status is 'completed' or 'failed'.

Example Request
curl https://aitts.theproductivepixel.com/api/v1/tts/550e8400-e29b-41d4-a716-446655440000 \
  -H "Authorization: Bearer tts_YOUR_KEY"
Response
{
  "success": true,
  "data": {
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "completed",
    "created_at": "2026-01-07T10:00:00.000Z",
    "audio_url": "https://storage.googleapis.com/...",
    "audio_url_expires_at": "2026-01-08T10:00:00.000Z",
    "chars_charged": 12
  }
}
Status values:
  • pending - Job queued
  • processing - Audio being generated
  • completed - Audio ready (check audio_url)
  • failed - Generation failed (check error)
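
The polling loop implied by these status values could look roughly like the sketch below. `fetch_status` is a hypothetical callable supplied by the caller that performs the GET request and returns the `data` object from the response; the interval and timeout defaults are arbitrary choices for this example, not values recommended by the API.

```python
import time


def wait_for_job(fetch_status, job_id, interval_seconds=2.0,
                 timeout_seconds=300.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal status, then return its data.

    fetch_status(job_id) is assumed to return the "data" dict of the
    Get Job Status response. sleep and clock are injectable for testing.
    """
    deadline = clock() + timeout_seconds
    while True:
        data = fetch_status(job_id)
        # "completed" and "failed" are the two terminal states.
        if data["status"] in ("completed", "failed"):
            return data
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {data['status']} after timeout")
        sleep(interval_seconds)
```

The caller still has to inspect the returned status: a `completed` job carries `audio_url`, while a `failed` job should be checked for its error.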

Status Response Fields

| Field | Type | Description |
| --- | --- | --- |
| provider | string or null | Provider used (e.g., google). Null for old jobs. |
| model_requested | string or null | Model requested before access control. Null for premium or old jobs. |
| model_effective | string or null | Model used after access control (may differ from requested). |
| output_format | string or null | Output audio format (wav, mp3, ogg_opus). Null for old jobs. |
| sample_rate_hertz | integer or null | Output sample rate in Hz. Null if not specified. |
| output_bitrate_kbps | integer or null | Effective output bitrate in kbps (includes defaults: mp3=128, ogg_opus=64). Null for wav or old jobs. |
| prompt_bytes | integer | Prompt size in bytes. 0 if no prompt. |
| chars_charged | integer | Characters billed (text + prompt for short requests). |

Prompt Constraints

  • Current deployment: prompt is enabled for Google short requests with model_type="ultra".
  • Long requests: prompt currently returns PROMPT_NOT_SUPPORTED (400).
  • Limits: text ≤ 4000 bytes, prompt ≤ 4000 bytes, combined ≤ 8000 bytes.
  • Billing: chars_charged includes text + prompt bytes for short requests.
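
Because these limits are stated in bytes rather than characters, non-ASCII input can exceed them sooner than a character count suggests. The sketch below measures both fields before submission. It assumes sizes are counted as UTF-8 bytes, which the documentation does not state explicitly, and `check_prompt_limits` is a hypothetical helper name.

```python
# Limits for short requests that carry a prompt, as listed above.
MAX_TEXT_BYTES = 4000
MAX_PROMPT_BYTES = 4000
MAX_COMBINED_BYTES = 8000


def check_prompt_limits(text, prompt):
    """Return byte sizes and the list of limits that would be exceeded."""
    # Assumption: the service counts UTF-8 encoded bytes.
    text_bytes = len(text.encode("utf-8"))
    prompt_bytes = len(prompt.encode("utf-8"))

    problems = []
    if text_bytes > MAX_TEXT_BYTES:
        problems.append("text")
    if prompt_bytes > MAX_PROMPT_BYTES:
        problems.append("prompt")
    if text_bytes + prompt_bytes > MAX_COMBINED_BYTES:
        problems.append("combined")

    return {
        "text_bytes": text_bytes,
        "prompt_bytes": prompt_bytes,
        "problems": problems,
    }
```

Under the UTF-8 assumption, a prompt of 2,001 "é" characters occupies 4,002 bytes and would be flagged even though it is well under 4,000 characters.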

List Voices

GET /api/v1/voices 🔒 Auth required

Get available voices. Supports filtering and pagination. Use ETag for efficient caching.

Query Parameters

| Field | Type | Description |
| --- | --- | --- |
| language | query | Filter by language (e.g., en-US) |
| model_type | query | Filter by "premium" or "ultra" |
| provider | query | Filter by provider (e.g., google) |
| limit | query | Max results (default: all, max: 500) |
| offset | query | Pagination offset |

Example Request
curl "https://aitts.theproductivepixel.com/api/v1/voices?language=en-US&model_type=premium" \
  -H "Authorization: Bearer tts_YOUR_KEY"
Response
{
  "success": true,
  "data": {
    "voices": [
      {
        "voice_id": "google:en-US-Chirp3-HD-Charon",
        "provider": "google",
        "language": "en-US",
        "name": "en-US-Chirp3-HD-Charon",
        "model_type": "premium",
        "gender": "male",
        "sample_url": null,
        "supports_ssml": false,
        "supports_markup": true,
        "supports_multispeaker": false
      }
    ],
    "total_voices": 150,
    "voices_version": "a1b2c3d4e5f6"
  }
}
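
To page through a large voice list, a client can combine `limit` and `offset` as in the sketch below. Both helper names are hypothetical. `fetch_page` is a caller-supplied callable that performs the request and returns the `data` object, and the loop assumes `total_voices` counts every voice matching the filters rather than only the current page, which the documentation does not confirm.

```python
from urllib.parse import urlencode


def voices_url(base, language=None, model_type=None, provider=None,
               limit=None, offset=None):
    """Build the List Voices URL, skipping parameters that were not set."""
    params = {
        "language": language,
        "model_type": model_type,
        "provider": provider,
        "limit": limit,
        "offset": offset,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base}/api/v1/voices" + (f"?{query}" if query else "")


def iter_voices(fetch_page, page_size=100):
    """Yield voices page by page.

    fetch_page(limit=..., offset=...) is assumed to return the "data" dict.
    """
    offset = 0
    while True:
        data = fetch_page(limit=page_size, offset=offset)
        voices = data["voices"]
        yield from voices
        offset += len(voices)
        # Stop on an empty page or once the reported total has been reached.
        if not voices or offset >= data["total_voices"]:
            break
```

With `language="en-US"` and `model_type="premium"`, `voices_url` reproduces the URL used in the curl example above.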

Get Usage

GET /api/v1/usage 🔒 Auth required

Get your API usage statistics. Response varies by account type.

Example Request
curl https://aitts.theproductivepixel.com/api/v1/usage \
  -H "Authorization: Bearer tts_YOUR_KEY"
Response
// Standard account
{
  "success": true,
  "data": {
    "account_type": "standard",
    "credits_balance": 50.00
  }
}

// Enterprise account
{
  "success": true,
  "data": {
    "account_type": "enterprise",
    "period_start": "2026-01-01T00:00:00.000Z",
    "period_end": "2026-01-31T23:59:59.999Z",
    "characters_used": 5000000,
    "quota_limit": 30000000,
    "quota_remaining": 25000000,
    "request_count": 1250,
    "overage_rate_per_1m": 38
  }
}
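
Since the payload shape depends on `account_type`, client code needs to branch on it before reading any balance fields. A minimal sketch, assuming the two shapes shown above are the only ones returned (`summarize_usage` is a hypothetical helper):

```python
def summarize_usage(body):
    """Reduce either usage response shape to a small, uniform summary."""
    data = body["data"]
    if data["account_type"] == "enterprise":
        # Enterprise accounts are metered in characters against a quota.
        return {
            "unit": "characters",
            "remaining": data["quota_remaining"],
            "used_fraction": data["characters_used"] / data["quota_limit"],
        }
    # Assumption: any other account type reports a credit balance,
    # as the standard example does.
    return {"unit": "credits", "remaining": data["credits_balance"]}
```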

Error Responses

All errors follow this format:

{
  "success": false,
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable description"
  }
}

| Error | HTTP | Description |
| --- | --- | --- |
| UNAUTHORIZED | 401 | Invalid or missing API key |
| FORBIDDEN | 403 | Key lacks required permissions |
| VALIDATION_ERROR | 400 | Invalid request parameters |
| INVALID_VOICE | 400 | Voice ID invalid or not found |
| INVALID_MODEL | 400 | Model not found or not supported |
| PROVIDER_DISABLED | 400 | Provider not currently enabled in this deployment |
| FORMAT_NOT_SUPPORTED | 400 | Output format not supported by provider |
| INVALID_SAMPLE_RATE | 400 | Sample rate not supported for format |
| INVALID_BITRATE | 400 | Bitrate not supported or invalid for format |
| PROMPT_NOT_SUPPORTED | 400 | Prompt not enabled for this request path/provider/model in the current deployment. |
| PROMPT_TOO_LONG | 400 | Prompt exceeds 4,000 byte limit |
| COMBINED_TOO_LONG | 400 | Text + prompt exceeds 8,000 byte combined limit |
| INVALID_CONTENT_TYPE | 415 | Content-Type must be application/json |
| REQUEST_IN_PROGRESS | 409 | Request with this idempotency key is still processing |
| IDEMPOTENCY_KEY_REUSE | 409 | Idempotency key already used with different request |
| INSUFFICIENT_CREDITS | 402 | Not enough credits for request |
| QUOTA_EXCEEDED | 402 | Enterprise quota exceeded |
| RATE_LIMIT_EXCEEDED | 429 | Too many requests |
| JOB_NOT_FOUND | 404 | Job ID not found or not owned by you |
| GENERATION_FAILED | 500 | Audio generation failed |
| MAINTENANCE | 503 | API temporarily disabled. Retry after 300s. |
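
One way a client might act on these codes is sketched below. Which errors are worth retrying is an interpretation made for this example, not something the API specifies: only the 300-second delay for MAINTENANCE comes from the table, and the fallback delay is an arbitrary default. Both function names are hypothetical.

```python
# Assumption: these codes describe temporary conditions worth retrying.
# GENERATION_FAILED is deliberately left out because the documentation
# does not say whether a retry can succeed.
TRANSIENT_CODES = {"REQUEST_IN_PROGRESS", "RATE_LIMIT_EXCEEDED", "MAINTENANCE"}


def parse_error(body):
    """Return (code, message) for an error body, or None on success."""
    if body.get("success"):
        return None
    error = body["error"]
    return error["code"], error["message"]


def retry_delay_seconds(code, default_delay=5.0):
    """Return how long to wait before retrying, or None if not retryable."""
    if code == "MAINTENANCE":
        return 300.0  # documented: "Retry after 300s"
    if code in TRANSIENT_CODES:
        return default_delay  # arbitrary fallback chosen for this sketch
    return None
```

A result of `None` from `retry_delay_seconds` signals that the request itself needs fixing (for example INVALID_VOICE or IDEMPOTENCY_KEY_REUSE), so repeating it unchanged would not help.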

Rate Limits

| Tier | Requests | Window |
| --- | --- | --- |
| Free | 10 | 15 minutes |
| Premium | 100 | 15 minutes |
| Enterprise | 1,000 | 15 minutes |

Rate limit headers are included in all responses: X-RateLimit-Remaining, X-RateLimit-Reset
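
A client can read these headers after each response to decide whether to slow down. The sketch below assumes X-RateLimit-Remaining is a plain integer. The format of X-RateLimit-Reset is not documented here, so it is returned unparsed. `read_rate_limit` is a hypothetical helper.

```python
def read_rate_limit(headers):
    """Extract rate limit information from a mapping of response headers."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    remaining = lowered.get("x-ratelimit-remaining")
    return {
        # Assumption: the remaining count is sent as an integer string.
        "remaining": int(remaining) if remaining is not None else None,
        # The reset format is unspecified, so keep the raw value.
        "reset_raw": lowered.get("x-ratelimit-reset"),
    }
```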

© 2026 AI TTS Microservice. All rights reserved.