API Reference
Complete endpoint documentation
Generate Speech
POST
/api/v1/tts
Auth required
Submit text for speech synthesis. Returns immediately with a job_id for polling. Supports idempotency keys for safe retries.
Request Body
| Field | Type | Description |
|---|---|---|
| text* | string | Text to convert (up to 500,000 characters) |
| voice* | string | Voice ID in provider:voiceName format (e.g., google:en-US-Chirp3-HD-Charon). Non-Google providers return PROVIDER_DISABLED (400). |
| model_type* | string | "premium" or "ultra" |
| model | string | Provider-specific ultra model ID (for model_type="ultra"). Example: gemini-2.5-flash-preview-tts. |
| prompt | string | Optional style/context prompt. Current deployment: enabled for Google + model_type="ultra" short requests. For long requests, prompt currently returns PROMPT_NOT_SUPPORTED (400). Limits: prompt ≤ 4000 bytes, combined text+prompt ≤ 8000 bytes. Billing uses text + prompt. |
| speed | number | Speech rate 0.25-4.0 (default: 1.0) |
| format | string | "text", "ssml", or "markup" (default: text). "markup" is accepted only for Premium Chirp voices. |
| output_format | string | "wav", "mp3", or "ogg_opus" (default: wav) |
| sample_rate_hertz | integer | Output sample rate in Hz. See Format Rules below. |
| output_bitrate_kbps | integer | Output bitrate in kbps. Only for mp3/ogg_opus. Invalid for wav. Default: mp3=128, ogg_opus=64 if omitted. |
| speaker_type | string | "single" or "multi" (ultra only) |
| voice_speaker_1 | string | First speaker voice (multi only) |
| voice_speaker_2 | string | Second speaker voice (multi only) |
| webhook_url | string | URL for completion callback (enterprise) |
| metadata | object | Custom metadata returned in webhook |
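The field constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_tts_payload` is a hypothetical helper name, and counting characters with Python's `len()` is an assumption, since the server's exact counting method is not documented here.

```python
ALLOWED_MODEL_TYPES = {"premium", "ultra"}
ALLOWED_OUTPUT_FORMATS = {"wav", "mp3", "ogg_opus"}
MAX_TEXT_CHARS = 500_000  # from the request body table


def build_tts_payload(text, voice, model_type, **optional):
    """Return a request body dict; raise ValueError on obviously invalid input.

    Hypothetical client-side helper. The server remains the authority on
    validation; this only mirrors the documented limits.
    """
    if not text or len(text) > MAX_TEXT_CHARS:
        raise ValueError("text must be 1 to 500,000 characters")
    provider, sep, voice_name = voice.partition(":")
    if not sep or not provider or not voice_name:
        raise ValueError("voice must use provider:voiceName format")
    if model_type not in ALLOWED_MODEL_TYPES:
        raise ValueError('model_type must be "premium" or "ultra"')
    speed = optional.get("speed")
    if speed is not None and not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    output_format = optional.get("output_format")
    if output_format is not None and output_format not in ALLOWED_OUTPUT_FORMATS:
        raise ValueError("output_format must be wav, mp3, or ogg_opus")
    payload = {"text": text, "voice": voice, "model_type": model_type}
    # Send only the optional fields the caller set, so server defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

The returned dict can be serialized with `json.dumps` and sent as the body of the POST request shown in the example that follows.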
Example Request
curl -X POST https://aitts.theproductivepixel.com/api/v1/tts \
-H "Authorization: Bearer tts_YOUR_KEY" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: unique-request-id" \
-d '{
"text": "Hello world!",
"voice": "google:en-US-Chirp3-HD-Charon",
"model_type": "premium"
}'
Response
{
"success": true,
"data": {
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "pending",
"poll_url": "/api/v1/tts/550e8400-e29b-41d4-a716-446655440000",
"chars_charged": 12
}
}
Tip: Use the Idempotency-Key header for safe retries. Same key + same request = same response.
Format Rules (Google)
- wav: sample rates 8000, 12000, 16000, 22050, 24000, 44100, 48000 Hz. Bitrate not configurable.
- mp3: sample rates 8000, 12000, 16000, 22050, 24000, 44100, 48000 Hz. Bitrates: 32, 64, 96, 128, 160, 192, 256, 320 kbps. Default bitrate: 128 kbps if omitted.
- ogg_opus: sample rates 8000, 12000, 16000, 24000, 48000 Hz. Bitrates: 6, 8, 12, 16, 24, 32, 48, 64, 96, 128, 160, 192, 256, 320 kbps. Default bitrate: 64 kbps if omitted.
- sample_rate_hertz: defaults to the maximum allowed for the format if omitted.
Allowed values are capability-driven and can change per provider/model.
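The format rules above can be mirrored in a small lookup table to catch INVALID_SAMPLE_RATE and INVALID_BITRATE before sending a request. This is a sketch under assumptions: `FORMAT_RULES` and `resolve_audio_settings` are hypothetical names, and the values are a snapshot of the lists above, which may change per provider or model.

```python
COMMON_RATES = (8000, 12000, 16000, 22050, 24000, 44100, 48000)

# Snapshot of the Google format rules listed above (subject to change).
FORMAT_RULES = {
    "wav": {"sample_rates": COMMON_RATES, "bitrates": (), "default_bitrate": None},
    "mp3": {
        "sample_rates": COMMON_RATES,
        "bitrates": (32, 64, 96, 128, 160, 192, 256, 320),
        "default_bitrate": 128,
    },
    "ogg_opus": {
        "sample_rates": (8000, 12000, 16000, 24000, 48000),
        "bitrates": (6, 8, 12, 16, 24, 32, 48, 64, 96, 128, 160, 192, 256, 320),
        "default_bitrate": 64,
    },
}


def resolve_audio_settings(output_format="wav", sample_rate_hertz=None,
                           output_bitrate_kbps=None):
    """Apply the documented defaults and reject unsupported combinations."""
    rules = FORMAT_RULES.get(output_format)
    if rules is None:
        raise ValueError("FORMAT_NOT_SUPPORTED")
    if sample_rate_hertz is None:
        # Documented default: the maximum rate allowed for the format.
        sample_rate_hertz = max(rules["sample_rates"])
    elif sample_rate_hertz not in rules["sample_rates"]:
        raise ValueError("INVALID_SAMPLE_RATE")
    if not rules["bitrates"]:
        # wav: bitrate is not configurable, so any explicit value is invalid.
        if output_bitrate_kbps is not None:
            raise ValueError("INVALID_BITRATE")
    elif output_bitrate_kbps is None:
        output_bitrate_kbps = rules["default_bitrate"]
    elif output_bitrate_kbps not in rules["bitrates"]:
        raise ValueError("INVALID_BITRATE")
    return {
        "output_format": output_format,
        "sample_rate_hertz": sample_rate_hertz,
        "output_bitrate_kbps": output_bitrate_kbps,
    }
```

For example, `resolve_audio_settings("mp3")` resolves to 48000 Hz at 128 kbps, while requesting 22050 Hz with `ogg_opus` is rejected because that rate is not in the ogg_opus list.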
Provider Note
- Voice IDs must be provider-prefixed, e.g. google:en-US-Chirp3-HD-Charon.
- Non-Google providers return PROVIDER_DISABLED (400).
Long Audio Constraint
- Google long audio synthesis uses LINEAR16 internally. Non-WAV outputs (mp3, ogg_opus) are converted server-side.
Get Job Status
GET
/api/v1/tts/:job_id
Auth required
Check the status of a TTS job. Poll until status is 'completed' or 'failed'.
Example Request
curl https://aitts.theproductivepixel.com/api/v1/tts/550e8400-e29b-41d4-a716-446655440000 \
-H "Authorization: Bearer tts_YOUR_KEY"
Response
{
"success": true,
"data": {
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"created_at": "2026-01-07T10:00:00.000Z",
"audio_url": "https://storage.googleapis.com/...",
"audio_url_expires_at": "2026-01-08T10:00:00.000Z",
"chars_charged": 12
}
}
Status values:
- pending: Job queued
- processing: Audio being generated
- completed: Audio ready (check audio_url)
- failed: Generation failed (check error)
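Polling until a terminal status could look like the sketch below. It is illustrative only: `wait_for_job` is a hypothetical helper, `fetch_status` is a caller-supplied function assumed to perform the GET request and return the parsed JSON body, and the 2-second interval and 300-second timeout are arbitrary choices, not documented recommendations.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(fetch_status, job_id, interval_s=2.0, timeout_s=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll a job until its status is 'completed' or 'failed'.

    fetch_status(job_id) is assumed to return the decoded JSON response
    of GET /api/v1/tts/:job_id. Returns the "data" object of the final
    response, or raises TimeoutError if the job does not finish in time.
    """
    deadline = clock() + timeout_s
    while True:
        body = fetch_status(job_id)
        if not body.get("success"):
            # Error envelope (e.g. JOB_NOT_FOUND): stop polling.
            raise RuntimeError(body.get("error"))
        data = body["data"]
        if data["status"] in TERMINAL_STATUSES:
            return data
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {data['status']}")
        sleep(interval_s)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to exercise in tests without real waiting. A caller would check `data["status"]` on return and read `audio_url` only when it is `completed`.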
Status Response Fields
| Field | Type | Description |
|---|---|---|
| provider | string or null | Provider used (e.g., google). Null for old jobs. |
| model_requested | string or null | Model requested before access control. Null for premium or old jobs. |
| model_effective | string or null | Model used after access control (may differ from requested). |
| output_format | string or null | Output audio format (wav, mp3, ogg_opus). Null for old jobs. |
| sample_rate_hertz | integer or null | Output sample rate in Hz. Null if not specified. |
| output_bitrate_kbps | integer or null | Effective output bitrate in kbps (includes defaults: mp3=128, ogg_opus=64). Null for wav or old jobs. |
| prompt_bytes | integer | Prompt size in bytes. 0 if no prompt. |
| chars_charged | integer | Characters billed (text + prompt for short requests). |
Prompt Constraints
- Current deployment: prompt is enabled for Google short requests with model_type="ultra".
- Long requests: prompt currently returns PROMPT_NOT_SUPPORTED (400).
- Limits: text ≤ 4000 bytes, prompt ≤ 4000 bytes, combined ≤ 8000 bytes.
- Billing: chars_charged includes text + prompt bytes for short requests.
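Because the prompt limits above are expressed in bytes rather than characters, non-ASCII text can exceed them sooner than its character count suggests. The sketch below measures sizes before sending; it assumes the server counts UTF-8 bytes, which is not stated here, and `prompt_limit_violations` is a hypothetical helper name.

```python
MAX_TEXT_BYTES = 4000
MAX_PROMPT_BYTES = 4000
MAX_COMBINED_BYTES = 8000


def prompt_limit_violations(text, prompt):
    """Return the list of limits exceeded by a short request with a prompt.

    Sizes are measured as UTF-8 bytes (an assumption about how the
    server counts). An empty list means all three limits are met.
    """
    text_bytes = len(text.encode("utf-8"))
    prompt_bytes = len(prompt.encode("utf-8"))
    violations = []
    if text_bytes > MAX_TEXT_BYTES:
        violations.append("text")
    if prompt_bytes > MAX_PROMPT_BYTES:
        violations.append("prompt")
    if text_bytes + prompt_bytes > MAX_COMBINED_BYTES:
        violations.append("combined")
    return violations
```

For instance, a text of 2,001 "é" characters occupies 4,002 bytes in UTF-8, so it exceeds the text limit even though it is well under 4,000 characters.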
List Voices
GET
/api/v1/voices
Auth required
Get available voices. Supports filtering and pagination. Use ETag for efficient caching.
Query Parameters
| Field | Type | Description |
|---|---|---|
| language | query | Filter by language (e.g., en-US) |
| model_type | query | Filter by "premium" or "ultra" |
| provider | query | Filter by provider (e.g., google) |
| limit | query | Max results (default: all, max: 500) |
| offset | query | Pagination offset |
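The filters and pagination parameters above are passed in the query string. As a rough sketch, the URL can be assembled with the standard library; `voices_url` and `page_offsets` are hypothetical helper names, and treating `limit` as a positive integer is an assumption beyond the documented maximum of 500.

```python
from urllib.parse import urlencode

BASE_URL = "https://aitts.theproductivepixel.com"


def voices_url(language=None, model_type=None, provider=None,
               limit=None, offset=None):
    """Build the List Voices URL from the documented query parameters."""
    if limit is not None and not 1 <= limit <= 500:
        raise ValueError("limit must be between 1 and 500")
    params = {
        "language": language,
        "model_type": model_type,
        "provider": provider,
        "limit": limit,
        "offset": offset,
    }
    # Drop unset filters so the server applies its own defaults.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/api/v1/voices" + (f"?{query}" if query else "")


def page_offsets(total_voices, limit):
    """Offsets needed to page through total_voices results, limit at a time."""
    return list(range(0, total_voices, limit))
```

With `total_voices` of 150 and a `limit` of 50, `page_offsets` yields offsets 0, 50, and 100, one request per page.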
Example Request
curl "https://aitts.theproductivepixel.com/api/v1/voices?language=en-US&model_type=premium" \
-H "Authorization: Bearer tts_YOUR_KEY"
Response
{
"success": true,
"data": {
"voices": [
{
"voice_id": "google:en-US-Chirp3-HD-Charon",
"provider": "google",
"language": "en-US",
"name": "en-US-Chirp3-HD-Charon",
"model_type": "premium",
"gender": "male",
"sample_url": null,
"supports_ssml": false,
"supports_markup": true,
"supports_multispeaker": false
}
],
"total_voices": 150,
"voices_version": "a1b2c3d4e5f6"
}
}
Get Usage
GET
/api/v1/usage
Auth required
Get your API usage statistics. Response varies by account type.
Example Request
curl https://aitts.theproductivepixel.com/api/v1/usage \
-H "Authorization: Bearer tts_YOUR_KEY"
Response
// Standard account
{
"success": true,
"data": {
"account_type": "standard",
"credits_balance": 50.00
}
}
// Enterprise account
{
"success": true,
"data": {
"account_type": "enterprise",
"period_start": "2026-01-01T00:00:00.000Z",
"period_end": "2026-01-31T23:59:59.999Z",
"characters_used": 5000000,
"quota_limit": 30000000,
"quota_remaining": 25000000,
"request_count": 1250,
"overage_rate_per_1m": 38
}
}
Error Responses
All errors follow this format:
{
"success": false,
"error": {
"code": "ERROR_CODE",
"message": "Human-readable description"
}
}
| Error | HTTP | Description |
|---|---|---|
| UNAUTHORIZED | 401 | Invalid or missing API key |
| FORBIDDEN | 403 | Key lacks required permissions |
| VALIDATION_ERROR | 400 | Invalid request parameters |
| INVALID_VOICE | 400 | Voice ID invalid or not found |
| INVALID_MODEL | 400 | Model not found or not supported |
| PROVIDER_DISABLED | 400 | Provider not currently enabled in this deployment |
| FORMAT_NOT_SUPPORTED | 400 | Output format not supported by provider |
| INVALID_SAMPLE_RATE | 400 | Sample rate not supported for format |
| INVALID_BITRATE | 400 | Bitrate not supported or invalid for format |
| PROMPT_NOT_SUPPORTED | 400 | Prompt not enabled for this request path/provider/model in the current deployment. |
| PROMPT_TOO_LONG | 400 | Prompt exceeds 4,000 byte limit |
| COMBINED_TOO_LONG | 400 | Text + prompt exceeds 8,000 byte combined limit |
| INVALID_CONTENT_TYPE | 415 | Content-Type must be application/json |
| REQUEST_IN_PROGRESS | 409 | Request with this idempotency key is still processing |
| IDEMPOTENCY_KEY_REUSE | 409 | Idempotency key already used with different request |
| INSUFFICIENT_CREDITS | 402 | Not enough credits for request |
| QUOTA_EXCEEDED | 402 | Enterprise quota exceeded |
| RATE_LIMIT_EXCEEDED | 429 | Too many requests |
| JOB_NOT_FOUND | 404 | Job ID not found or not owned by you |
| GENERATION_FAILED | 500 | Audio generation failed |
| MAINTENANCE | 503 | API temporarily disabled. Retry after 300s. |
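A client usually needs to decide which of these errors are worth retrying. The sketch below shows one possible policy, not documented behavior: `classify_error` and its category names are hypothetical, and treating REQUEST_IN_PROGRESS, RATE_LIMIT_EXCEEDED, and MAINTENANCE as retryable is an assumption based on their descriptions in the table.

```python
# Codes that describe a temporary condition (assumed safe to retry later).
RETRY_LATER_CODES = {"REQUEST_IN_PROGRESS", "RATE_LIMIT_EXCEEDED", "MAINTENANCE"}


def classify_error(http_status, body):
    """Map an error response to a coarse client-side action.

    body is the decoded JSON error envelope shown above. Returns one of
    "retry_later", "fix_account", "fix_request", or "investigate".
    """
    code = body.get("error", {}).get("code", "")
    if code in RETRY_LATER_CODES:
        return "retry_later"
    if http_status in (401, 402, 403):
        # Key, permission, credit, or quota problems.
        return "fix_account"
    if http_status in (400, 409, 415):
        # Includes IDEMPOTENCY_KEY_REUSE: same key, different request.
        return "fix_request"
    return "investigate"
```

Checking the error code before the HTTP status matters here because 409 covers both REQUEST_IN_PROGRESS, which may clear on its own, and IDEMPOTENCY_KEY_REUSE, which will not.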
Rate Limits
| Tier | Requests | Window |
|---|---|---|
| Free | 10 | 15 minutes |
| Premium | 100 | 15 minutes |
| Enterprise | 1,000 | 15 minutes |
Rate limit headers are included in all responses: X-RateLimit-Remaining, X-RateLimit-Reset
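The tier limits above translate into an average request spacing that keeps a client under the limit. The following sketch does that arithmetic and reads the remaining-request header; `min_request_spacing` and `remaining_requests` are hypothetical helpers, the header value is assumed to be a plain integer, and X-RateLimit-Reset is left unparsed because its format is not specified here.

```python
RATE_LIMITS = {"free": 10, "premium": 100, "enterprise": 1000}
WINDOW_SECONDS = 15 * 60  # every tier uses a 15-minute window


def min_request_spacing(tier):
    """Average seconds between requests that stays within the tier limit."""
    return WINDOW_SECONDS / RATE_LIMITS[tier]


def remaining_requests(headers):
    """Read X-RateLimit-Remaining from a header mapping, or None if absent.

    Header names are matched case-insensitively, since HTTP libraries
    differ in how they normalize them.
    """
    for name, value in headers.items():
        if name.lower() == "x-ratelimit-remaining":
            return int(value)
    return None
```

On the Free tier this works out to one request every 90 seconds on average, and on the Premium tier one every 9 seconds.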
© 2026 AI TTS Microservice. All rights reserved.