Audio Studio
Audio Studio combines three tools in one page: Text to Speech, Speech to Speech, and Voice Design. Switch between modes using the pill buttons at the top of the left panel.
All Audio Studio features require a paid plan (Starter, Pro, or Business). Free accounts will see an upgrade prompt.
Mode 1 — Text to Speech
1 credit per 100 characters
Convert any written script into natural-sounding speech using 15+ multilingual AI voices.
1. Navigate to Create → Audio in the sidebar and select Text to Speech.
2. Type or paste your script (minimum 3 characters, maximum 5,000 characters). The credit counter updates in real time.
3. Select a voice from the dropdown. Stock voices show gender and accent; your saved custom voices appear under My Voices.
4. Click the play icon next to the dropdown to preview a stock voice before committing.
5. Click Generate Audio. The job appears in the queue on the right.
6. Once complete, the MP3 plays inline. Click Download MP3 to save the file.
Generate TTS audio, then feed it into a Lip Sync job to produce a talking-head video from a single portrait photo — no camera needed.
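To budget credits before pasting a long script, the Text to Speech pricing and length limits can be sketched in a few lines of Python. This is an illustration only, not Audio Studio's own code: the function name `estimate_tts_credits` is hypothetical, and it assumes that a partial block of 100 characters is rounded up to a full credit and that characters are counted the way Python's `len` counts them. The in-app credit counter remains the authoritative figure.

```python
import math

MIN_CHARS = 3           # minimum script length stated in step 2
MAX_CHARS = 5_000       # maximum script length stated in step 2
CHARS_PER_CREDIT = 100  # 1 credit per 100 characters


def estimate_tts_credits(script: str) -> int:
    """Estimate the credit cost of a script.

    ASSUMPTION: a partial block of 100 characters is billed as a full
    credit (rounding up). The real rounding rule may differ.
    """
    length = len(script)
    if not MIN_CHARS <= length <= MAX_CHARS:
        raise ValueError(
            f"Script must be {MIN_CHARS} to {MAX_CHARS} characters, got {length}"
        )
    return math.ceil(length / CHARS_PER_CREDIT)
```

Under these assumptions, a script at the 5,000-character maximum would cost 50 credits.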
Mode 2 — Speech to Speech
15 credits per conversion
Upload any audio recording and convert the voice inside it to a completely different AI voice — same timing, same delivery, new voice identity.
1. Select Speech to Speech from the mode selector.
2. Upload an audio file (MP3, WAV, M4A, or WebM). Maximum duration is 3 minutes.
3. Select a Target Voice: choose any stock voice or one of your saved custom voices.
4. Click Convert Voice (15 credits). The job is queued and typically completes in 30–60 seconds.
5. Once complete, the converted MP3 plays inline and can be downloaded.
Speech to Speech is charged at a flat 15 credits regardless of audio length (up to the 3-minute cap). Credits are refunded automatically if the conversion fails.
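If you prepare recordings in bulk, a quick local pre-check against the documented upload limits can save a failed attempt. The sketch below is a hypothetical helper, not part of Audio Studio: `check_speech_to_speech_upload` is an invented name, it judges the format by file extension only (the service may inspect the file contents instead), and it expects you to supply the duration yourself.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".webm"}  # formats listed in step 2
MAX_DURATION_SECONDS = 3 * 60                            # 3-minute cap
FLAT_COST_CREDITS = 15                                   # flat price per conversion


def check_speech_to_speech_upload(filename: str, duration_seconds: float) -> int:
    """Return the credit cost if the file fits the documented limits.

    ASSUMPTION: the extension is a good enough proxy for the format.
    Raises ValueError when the file would be rejected by these rules.
    """
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported format: {extension or 'no extension'}")
    if duration_seconds <= 0 or duration_seconds > MAX_DURATION_SECONDS:
        raise ValueError(
            f"Duration must be over 0 and at most {MAX_DURATION_SECONDS} seconds"
        )
    # The cost does not depend on length, so a 10-second clip and a
    # 3-minute clip both return the same value.
    return FLAT_COST_CREDITS
```

Because the price is flat, combining short takes into one file under the 3-minute cap costs less than converting each take separately.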
Mode 3 — Voice Design
Free to generate previews
Describe a voice in plain text and the AI generates three unique voice previews for you to audition. Save your favourite as a custom voice to use in Text to Speech and Speech to Speech.
1. Select Voice Design from the mode selector.
2. Write a Voice Description (minimum 10 characters) covering gender, age, accent, tone, and delivery style. Example: 'Young American female, warm and energetic, perfect for lifestyle content'.
3. Write or edit the Sample Text (minimum 100 characters). This is what each preview will say aloud.
4. Click Generate Voice Previews. Three previews are generated (free).
5. Click the play button on each preview to listen. Click a preview card to select it.
6. Enter a name for the voice and click Save to My Voices.
7. The saved voice immediately appears in the Text to Speech and Speech to Speech voice dropdowns.
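The two minimum lengths for Voice Design are easy to miss when drafting text outside the app. As a rough illustration, the check below applies the stated minimums to a description and sample text. The function name `voice_design_problems` is hypothetical, and counting characters with Python's `len` is an assumption about how the form counts them.

```python
MIN_DESCRIPTION_CHARS = 10   # minimum Voice Description length
MIN_SAMPLE_TEXT_CHARS = 100  # minimum Sample Text length


def voice_design_problems(description: str, sample_text: str) -> list[str]:
    """Return a list of problems; an empty list means both inputs pass.

    ASSUMPTION: lengths are counted with len(), including spaces.
    """
    problems = []
    if len(description) < MIN_DESCRIPTION_CHARS:
        problems.append(
            f"Voice Description needs at least {MIN_DESCRIPTION_CHARS} characters"
        )
    if len(sample_text) < MIN_SAMPLE_TEXT_CHARS:
        problems.append(
            f"Sample Text needs at least {MIN_SAMPLE_TEXT_CHARS} characters"
        )
    return problems
```

The example description from step 2 passes the first check comfortably; the Sample Text minimum is the one more likely to trip up a short draft.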
| Plan | Custom Voice Slots |
|---|---|
| Starter | 2 saved voices |
| Pro | 5 saved voices |
| Business | 10 saved voices |
To free up a slot, go to the Voice Design tab → My Voices panel on the right → click the trash icon next to the voice you want to remove.
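The slot limits in the table can be expressed as a simple lookup, which may help when planning how many custom voices a team can keep. This is a sketch with hypothetical names (`CUSTOM_VOICE_SLOTS`, `remaining_voice_slots`); the numbers come from the table above, and everything else is an assumption for illustration.

```python
# Slot limits copied from the plan table above.
CUSTOM_VOICE_SLOTS = {"Starter": 2, "Pro": 5, "Business": 10}


def remaining_voice_slots(plan: str, saved_voices: int) -> int:
    """Return how many more custom voices can be saved on a plan."""
    if plan not in CUSTOM_VOICE_SLOTS:
        raise ValueError(f"No custom voice slots listed for plan: {plan}")
    if saved_voices < 0:
        raise ValueError("saved_voices cannot be negative")
    # Never report a negative number, even if the count exceeds the limit.
    return max(CUSTOM_VOICE_SLOTS[plan] - saved_voices, 0)
```

A result of 0 means a saved voice has to be deleted before another one can be stored.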
Output format
All audio is generated and delivered as MP3 at 44,100 Hz / 128 kbps — compatible with every major platform, video editor, and social media tool.
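The 128 kbps bitrate also gives a quick way to estimate download sizes. The sketch below assumes a constant bitrate and ignores any metadata or container overhead, so real files may differ slightly; the function name `estimate_mp3_bytes` is hypothetical.

```python
BITRATE_KBPS = 128  # output bitrate stated above


def estimate_mp3_bytes(duration_seconds: float) -> int:
    """Approximate MP3 size in bytes.

    ASSUMPTION: constant bitrate, no metadata or container overhead.
    """
    bits = BITRATE_KBPS * 1000 * duration_seconds  # kilobits -> bits
    return round(bits / 8)                         # bits -> bytes
```

Under these assumptions, one minute of audio is roughly 960,000 bytes, and a clip at the 3-minute Speech to Speech cap is roughly 2.9 MB.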