Audio (Text-to-Speech) — StandIn Labs Docs

Audio Studio

Audio Studio combines three tools in one page: Text to Speech, Speech to Speech, and Voice Design. Switch between modes using the pill buttons at the top of the left panel.

All three Audio Studio modes (Text to Speech, Speech to Speech, and Voice Design) require a paid plan (Starter or above).

Mode 1 — Text to Speech

1 credit per 100 characters

Convert any written script into natural-sounding speech using 15+ multilingual AI voices.

1. Navigate to Create → Audio in the sidebar and select Text to Speech.
2. Type or paste your script (minimum 3 characters, maximum 5,000 characters). The credit counter updates in real time.
3. Select a voice from the dropdown. Stock voices show gender and accent; your saved custom voices appear under My Voices.
4. Click the play icon next to the dropdown to preview a stock voice before committing.
5. Click Generate Audio. The job appears in the queue on the right.
6. Once complete, the MP3 plays inline. Click Download MP3 to save the file.
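
The pricing and length limits above can be sketched as a small estimator. This is only an illustration: the function name is hypothetical, and it assumes that a partial block of 100 characters is rounded up to a full credit and that every character (including spaces) counts, neither of which is stated on this page. The in-app credit counter remains the authoritative figure.

```python
import math

MIN_CHARS = 3            # minimum script length from the steps above
MAX_CHARS = 5_000        # maximum script length from the steps above
CHARS_PER_CREDIT = 100   # 1 credit per 100 characters


def estimate_tts_credits(script: str) -> int:
    """Estimate the credit cost of a Text to Speech job.

    Assumption: partial blocks of 100 characters are rounded up,
    and all characters (including whitespace) are counted.
    """
    length = len(script)
    if length < MIN_CHARS or length > MAX_CHARS:
        raise ValueError(
            f"Script must be {MIN_CHARS}-{MAX_CHARS} characters, got {length}"
        )
    return math.ceil(length / CHARS_PER_CREDIT)
```

Under these assumptions, a 250-character script would be estimated at 3 credits and a full 5,000-character script at 50.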

Generate TTS audio, then feed it into a Lip Sync job to produce a talking-head video from a single portrait photo — no camera needed.

Mode 2 — Speech to Speech

15 credits per conversion

Upload any audio recording and convert the voice inside it to a completely different AI voice — same timing, same delivery, new voice identity.

1. Select Speech to Speech from the mode selector.
2. Upload an audio file (MP3, WAV, M4A, or WebM). Maximum duration is 3 minutes.
3. Select a Target Voice: choose any stock voice or one of your saved custom voices.
4. Click Convert Voice (15 credits). The job is queued and typically completes in 30–60 seconds.
5. Once complete, the converted MP3 plays inline and can be downloaded.

Speech to Speech is charged at a flat 15 credits regardless of audio length (up to the 3-minute cap). Credits are refunded automatically if the conversion fails.
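
If you prepare recordings in bulk, a quick local check against the upload rules above can save a failed attempt. The sketch below is not part of the product; the function name is hypothetical, it judges the format by file extension only, and it expects you to supply the duration yourself.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".webm"}  # formats listed above
MAX_DURATION_SECONDS = 3 * 60                           # 3-minute cap
FLAT_COST_CREDITS = 15                                  # flat price per conversion


def check_speech_to_speech_input(filename: str, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable.

    Assumption: the extension reflects the real format. The service may
    inspect the file contents instead.
    """
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append(f"Unsupported file type: {filename}")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append(
            f"Audio is {duration_seconds:.0f} s; the limit is {MAX_DURATION_SECONDS} s"
        )
    return problems
```

Because pricing is flat, any file that passes this check costs the same 15 credits whether it runs 10 seconds or the full 3 minutes.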

Mode 3 — Voice Design

Free to generate previews

Describe a voice in plain text, and the AI generates three unique voice previews for you to audition. Save your favourite as a custom voice and use it in Text to Speech and Speech to Speech.

1. Select Voice Design from the mode selector.
2. Write a Voice Description (minimum 10 characters) covering gender, age, accent, tone, and delivery style. Example: 'Young American female, warm and energetic, perfect for lifestyle content'.
3. Write or edit the Sample Text (minimum 100 characters). This is what each preview will say aloud.
4. Click Generate Voice Previews. Three previews are generated (free).
5. Click the play button on each preview to listen. Click a preview card to select it.
6. Enter a name for the voice and click Save to My Voices.
7. The saved voice immediately appears in the Text to Speech and Speech to Speech voice dropdowns.
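
The two minimum lengths in the steps above can be checked before you open the page, for example when drafting descriptions in a spreadsheet or script. This is a minimal sketch with a hypothetical function name; it assumes leading and trailing whitespace does not count toward the minimum, which this page does not specify.

```python
MIN_DESCRIPTION_CHARS = 10    # Voice Description minimum from step 2
MIN_SAMPLE_TEXT_CHARS = 100   # Sample Text minimum from step 3


def validate_voice_design(description: str, sample_text: str) -> list[str]:
    """Return a list of problems; an empty list means both fields are long enough.

    Assumption: surrounding whitespace is ignored when counting characters.
    """
    problems = []
    if len(description.strip()) < MIN_DESCRIPTION_CHARS:
        problems.append(
            f"Voice Description needs at least {MIN_DESCRIPTION_CHARS} characters"
        )
    if len(sample_text.strip()) < MIN_SAMPLE_TEXT_CHARS:
        problems.append(
            f"Sample Text needs at least {MIN_SAMPLE_TEXT_CHARS} characters"
        )
    return problems
```

The example description from step 2 passes the first check comfortably; the Sample Text is the field more likely to fall short.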

Plan	Custom Voice Slots
Free	0 saved voices (previews only)
Starter	2 saved voices
Pro	5 saved voices
Business	10 saved voices

To free up a slot, go to the Voice Design tab → My Voices panel on the right → click the trash icon next to the voice you want to remove.
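
The slot table above translates directly into a lookup. The sketch below, with a hypothetical function name, shows how many more voices could be saved on a given plan; the saved-voice count is something you would read from the My Voices panel, not a value this code can fetch.

```python
# Custom voice slots per plan, copied from the table above.
VOICE_SLOTS = {"Free": 0, "Starter": 2, "Pro": 5, "Business": 10}


def remaining_voice_slots(plan: str, saved_count: int) -> int:
    """Return how many more custom voices can be saved on the given plan."""
    if plan not in VOICE_SLOTS:
        raise KeyError(f"Unknown plan: {plan}")
    # Never report a negative number, even if the count exceeds the limit.
    return max(VOICE_SLOTS[plan] - saved_count, 0)
```

A result of 0 means a voice has to be deleted from My Voices before another one can be saved.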

Output format

All audio is generated and delivered as MP3 at 44,100 Hz / 128 kbps — compatible with every major platform, video editor, and social media tool.
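
The stated bitrate makes download sizes easy to estimate. The sketch below assumes a constant bitrate, takes 1 kbps as 1,000 bits per second, and ignores metadata tags, so real files may differ slightly; the function name is hypothetical.

```python
BITRATE_BITS_PER_SECOND = 128_000  # 128 kbps, with 1 kbps taken as 1,000 bits/s


def estimate_mp3_bytes(duration_seconds: float) -> int:
    """Estimate the size in bytes of a 128 kbps MP3 of the given duration.

    Assumption: constant bitrate and no metadata overhead.
    """
    return int(duration_seconds * BITRATE_BITS_PER_SECOND / 8)
```

At this rate audio takes about 16,000 bytes per second, so a 3-minute Speech to Speech result would come to roughly 2.9 MB.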