Lip Sync
6 credits per second (max 35 s)
Lip Sync animates a static portrait photo to match a provided audio clip, producing a photorealistic talking-head video — no camera required.
Lip Sync requires a paid plan (Starter, Pro, or Business). Free accounts will see an upgrade prompt.
How to use Lip Sync
1. Navigate to Create → Lip Sync in the sidebar.
2. Upload a person photo: choose a saved avatar from your Avatars library, or upload a new image (JPG, PNG, or WEBP, max 4 MB). The photo should show a clear, forward-facing face.
3. Upload an audio file (MP3 or WAV, max 4 MB). The audio must be 35 seconds or shorter. The platform enforces a hard 35-second cap, so trim longer clips before uploading.
4. The page automatically detects the audio duration and shows the credit cost (duration in seconds × 6).
5. Optionally add a Style Prompt: a short description to guide the generation style (e.g. "a person speaking naturally with subtle expressions").
6. Click Generate. The job is queued and typically completes in 30–90 seconds.
7. The output video appears in the results panel and is saved to your Library.
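The upload limits in the steps above can be checked locally before you open the page. The following is a minimal sketch, not part of the platform: the function name `check_lip_sync_inputs` is hypothetical, it assumes 1 MB means 1024 × 1024 bytes, and it assumes `.jpeg` is accepted alongside `.jpg`. You supply the file sizes and the audio duration yourself.

```python
from pathlib import Path

# Limits taken from the steps above.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}  # assumption: ".jpeg" counts as JPG
AUDIO_EXTENSIONS = {".mp3", ".wav"}
MAX_FILE_BYTES = 4 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MAX_AUDIO_SECONDS = 35


def check_lip_sync_inputs(photo_name, photo_bytes, audio_name, audio_bytes, audio_seconds):
    """Return a list of problems; an empty list means the inputs fit the documented limits."""
    problems = []
    if Path(photo_name).suffix.lower() not in IMAGE_EXTENSIONS:
        problems.append("photo must be JPG, PNG, or WEBP")
    if photo_bytes > MAX_FILE_BYTES:
        problems.append("photo exceeds 4 MB")
    if Path(audio_name).suffix.lower() not in AUDIO_EXTENSIONS:
        problems.append("audio must be MP3 or WAV")
    if audio_bytes > MAX_FILE_BYTES:
        problems.append("audio exceeds 4 MB")
    if audio_seconds > MAX_AUDIO_SECONDS:
        problems.append("audio is longer than 35 seconds; trim it first")
    return problems


# Hypothetical example: a 1.2 MB PNG and a 12.5-second WAV pass every check.
print(check_lip_sync_inputs("headshot.png", 1_200_000, "voiceover.wav", 3_000_000, 12.5))
```

The function only mirrors the documented limits. It does not check that the photo shows a clear, forward-facing face, which still needs a visual check.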
Credit cost example
| Audio Duration | Credit Cost |
|---|---|
| 5 s | 30 credits |
| 10 s | 60 credits |
| 20 s | 120 credits |
| 35 s (max) | 210 credits |
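The table follows directly from the rate of 6 credits per second. As a rough sketch for estimating cost ahead of time, the hypothetical helper below reproduces it. The name `lip_sync_credit_cost` is illustrative only, and rounding fractional seconds up to the next whole second is an assumption, since the page only shows whole-second examples.

```python
import math

CREDITS_PER_SECOND = 6
MAX_AUDIO_SECONDS = 35


def lip_sync_credit_cost(duration_seconds):
    """Estimate the credit cost for an audio clip of the given length in seconds."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    if duration_seconds > MAX_AUDIO_SECONDS:
        raise ValueError("audio longer than 35 seconds is not accepted")
    # Assumption: a partial second is billed as a full second.
    return math.ceil(duration_seconds) * CREDITS_PER_SECOND


for seconds in (5, 10, 20, 35):
    print(seconds, "s ->", lip_sync_credit_cost(seconds), "credits")
```

For exact billing on clips with fractional durations, rely on the cost shown on the Lip Sync page rather than this estimate.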
Tips
- Use a clean, well-lit headshot with a neutral expression for the most natural results
- Avoid photos with heavy shadows across the face or extreme angles
- Generate a voiceover with the Audio Studio tool, download it, then upload it here
- Keep clips concise — 5–15 seconds works best for short-form social content
- The style prompt is optional but can improve expressiveness — try "speaking naturally with subtle facial expressions"
Lip Sync also works as a node inside the Storyboard canvas. Connect an Image Gen node to the portrait slot, upload audio directly in the node, and connect the video output to a Video Combiner.