StandIn Labs

Lip Sync

6 credits per second (max 35 s)

Lip Sync animates a static portrait photo to match a provided audio clip, producing a photorealistic talking-head video — no camera required.

Lip Sync requires a paid plan (Starter, Pro, or Business). Free accounts will see an upgrade prompt.

How to use Lip Sync

  1. Navigate to Create → Lip Sync in the sidebar.
  2. Upload a person photo: choose a saved avatar from your Avatars library, or upload a new image (JPG, PNG, or WEBP, max 4 MB). The photo should show a clear, forward-facing face.
  3. Upload an audio file (MP3 or WAV, max 4 MB). The audio must be 35 seconds or shorter.
  4. The page automatically detects the audio duration and shows the credit cost (duration in seconds × 6).
  5. Optionally add a Style Prompt, a short description that guides the generation style (e.g. "a person speaking naturally with subtle expressions").
  6. If your audio is longer than 35 seconds, trim it before uploading. The platform enforces a hard 35-second cap.
  7. Click Generate. The job is queued and typically completes in 30–90 seconds.
  8. The output video appears in the results panel and is saved to your Library.
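
The upload limits in the steps above can be checked locally before you open the page. The following is a minimal Python sketch, not part of the platform: the helper names (check_upload, wav_duration_seconds, trim_wav) are hypothetical, "4 MB" is assumed to mean 4 × 1024 × 1024 bytes, and the duration and trimming helpers use Python's standard wave module, so they only cover uncompressed WAV files (MP3 would need a separate tool).

```python
import os
import wave

MAX_UPLOAD_BYTES = 4 * 1024 * 1024  # assumption: "4 MB" means 4 MiB
MAX_AUDIO_SECONDS = 35              # hard cap stated in the steps above
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
AUDIO_EXTENSIONS = {".mp3", ".wav"}


def check_upload(path, allowed_extensions):
    """Return a list of problems with the file's extension or size."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in allowed_extensions:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if os.path.getsize(path) > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 4 MB")
    return problems


def wav_duration_seconds(path):
    """Duration of an uncompressed WAV file, in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def trim_wav(src, dst, max_seconds=MAX_AUDIO_SECONDS):
    """Copy src to dst, keeping at most the first max_seconds of audio."""
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames = reader.readframes(int(max_seconds * reader.getframerate()))
    with wave.open(dst, "wb") as writer:
        writer.setparams(params)
        writer.writeframes(frames)  # the frame count in the header is fixed on close
```

A typical use would be to call check_upload on both the photo and the audio file, then call trim_wav only if wav_duration_seconds reports more than 35 seconds.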

Credit cost example

Audio Duration    Credit Cost
5 s               30 credits
10 s              60 credits
20 s              120 credits
35 s (max)        210 credits
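
The same arithmetic can be written as a short Python sketch for budgeting credits ahead of time. The function name lip_sync_credit_cost is hypothetical, and rounding partial seconds up to the next whole second is an assumption; this page only lists whole-second durations, so the platform's actual rounding may differ.

```python
import math

CREDITS_PER_SECOND = 6   # rate stated on this page
MAX_AUDIO_SECONDS = 35   # hard cap stated on this page


def lip_sync_credit_cost(duration_seconds):
    """Estimate the credit cost of a Lip Sync job from the audio duration."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    if duration_seconds > MAX_AUDIO_SECONDS:
        raise ValueError("audio exceeds the 35-second cap; trim it first")
    # Assumption: a partial second is billed as a full second.
    return math.ceil(duration_seconds) * CREDITS_PER_SECOND


for seconds in (5, 10, 20, 35):
    print(f"{seconds} s -> {lip_sync_credit_cost(seconds)} credits")
```

For the whole-second durations listed in the table, this reproduces the same costs (30, 60, 120, and 210 credits).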

Tips

  • Use a clean, well-lit headshot with a neutral expression for the most natural results
  • Avoid photos with heavy shadows across the face or extreme angles
  • Generate a voiceover with the Audio Studio tool, download it, then upload it here
  • Keep clips concise: 5–15 seconds works best for short-form social content
  • The style prompt is optional but can improve expressiveness. Try "speaking naturally with subtle facial expressions"

Lip Sync also works as a node inside the Storyboard canvas. Connect an Image Gen node to the portrait slot, upload audio directly in the node, and connect the video output to a Video Combiner.