AI Video Dubbing Studio

Upload a talking-head clip and get it dubbed into another language with lip-synced mouth movement. Whisper transcribes, MadLAD translates, Kokoro speaks in 174 voices across 37 languages, and Sync Lipsync v2 re-renders the mouth. Whisper auto-detects the source language on 99% of clips.

MP4, MOV, WebM up to 100MB · single-speaker talking-head works best

Source language: Whisper detects it automatically on 99% of clips. Override only if auto-detect guesses wrong.
Voice preview: click Preview to hear the voice speak a short phrase in your target language before you dub the whole clip.
Background audio: keep it on if the video has music or sound FX you want to preserve underneath the new voice. Off = clean single-voice dub.

Where AI video dubbing pays for itself

Localize YouTube channels

Turn one English video into Spanish, Portuguese, and Hindi versions overnight. Audio-track swap on YouTube lets a single upload serve 3× the audience with lip-matched mouth movement.

Global ad creative

Shoot one ad, dub into 20 languages for a week-long A/B test. Beats paying a voice-over studio $500/minute per language.

E-learning + corporate training

Compliance, onboarding, and product-training videos that need a dozen languages without a studio budget. Same course, every market.

How the dubbing pipeline works

Step 1

Transcribe (Whisper large-v3)

The video's audio is extracted and transcribed with word-level timing. Source language is auto-detected with 99% accuracy.
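Word-level timing matters because later steps need to know when each phrase is spoken. The sketch below is only an illustration, not this tool's documented internals: it assumes word entries shaped like the `word` / `start` / `end` dictionaries that the open-source Whisper package produces when word timestamps are enabled, and it groups them into phrases wherever the speaker pauses. The function name and the 0.5-second pause threshold are hypothetical.

```python
from typing import Dict, List


def group_words_into_phrases(words: List[Dict], max_gap: float = 0.5) -> List[Dict]:
    """Merge consecutive timed words into phrases.

    A new phrase starts whenever the silence between two words is
    longer than max_gap seconds (an illustrative threshold).
    """
    phrases: List[Dict] = []
    for w in words:
        text = w["word"].strip()
        if phrases and w["start"] - phrases[-1]["end"] <= max_gap:
            # Short gap: extend the current phrase.
            phrases[-1]["text"] += " " + text
            phrases[-1]["end"] = w["end"]
        else:
            # First word, or a long pause: open a new phrase.
            phrases.append({"text": text, "start": w["start"], "end": w["end"]})
    return phrases


# Hand-written sample in the assumed word-timestamp shape.
sample_words = [
    {"word": " Hello", "start": 0.00, "end": 0.40},
    {"word": " everyone", "start": 0.45, "end": 0.95},
    {"word": " Welcome", "start": 1.80, "end": 2.20},
    {"word": " back", "start": 2.25, "end": 2.50},
]
phrases = group_words_into_phrases(sample_words)
for p in phrases:
    print(f'{p["start"]:.2f}-{p["end"]:.2f}  {p["text"]}')
```

Each phrase keeps its start and end time, which is the information a dubbing step would need to place translated speech at roughly the same moment as the original.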

Step 2

Translate (MadLAD-400)

The transcript is translated into the target language with a 3B-parameter model tuned for natural spoken phrasing rather than literal word-for-word output.

Step 3

Speak (Kokoro — 174 voices)

A natural voice in the target language reads the translation. 174 voices across 37 languages — pick one and hear a preview first.

Step 4

Lip-sync (Sync Lipsync v2)

The mouth is re-rendered frame-by-frame to match the new audio. State-of-the-art for single-speaker forward-facing shots.

Why not Rask, Papercup, or HeyGen?

Rask charges $24/mo for 100 minutes of output and caps at 130 source languages. Papercup is enterprise-only (call sales, expect 4-figure bills). HeyGen's dubbing tier starts at $29/mo with a 5-minute quota. This tool uses the same pipeline components — Whisper for STT, MadLAD for translation, Kokoro for TTS, Sync Lipsync v2 for mouth re-rendering — with no subscription, no watermark, no monthly quota. You pay tokens from the pool you already have.
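To put the subscription prices quoted above on a common footing, the short calculation below converts each plan to a per-minute rate. It uses only the figures in the paragraph and assumes that HeyGen's 5-minute quota is monthly, which the text does not state. No dollar rate is computed for this tool because the price of a token is not given here.

```python
def cost_per_minute(monthly_price_usd: float, included_minutes: float) -> float:
    """Effective USD cost of one output minute if the whole quota is used."""
    return monthly_price_usd / included_minutes


# (monthly price in USD, included minutes), taken from the comparison above.
# Assumption: the HeyGen 5-minute quota is per month.
plans = {
    "Rask": (24.0, 100.0),
    "HeyGen": (29.0, 5.0),
}

rates = {name: round(cost_per_minute(price, minutes), 2)
         for name, (price, minutes) in plans.items()}

for name, rate in rates.items():
    print(f"{name}: ${rate:.2f} per minute")
```

These are best-case rates: any unused quota raises the effective cost per minute.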

Dub any video into 20+ languages with synchronized lip movement. Whisper transcribes, MadLAD translates, Kokoro speaks, Sync Lipsync v2 matches the mouth.

How to Use AI Video Dubbing Studio

Step 1

Enter your input

Type text, upload a file, or describe what you want. No account required.

Step 2

Click to generate

The algorithm then analyzes your input.

Step 3

Play & share

Play, copy, or share your result. Free for personal and commercial use.

Use this tool via API

Automate this tool from your own code. OpenAI-compatible REST endpoint, Bearer-token auth, no extra SDK required. Token costs match the web interface.

# Assumptions: the multipart field name "video", the file name clip.mp4, and the language code "es". Check /api/ for the exact request schema.
curl -X POST https://api.free.ai/v1/video/dubbing/ \
  -H "Authorization: Bearer sk-free-..." \
  -F "video=@clip.mp4" \
  -F "target_lang=es"

AI Video Dubbing Studio — FAQ

Upload a video, pick a target language, and get back the same video dubbed into that language with the speaker's lips re-synchronized to match the new audio. Great for turning English YouTube content into Spanish, French, Chinese, etc.

Four steps run server-side in sequence: (1) Whisper transcribes the original audio, (2) MadLAD translates the transcript to your target language, (3) Kokoro generates natural speech in that language, (4) Sync Lipsync v2 remaps the speaker's mouth to match the new voice. All done in one request — no juggling tools yourself.
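The four-step sequence can be pictured as a simple chain where each stage consumes the previous stage's output. The sketch below is a hypothetical illustration of that ordering only: the `DubStages` and `dub` names are invented, and the stand-in stages are plain lambdas rather than real calls to Whisper, MadLAD, Kokoro, or Sync Lipsync v2.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DubStages:
    """Pluggable stages; the signatures are assumptions for illustration."""
    transcribe: Callable[[bytes], str]         # video -> transcript
    translate: Callable[[str, str], str]       # transcript, target_lang -> text
    speak: Callable[[str, str], bytes]         # text, target_lang -> audio
    lip_sync: Callable[[bytes, bytes], bytes]  # video, new audio -> dubbed video


def dub(video: bytes, target_lang: str, stages: DubStages) -> dict:
    """Run the four stages in order and return all intermediate results."""
    transcript = stages.transcribe(video)
    translated = stages.translate(transcript, target_lang)
    audio = stages.speak(translated, target_lang)
    output = stages.lip_sync(video, audio)
    return {"output": output, "transcript": transcript, "translated_text": translated}


# Stand-in stages so the sketch runs without any models installed.
fake_stages = DubStages(
    transcribe=lambda video: "hello",
    translate=lambda text, lang: f"[{lang}] {text}",
    speak=lambda text, lang: text.encode("utf-8"),
    lip_sync=lambda video, audio: video + b"|" + audio,
)

result = dub(b"VIDEO", "es", fake_stages)
print(result["transcript"], "->", result["translated_text"])
```

Note that the lip-sync stage takes both the original video and the new audio, which is why it has to run last.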

The dropdown covers 20 top-demand languages (Spanish, French, German, Portuguese, Italian, Chinese, Japanese, Korean, Arabic, Hindi, Turkish, Russian, Dutch, Polish, Vietnamese, Indonesian, Thai, Hebrew, Swedish, English). MadLAD technically supports 450+ — ping us if you need others.

Dubbing uses paid tokens only (~100,000 per clip). Sync Lipsync v2 is the expensive step; the first three steps run on self-hosted models at no extra cost. Sign-up bonus credits do not unlock this tool.

Clips under 30 seconds dub in about 1–3 minutes. Longer videos take proportionally longer. Uploads are capped at 100 MB. For feature-length work, split the video into scenes and dub each one.
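A local pre-flight check can catch an oversized or unsupported file before any upload time is spent. The sketch below is a hypothetical helper, not part of the service: it uses the MP4, MOV, and WebM formats named on the upload form and reads "100 MB" as 100 × 1024 × 1024 bytes, which is an assumption since the exact byte limit is not stated.

```python
import os
import tempfile

# Assumption: "100 MB" is interpreted as 100 MiB.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".webm"}


def check_upload(path: str) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    size = os.path.getsize(path)
    if size > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size} bytes, over the {MAX_UPLOAD_BYTES}-byte cap")
    return problems


# Demonstration with two small throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "clip.MP4")
    bad = os.path.join(tmp, "clip.avi")
    for p in (good, bad):
        with open(p, "wb") as fh:
            fh.write(b"\0" * 1024)
    good_problems = check_upload(good)
    bad_problems = check_upload(bad)

print(good_problems)
print(bad_problems)
```

The check only looks at the file name and size; it does not verify that the container actually holds decodable video.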

The dub does not clone the original speaker's voice. Kokoro uses one of 174 built-in voices (37 languages). For identity-preserving voice cloning you'd need our separate /voice/clone/ tool plus a custom pipeline.

Sync Lipsync v2 is state-of-the-art for single-speaker forward-facing shots. Multi-speaker scenes or profile-view clips can drift. Best results come from close-up talking-head footage.

The simple picker offers Auto / Male / Female. For fine-grained voice selection, use /voice/tts/ first to preview voices and copy the voice ID you want; passing a specific voice ID through this tool is coming soon in the UI.

Uploads are not retained. The uploaded video is deleted within minutes of processing. The output is kept on our CDN for 24h (7d for paid users) at the share link.

For a pure lip-sync workflow (your video + your own pre-recorded audio), use the underlying /v1/image/edit/ or a custom endpoint. Dubbing combines all four steps automatically.

If you want subtitles rather than a dub, use /transcribe/ for subtitle files (SRT/VTT), or /translate/subtitle/ to translate an existing SRT. Dubbing replaces the audio; subtitles overlay text. They are different outputs.

The tool is available over the API: POST multipart video to /v1/video/dubbing/ with target_lang. The response is {output_url, transcript, translated_text}. See /api/ for docs.
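A minimal Python sketch of that call, using only the standard library, might look like the following. The endpoint path, the `target_lang` field, the Bearer-token header, and the three response keys come from this page; the multipart field name `video`, the `es` language code, and a JSON response body are assumptions, so check /api/ before relying on them. The code builds the request without sending it.

```python
import json
import urllib.request
import uuid

API_URL = "https://api.free.ai/v1/video/dubbing/"


def build_dub_request(video: bytes, filename: str, target_lang: str,
                      api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="target_lang"\r\n\r\n'
        f"{target_lang}\r\n"
        f"--{boundary}\r\n"
        # Assumption: the file part is named "video".
        f'Content-Disposition: form-data; name="video"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=head + video + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def parse_dub_response(raw: bytes) -> dict:
    """Pick out the three documented keys, assuming a JSON response body."""
    payload = json.loads(raw)
    return {key: payload[key] for key in ("output_url", "transcript", "translated_text")}
```

To send the request, pass the result of `build_dub_request(...)` to `urllib.request.urlopen` and feed the response bytes to `parse_dub_response`. Dubbing a clip takes minutes, so a generous timeout is advisable.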
