
AI Video Dubbing Studio

Upload a talking-head clip and get it dubbed into another language with lip-synced mouth movement. Whisper transcribes, MadLAD translates, Kokoro speaks in 174 voices across 37 languages, and Sync Lipsync v2 re-renders the mouth. The source language is auto-detected on 99% of clips.


MP4, MOV, WebM up to 100MB · single-speaker talking-head works best
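
The upload limits above can be checked locally before sending a file. This is a minimal sketch, not the site's own validation: the function name `check_upload` is hypothetical, and reading "100MB" as 100 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

# Limits taken from the upload hint; the exact byte count behind "100MB" is an assumption.
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".webm"}
MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'no extension'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_UPLOAD_BYTES}")
    return problems


print(check_upload("interview.MOV", 42_000_000))   # no problems reported
print(check_upload("interview.avi", 150_000_000))  # wrong format and too large
```

The check only looks at the file name and size; it does not inspect the container or confirm that the clip is a single-speaker talking head.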

Whisper detects the source language on 99% of clips. Override only if auto-detect guesses wrong.
Click Preview to hear the voice speak a short phrase in your target language before you dub the whole clip.
Keeping the original background audio is useful if the video has music or sound FX you want to preserve underneath the new voice. Off = clean single-voice dub.

Where AI video dubbing pays for itself

Localize YouTube channels

Turn one English video into Spanish, Portuguese, and Hindi versions overnight. Audio-track swap on YouTube lets a single upload serve 3× the audience with lip-matched mouth movement.

Global ad creative

Shoot one ad, dub into 20 languages for a week-long A/B test. Beats paying a voice-over studio $500/minute per language.

E-learning + corporate training

Compliance, onboarding, and product-training videos that need a dozen languages without a studio budget. Same course, every market.

How the dubbing pipeline works

Step 1

Transcribe (Whisper large-v3)

The video's audio is extracted and transcribed with word-level timing. Source language is auto-detected with 99% accuracy.

Step 2

Translate (MadLAD-400)

The transcript is translated into the target language with a 3B-parameter model tuned for natural spoken phrasing, not literal word-for-word.

Step 3

Speak (Kokoro — 174 voices)

A natural voice in the target language reads the translation. 174 voices across 37 languages — pick one and hear a preview first.

Step 4

Lip-sync (Sync Lipsync v2)

The mouth is re-rendered frame-by-frame to match the new audio. State-of-the-art for single-speaker forward-facing shots.
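
The four steps above form a straight chain in which each stage consumes the previous stage's output. The sketch below shows only that data flow. The `dub` function, the `DubResult` type, and the stage signatures are hypothetical, and the stand-in lambdas replace the real models so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DubResult:
    transcript: str
    translated_text: str
    audio: bytes
    video: bytes


def dub(
    video: bytes,
    target_lang: str,
    transcribe: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    speak: Callable[[str, str], bytes],
    lipsync: Callable[[bytes, bytes], bytes],
) -> DubResult:
    """Run the four stages in order; each stage feeds the next one."""
    transcript = transcribe(video)                   # Step 1: speech to text
    translated = translate(transcript, target_lang)  # Step 2: text to text
    audio = speak(translated, target_lang)           # Step 3: text to speech
    new_video = lipsync(video, audio)                # Step 4: re-render the mouth
    return DubResult(transcript, translated, audio, new_video)


# Stand-in stages so the sketch runs without any model. Real stages would call
# Whisper, MadLAD, Kokoro, and Sync Lipsync v2.
result = dub(
    b"<video bytes>",
    "es",
    transcribe=lambda v: "hello world",
    translate=lambda text, lang: f"[{lang}] {text}",
    speak=lambda text, lang: text.encode("utf-8"),
    lipsync=lambda v, a: v + b"|" + a,
)
print(result.translated_text)
```

Passing the stages in as callables keeps the ordering logic separate from the models, so a single stage can be swapped or mocked without touching the rest.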

Why not Rask, Papercup, or HeyGen?

Rask charges $24/mo for 100 minutes of output and caps at 130 source languages. Papercup is enterprise-only (call sales, expect 4-figure bills). HeyGen's dubbing tier starts at $29/mo with a 5-minute quota. This tool uses the same pipeline components — Whisper for STT, MadLAD for translation, Kokoro for TTS, Sync Lipsync v2 for mouth re-rendering — with no subscription, no watermark, no monthly quota. You pay tokens from the pool you already have.
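
To put the quoted subscription prices on a common footing, the sketch below converts them to an effective price per output minute. It uses only the figures from the paragraph above and assumes the whole monthly quota is used; Papercup is left out because no figure is given for it.

```python
def cost_per_minute(monthly_price_usd: float, minutes_included: float) -> float:
    """Effective price of one output minute if the whole monthly quota is used."""
    return monthly_price_usd / minutes_included


# Figures quoted in the paragraph above.
plans = {
    "Rask": (24.0, 100.0),  # $24/mo for 100 minutes
    "HeyGen": (29.0, 5.0),  # $29/mo with a 5-minute quota
}

for name, (price, minutes) in plans.items():
    print(f"{name}: ${cost_per_minute(price, minutes):.2f} per minute")
```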


Dub any video into 20+ languages with synchronized lip movement. Whisper transcribes, MadLAD translates, Kokoro speaks, Sync Lipsync v2 matches the mouth.

How to Use AI Video Dubbing Studio

1
Enter your input

Type text, upload a file, or describe what you want. No account required.

2
Click create

Our AI processes your request in seconds using the best open-source models.

3
Download & share

Download, copy, or share your result. Downloads are free for personal and commercial use.

Use this tool via API

Automate this tool from your own code. OpenAI-compatible REST endpoint, Bearer-token auth, no extra SDK required. Token costs match the web interface.

curl -X POST https://api.free.ai/v1/video/generate/ \
  -H "Authorization: Bearer sk-free-..." \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A cat playing piano", "duration": 4}'

AI Video Dubbing Studio — FAQ

Upload a video, pick a target language, and get back the same video dubbed into that language with the speaker's lips re-synchronized to match the new audio. Great for turning English YouTube content into Spanish, French, Chinese, etc.

Four steps run server-side in sequence: (1) Whisper transcribes the original audio, (2) MadLAD translates the transcript to your target language, (3) Kokoro generates natural speech in that language, (4) Sync Lipsync v2 remaps the speaker's mouth to match the new voice. All done in one request — no juggling tools yourself.

The dropdown covers 20 top-demand languages (Spanish, French, German, Portuguese, Italian, Chinese, Japanese, Korean, Arabic, Hindi, Turkish, Russian, Dutch, Polish, Vietnamese, Indonesian, Thai, Hebrew, Swedish, English). MadLAD technically supports 450+ — ping us if you need others.
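
For scripting against the dropdown list, it can help to keep the 20 languages in a lookup table. The sketch below pairs each name with its ISO 639-1 code. Whether the tool's `target_lang` parameter actually takes these codes is an assumption, and `language_code` is a hypothetical helper.

```python
# ISO 639-1 codes for the 20 languages named in the dropdown.
# Assumption: the tool's real target_lang values are not documented here,
# so treat these codes as illustrative.
DROPDOWN_LANGUAGES = {
    "Spanish": "es", "French": "fr", "German": "de", "Portuguese": "pt",
    "Italian": "it", "Chinese": "zh", "Japanese": "ja", "Korean": "ko",
    "Arabic": "ar", "Hindi": "hi", "Turkish": "tr", "Russian": "ru",
    "Dutch": "nl", "Polish": "pl", "Vietnamese": "vi", "Indonesian": "id",
    "Thai": "th", "Hebrew": "he", "Swedish": "sv", "English": "en",
}


def language_code(name: str) -> str:
    """Look up a dropdown language by name, ignoring case and surrounding spaces."""
    key = name.strip().title()
    if key not in DROPDOWN_LANGUAGES:
        raise ValueError(f"{name!r} is not in the dropdown list")
    return DROPDOWN_LANGUAGES[key]


print(language_code(" hindi "))
```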

Dubbing uses paid tokens only (~100,000 per clip). Sync Lipsync v2 is the expensive step; the first three steps run on free, self-hosted models. Sign-up bonus credits do not unlock this tool.
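
A rough budget follows directly from the per-clip figure. The sketch below treats ~100,000 tokens per clip as a flat rate, which is an assumption, since the page also says the exact cost depends on the uploaded clip. Both function names are hypothetical.

```python
TOKENS_PER_CLIP = 100_000  # approximate figure quoted above; real cost varies per clip


def clips_affordable(paid_tokens: int, tokens_per_clip: int = TOKENS_PER_CLIP) -> int:
    """Number of whole clips a paid token balance covers at a flat per-clip rate."""
    if paid_tokens < 0:
        raise ValueError("token balance cannot be negative")
    return paid_tokens // tokens_per_clip


def tokens_needed(clips: int, tokens_per_clip: int = TOKENS_PER_CLIP) -> int:
    """Tokens required to dub the given number of clips at a flat per-clip rate."""
    return clips * tokens_per_clip


print(clips_affordable(1_000_000))  # clips covered by one million paid tokens
print(tokens_needed(20))            # e.g. one clip dubbed into 20 languages
```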

Clips under 30 seconds dub in about 1–3 minutes, and longer videos take proportionally longer. Uploads are hard-capped at 100 MB. For feature-length work, split the video into scenes and dub each one.
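
One way to prepare a long video is to cut it into short pieces before dubbing each one. The sketch below builds `ffmpeg` argument lists for fixed-length pieces and does not run them. It is a simplification: it cuts by time rather than by detected scene, the 30-second default is only borrowed from the timing figure above, and the helper names and `part_NNN` output naming are hypothetical.

```python
from pathlib import Path


def segment_ranges(duration_s: float, max_segment_s: float = 30.0) -> list[tuple[float, float]]:
    """Split a duration into consecutive (start, length) pieces of at most max_segment_s."""
    if duration_s <= 0 or max_segment_s <= 0:
        raise ValueError("durations must be positive")
    ranges = []
    start = 0.0
    while start < duration_s:
        length = min(max_segment_s, duration_s - start)
        ranges.append((start, length))
        start += length
    return ranges


def ffmpeg_split_commands(src: str, duration_s: float, max_segment_s: float = 30.0) -> list[list[str]]:
    """Build one ffmpeg command per piece; -c copy avoids re-encoding."""
    suffix = Path(src).suffix  # keep the source container for the output pieces
    commands = []
    for i, (start, length) in enumerate(segment_ranges(duration_s, max_segment_s)):
        commands.append([
            "ffmpeg", "-ss", f"{start:.3f}", "-i", src,
            "-t", f"{length:.3f}", "-c", "copy", f"part_{i:03d}{suffix}",
        ])
    return commands


for cmd in ffmpeg_split_commands("lecture.mp4", 75.0):
    print(" ".join(cmd))
```

With stream copy, cuts snap to keyframes, so piece boundaries may shift slightly. Fixed-length cuts can also land mid-sentence, so choosing boundaries at pauses or scene changes is likely to give cleaner dubs.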

The dubbed voice is not a clone of the original speaker: Kokoro uses one of 174 built-in voices (37 languages). For identity-preserving voice cloning you'd need our separate /voice/clone/ tool plus a custom pipeline.

Sync Lipsync v2 is state-of-the-art for single-speaker forward-facing shots. Multi-speaker scenes or profile-view clips can drift. Best results come from close-up talking-head footage.

The simple picker offers Auto / Male / Female. For fine-grained voice selection, use /voice/tts/ first to preview and copy the voice ID, then we can wire that through. Coming soon in the UI.

Uploads are not kept: the uploaded video is deleted within minutes of processing. The output is kept on our CDN for 24h (7d for paid users) at the share link.

For a pure lip-sync workflow (your video + your own pre-recorded audio), use the underlying /v1/image/edit/ or a custom endpoint. Dubbing combines all four steps automatically.

If you want subtitles rather than a dub, use /transcribe/ for subtitle files (SRT/VTT), or /translate/subtitle/ to translate an existing SRT. Dubbing replaces the audio, while subtitles overlay text, so the outputs are different.

Dubbing is also available through the API: POST a multipart video to /v1/video/dubbing/ with target_lang. The response contains {output_url, transcript, translated_text}. See /api/ for docs.
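
A call to the dubbing endpoint described above might be assembled as follows, using only the Python standard library. This is a sketch under assumptions rather than documented usage: the host is taken from the curl example earlier on the page, the file field name `video` is a guess, and `build_dubbing_request` is a hypothetical helper. The request is built but not sent.

```python
import urllib.request
import uuid

# Host from the curl example on this page, path from the answer above.
API_URL = "https://api.free.ai/v1/video/dubbing/"


def build_dubbing_request(
    api_key: str, video_bytes: bytes, filename: str, target_lang: str
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for the dubbing endpoint."""
    boundary = uuid.uuid4().hex
    sep = b"--" + boundary.encode("ascii")
    crlf = b"\r\n"
    body = b"".join([
        sep + crlf,
        b'Content-Disposition: form-data; name="target_lang"' + crlf + crlf,
        target_lang.encode("utf-8") + crlf,
        sep + crlf,
        # Assumption: the file part is named "video"; check /api/ for the real name.
        f'Content-Disposition: form-data; name="video"; filename="{filename}"'.encode("utf-8") + crlf,
        b"Content-Type: application/octet-stream" + crlf + crlf,
        video_bytes + crlf,
        sep + b"--" + crlf,
    ])
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


request = build_dubbing_request("sk-free-...", b"\x00\x01", "clip.mp4", "es")
print(request.get_method(), request.full_url)
```

Sending it with `urllib.request.urlopen(request)` and parsing the body as JSON should give access to `output_url`, `transcript`, and `translated_text`, if the response has the shape described above.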
