AI Music Video Maker

Type a song idea, get a finished music video — vocals, instrumental, and synced visuals all in one render. We compose the song, parse it into scenes, generate one video clip per scene, and stitch everything together. Royalty-free MP4.
Describe the vibe, the story, the imagery — we'll write lyrics + structure the song around it.
Have your own lyrics? Paste them here (optional)
Use [Verse 1] / [Chorus] / [Bridge] tags — each tagged section becomes one video scene. If left blank we generate lyrics from your concept.
Total render time: 3-7 minutes depending on song length. You can close this tab — we'll save the result to your dashboard.

What people make here

TikTok / Reels music drops

30-second hook + visuals = a complete original sound ready to post. Pick 9:16 aspect and you're done.

YouTube lyric / visualizer videos

Drop a 2-minute single onto your channel with cinematic visuals — no After Effects needed.

Personalized gifts

Custom song + video for a wedding, birthday, anniversary, or graduation. Way better than a card.

Brand anthems & ads

Custom jingle with matching video for a launch, campaign, or trade-show loop. Royalty-free, ready to ship.

Songwriter demos

Pitch a song concept to A&R or co-writers with full audio + video mockup before booking studio time.

Indie releases

Drop a full single + video on Spotify, Apple Music, and YouTube the same afternoon — no licensing fees.

How it works

Compose the song

ACE-Step writes lyrics and an instrumental from your concept (or uses your lyrics directly). 30-second to 2-minute outputs, full vocals + band.

Parse into scenes

Each [Verse] / [Chorus] / [Bridge] block becomes one ~5-second visual scene. No tags? We split the song into even chunks.
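The scene-splitting rule above can be sketched in a few lines of Python. This is a minimal illustration, not the service's actual parser: the function name `parse_scenes`, the `fallback_scenes` parameter, and the choice to split untagged lyrics by line count are all assumptions.

```python
import re

# Matches a line that is only a section tag, e.g. "[Verse 1]" or "[Chorus]".
TAG_RE = re.compile(r"^\[([^\]]+)\]\s*$")


def parse_scenes(lyrics, fallback_scenes=2):
    """Split lyrics into (label, text) scenes.

    Tagged lyrics yield one scene per tag. Untagged lyrics are split into
    `fallback_scenes` roughly even chunks of lines (an assumed fallback).
    Lines that appear before the first tag are ignored when tags are present.
    """
    lines = [ln.strip() for ln in lyrics.splitlines() if ln.strip()]
    scenes, label, body = [], None, []
    for ln in lines:
        match = TAG_RE.match(ln)
        if match:
            if label is not None:
                scenes.append((label, " ".join(body)))
            label, body = match.group(1), []
        elif label is not None:
            body.append(ln)
    if label is not None:
        scenes.append((label, " ".join(body)))
    if scenes or not lines:
        return scenes

    # No tags found: fall back to even chunks of lines.
    n = min(fallback_scenes, len(lines))
    total = len(lines)
    return [
        (f"Scene {i + 1}", " ".join(lines[i * total // n:(i + 1) * total // n]))
        for i in range(n)
    ]
```

For example, `parse_scenes("[Verse 1]\nNeon rain\n[Chorus]\nWe run")` returns one scene labelled `Verse 1` and one labelled `Chorus`.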

Render visuals

For each scene, the lyrics become the prompt for a CogVideoX clip in your chosen visual style. We render them sequentially so you can watch progress.

Stitch & mix

ffmpeg concatenates all scenes into one MP4, then mixes the music as the soundtrack. You get a finished video to download or share.
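As a rough sketch of the stitch step, the two ffmpeg invocations could be built like this. The file names (`scenes.txt`, `joined.mp4`) and the function name are assumptions, and the service's real flags are not documented here; the commands use ffmpeg's standard concat demuxer and stream mapping options.

```python
def build_stitch_commands(clip_paths, audio_path, out_path,
                          list_path="scenes.txt", joined_path="joined.mp4"):
    """Return (concat_list_text, concat_cmd, mix_cmd) for a two-step stitch.

    Step 1 joins the scene clips with the concat demuxer (no re-encode).
    Step 2 replaces the audio track with the song and trims to the shorter input.
    Paths containing single quotes would need escaping in the list file.
    """
    # Contents of the list file that the concat demuxer reads.
    concat_list = "".join(f"file '{p}'\n" for p in clip_paths)
    concat_cmd = [
        "ffmpeg", "-y", "-f", "concat", "-safe", "0",
        "-i", list_path, "-c", "copy", joined_path,
    ]
    mix_cmd = [
        "ffmpeg", "-y", "-i", joined_path, "-i", audio_path,
        "-map", "0:v:0", "-map", "1:a:0",   # video from the clips, audio from the song
        "-c:v", "copy", "-c:a", "aac", "-shortest", out_path,
    ]
    return concat_list, concat_cmd, mix_cmd
```

To run it locally, write `concat_list` to `list_path`, then pass each command to `subprocess.run(cmd, check=True)`.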



How to Use AI Music Video Maker

1. Enter your input

Type text, upload a file, or describe what you want. No account needed.

2. Click generate

Our AI renders your video in about 3-7 minutes using open-source models (ACE-Step for the song, CogVideoX for the visuals).

3. Download & share

Download, copy, or share your result. Free for personal and commercial use.

Use this tool via API

Automate this tool from your own code. OpenAI-compatible REST endpoint, Bearer-token auth, no extra SDK required. Token costs match the web interface.

curl -X POST https://api.free.ai/v1/music/generate/ \
  -H "Authorization: Bearer sk-free-..." \
  -H "Content-Type: application/json" \
  -d '{"prompt": "upbeat synthwave instrumental", "duration": 30}'
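The same request can be built from Python with only the standard library. This is a sketch that mirrors the curl call above: the URL, headers, and the `prompt` and `duration` fields come from that example, while the function name and the API key argument are placeholders. The response format is not documented here, so the sketch stops at building the request.

```python
import json
import urllib.request


def build_music_request(api_key, prompt, duration):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"prompt": prompt, "duration": duration}).encode("utf-8")
    return urllib.request.Request(
        "https://api.free.ai/v1/music/generate/",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Send it with `urllib.request.urlopen(build_music_request(key, "upbeat synthwave instrumental", 30))` and inspect the returned body.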

AI Music Video Maker — FAQ

Free AI Music Video Maker turns a one-line song idea into a finished music video. We compose the song with vocals + instrumental, parse the lyrics into 1-4 visual scenes, render a video clip per scene, then stitch everything into a single MP4 with the music as the soundtrack. Output is royalty-free and ready to upload.

A render takes three to seven minutes end-to-end. The song takes 60-90 seconds, each video scene takes 60-90 seconds (CogVideoX inference on an A100), and the ffmpeg stitch takes under 10 seconds. A 60-second song with 2 scenes finishes fastest; a 2-minute song with 4 scenes takes the longest.
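The stage timings above add up as follows. This is a back-of-the-envelope sketch that assumes scenes render one after another (as described in "Render visuals") and ignores queueing; the function name is illustrative.

```python
def estimate_render_seconds(scenes):
    """Return a (low, high) estimate in seconds from the quoted stage timings."""
    song = (60, 90)        # song composition
    per_scene = (60, 90)   # each video scene, rendered sequentially
    stitch = (0, 10)       # ffmpeg stitch, "under 10 seconds"
    low = song[0] + scenes * per_scene[0] + stitch[0]
    high = song[1] + scenes * per_scene[1] + stitch[1]
    return low, high
```

Two scenes give roughly 3 to 4.7 minutes, and four scenes give roughly 5 to 7.7 minutes, in line with the quoted range.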

The maker renders one scene per ~30 seconds of song, capped at 4 scenes in Phase 1 to keep cost manageable. A 30-second snippet gets 1 scene, a 60-second song gets 2, a 90-second song gets 3, and a 2-minute song gets 4.
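That rule can be written as a one-line function. This sketch reproduces the four quoted examples; how in-between durations (say, 45 seconds) are rounded is not stated, so rounding down here is an assumption.

```python
def scene_count(duration_seconds, seconds_per_scene=30, max_scenes=4):
    """One scene per ~30 seconds of song, at least 1 and at most 4 (Phase 1 cap)."""
    return max(1, min(max_scenes, int(duration_seconds // seconds_per_scene)))
```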

You do not need to write lyrics: leave the optional lyrics box blank and ACE-Step will write them from your concept. If you do paste your own [Verse]/[Chorus]/[Bridge]-tagged lyrics, each tagged section becomes one video scene with its lyrics as the visual prompt.

Available visual styles: Cinematic (dramatic lighting, film grain), Anime (vibrant cel-shaded), Photorealistic (natural lighting), Abstract / Dreamy (surreal flowing colors), VHS / Retro (80s analog distortion), and Claymation (stop-motion craft). The style is appended to every scene prompt so the look stays consistent across the video.
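Appending the style to each scene prompt might look like the sketch below. The style descriptions are taken from the list above; the dictionary keys, the function name, and the exact prompt wording are assumptions, since the real prompt template is not published here.

```python
# Style descriptions as listed above; the keys are illustrative identifiers.
STYLE_DESCRIPTIONS = {
    "cinematic": "dramatic lighting, film grain",
    "anime": "vibrant cel-shaded",
    "photorealistic": "natural lighting",
    "abstract": "surreal flowing colors",
    "vhs": "80s analog distortion",
    "claymation": "stop-motion craft",
}


def scene_prompt(scene_lyrics, style):
    """Append the same style text to a scene's lyrics so every clip shares a look."""
    return f"{scene_lyrics.strip()}, {style} style, {STYLE_DESCRIPTIONS[style]}"
```

Because every scene gets the identical suffix, only the lyrics portion of the prompt changes from clip to clip.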

A render costs about 25K tokens for a 60-second 2-scene video, ~45K for a 90-second 3-scene video, and ~65K for a 2-minute 4-scene video. Breakdown: 5K for the song, ~10K per video scene (CogVideoX), 300 for the ffmpeg stitch. The live cost label updates as you change settings.
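The breakdown can be summed directly. This sketch uses only the three per-stage figures quoted above and matches the ~25K figure for a 2-scene video. The quoted totals for 3 and 4 scenes are higher than this simple sum, so treat the live cost label as the better guide for longer videos.

```python
def estimate_tokens(scenes, song=5_000, per_scene=10_000, stitch=300):
    """Sum the quoted per-stage token costs for a render with `scenes` scenes."""
    return song + scenes * per_scene + stitch
```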

Everything in the output is royalty-free: the song (ACE-Step, Apache 2.0), the video (CogVideoX, Apache 2.0), and the final stitched MP4. Upload to YouTube, TikTok, Spotify Canvas, Instagram Reels, or your own site without licensing fees. You own the output.

Supported aspect ratios: 16:9 landscape (YouTube / desktop), 9:16 portrait (TikTok / Reels / Shorts / Stories), and 1:1 square (Instagram feed posts). Pick the aspect that matches your platform. Every scene renders at that ratio, so the final video is correctly framed.

You can download the song on its own: after rendering, the result panel includes a bonus link to the standalone song WAV. Want only the video without the audio? Run our /music/karaoke/ tool on the result, or download the MP4 and mute it locally.

If a render fails, the progress bar turns red on the failed stage and shows the error. Click "Make another" to try again. The retry runs the whole pipeline from scratch, because we do not persist intermediate clips between runs in Phase 1. For a higher success rate, keep the song under 90 seconds and stick to cinematic or photorealistic styles.

You can close the tab while a render is running: opt into email notifications above the progress bar and we will send you a link when it is done. Authenticated users also see the result in /account/?tab=history. The render keeps running on our GPUs even if you navigate away.

The maker is not yet available as a single API endpoint. Under the hood it calls /v1/music/generate/ace-step/, then /v1/video/generate/ multiple times (once per scene), then /v1/video/concat/ with the audio_url. You can replicate the orchestration yourself by chaining those three. Full docs at /api/.
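A chained call sequence might be sketched as below. Only the three endpoint paths and the `audio_url` parameter come from the description above. The request fields `prompt`, `duration`, and `video_urls`, the response keys `audio_url` and `video_url`, and the injected `post` helper are hypothetical, so check /api/ for the real schemas before relying on this.

```python
def make_music_video(post, concept, scene_prompts, duration=60):
    """Chain song -> per-scene clips -> concat.

    `post(path, payload)` is a caller-supplied function that sends an
    authenticated JSON POST and returns the decoded response as a dict.
    All payload and response field names below are assumptions.
    """
    # 1. Compose the song once.
    song = post("/v1/music/generate/ace-step/",
                {"prompt": concept, "duration": duration})

    # 2. Render one clip per scene, sequentially.
    clip_urls = [
        post("/v1/video/generate/", {"prompt": prompt})["video_url"]
        for prompt in scene_prompts
    ]

    # 3. Concatenate the clips and attach the song as the soundtrack.
    return post("/v1/video/concat/",
                {"video_urls": clip_urls, "audio_url": song["audio_url"]})
```

Injecting `post` keeps authentication and HTTP details out of the orchestration logic and makes the sequence easy to dry-run with a stub.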
