AI Motion Capture

System status OK · 380+ models · No watermark · No sign-up required

Upload any video with a person in it. The AI tracks 33 body keypoints per frame and gives you a skeleton overlay video plus a JSON file of joint positions for every frame. No mocap suit, no markers, no calibration. Single-camera markerless motion capture via MediaPipe.

The pose is extracted per frame with MediaPipe as 2D positions plus an estimated relative depth. Use the skeleton overlay video and the per-frame keypoints JSON for animation, sports analysis, or biomechanics. Free to use.

How to use AI Motion Capture

Enter your input

Type text, upload a file, or describe what you need. No account required.

The AI processes it

Our AI processes your request in seconds using the best open-source models.

Download & share

Download, copy, or share the result. Free for personal or commercial use.

Use this tool via the API

Call this tool from your own code. The REST endpoint is OpenAI-compatible and uses bearer-token auth; no extra SDK is required. Token costs are the same as in the web interface.

API documentation · Get an API key

curl -X POST https://api.free.ai/v1/video/generate/ \
  -H "Authorization: Bearer sk-free-..." \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A cat playing piano", "duration": 4}'

Related free AI tools

AI Video Generator

AI Video Editor

AI Video Enhancer

AI Video Dubbing Studio

AI Performance Capture (Runway Act-Two)

AI Video Translator

AI Video Effects — Pikaffects-style

AI Video Upscaler

AI Motion Capture — FAQ

What is AI Motion Capture?

Drop in a video with a person in frame and the AI tracks 33 body landmarks (head, shoulders, elbows, wrists, hips, knees, ankles, plus hand and foot points) in every frame. You get back a skeleton overlay video plus a JSON file with the per-frame joint coordinates. No mocap suit, no markers, and no calibration step; a single camera is enough.

What can I do with the keypoints JSON?

Drive 3D character animations in Blender, Unity, or Unreal (by retargeting to a rigged armature), analyze sports, dance, or martial-arts technique, build form-correction overlays, train ML models on movement data, or just visualize movement patterns over time.
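As one illustration of technique analysis, a joint angle (for example the knee angle between hip, knee, and ankle) can be computed from three keypoints. This is a minimal sketch, not part of the tool: `joint_angle` is a hypothetical helper, and it assumes the points are already in pixel coordinates, because normalized x and y use different scales on non-square frames.

```python
import math

def joint_angle(a, b, c):
    """Return the angle at point b, in degrees, formed by the segments b->a and b->c.

    Each point is an (x, y) pair in pixel coordinates.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 of |cross| and dot gives an unsigned angle in the range 0..180 degrees.
    return math.degrees(math.atan2(abs(cross), dot))

# Illustrative pixel positions for hip, knee, and ankle of a straight leg.
hip, knee, ankle = (640, 300), (640, 450), (640, 600)
print(round(joint_angle(hip, knee, ankle)))  # 180
```

Tracking this angle over frames gives a simple per-frame measure of knee flexion; the same helper works for elbows or hips.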

Which model powers it?

MediaPipe Pose Landmarker (Google, Apache 2.0). It outputs 33 body keypoints per frame as normalized 2D coordinates, plus an estimated Z (relative depth from the camera) and a per-keypoint visibility score. It runs entirely on CPU, so the GPU stays free for your other generations.
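Because the coordinates are normalized, they need to be scaled by the frame size before drawing or measuring anything. The sketch below shows that conversion. The JSON layout (`frames`, `landmarks`, and the `x`/`y`/`z`/`visibility` keys) is a hypothetical schema chosen for illustration; the actual file layout is not documented on this page.

```python
import json

# Hypothetical schema: the real keypoints JSON may be structured differently.
SAMPLE = (
    '{"frames": [{"index": 0, "landmarks": '
    '[{"x": 0.5, "y": 0.25, "z": -0.1, "visibility": 0.98}]}]}'
)

def to_pixels(landmark, width, height):
    """Scale normalized x/y (0..1 across the frame) to pixel coordinates."""
    return (landmark["x"] * width, landmark["y"] * height)

frames = json.loads(SAMPLE)["frames"]
first_landmark = frames[0]["landmarks"][0]
print(to_pixels(first_landmark, 1280, 720))  # (640.0, 180.0)
```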

Is this real 3D motion capture?

It's 2.5D: true 2D positions plus an estimated relative Z from a single camera. Real 3D motion capture needs multiple synchronized cameras for triangulation (the FreeMoCap / OptiTrack / Vicon approach). For TikTok dances, sports analysis, animation reference, or any single-camera workflow, MediaPipe's output is excellent. We'll add a multi-camera tool later for users who need true 3D.

How does the cost work?

200 tokens per second of input video, with a minimum charge of 500 tokens. A 10-second clip costs 2,000 tokens; a 60-second clip costs 12,000. Daily-pool free tokens cover a few short clips per day; signed-in users get 5K/day.
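The pricing rule above can be written as a small estimator. This is a sketch only: `estimate_tokens` is a hypothetical helper, and rounding fractional seconds up is an assumption, since the page does not say how partial seconds are billed.

```python
import math

TOKENS_PER_SECOND = 200  # rate stated above
MINIMUM_TOKENS = 500     # floor stated above

def estimate_tokens(duration_s):
    """Estimate the token cost of a clip of the given length in seconds."""
    # Assumption: fractional token amounts are rounded up.
    return max(MINIMUM_TOKENS, math.ceil(duration_s * TOKENS_PER_SECOND))

for seconds in (1, 10, 60):
    print(seconds, estimate_tokens(seconds))
# 1 500
# 10 2000
# 60 12000
```

With these numbers the 500-token minimum applies to any clip shorter than 2.5 seconds.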

How fast is it?

Close to real time: roughly 30-50 frames per second on our box. A 1-minute 30 fps video processes in 30-60 seconds end-to-end, including upload and render. Longer videos take proportionally longer.
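The processing estimate follows from dividing the frame count by the tracking throughput. A rough sketch, assuming the 30-50 frames-per-second range quoted above and ignoring upload and render time (`tracking_time_range` is a hypothetical helper):

```python
def tracking_time_range(duration_s, video_fps, slow_fps=30.0, fast_fps=50.0):
    """Return (best_case, worst_case) seconds spent on pose tracking alone."""
    frames = duration_s * video_fps
    return frames / fast_fps, frames / slow_fps

# A 1-minute clip at 30 fps has 1,800 frames.
print(tracking_time_range(60, 30))  # (36.0, 60.0)
```

For the 1-minute example this gives about 36 to 60 seconds of tracking, broadly in line with the rough figure quoted above.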

What input formats are supported?

MP4, MOV, WebM, AVI, MKV, and most other common video formats: anything ffmpeg can decode. The maximum upload size is 100 MB. Resolution doesn't matter much, because the pose model downsamples internally for speed.

Multiple people in the frame: what happens?

The MediaPipe Pose Landmarker tracks ONE person per frame (the most prominent one). Multi-person tracking would need a different model (RTMPose, YOLOv8-pose). If your use case is multi-person, file an idea via /contact/ and we'll be happy to add it as a separate tool.

How accurate is the joint position?

Visible joints are tracked to within roughly 5-10 px on a 720p frame. Occluded joints (a hand behind the back, a foot off-frame) are still filled in but carry low visibility scores, so you can filter them out. Smoothing across frames in your downstream pipeline (Kalman or Savitzky-Golay) cleans up the remaining jitter.
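The filter-then-smooth idea can be sketched with the standard library alone. Note the swap: this uses a plain centered moving average as a stand-in for the Kalman or Savitzky-Golay filters mentioned above, and the 0.5 visibility threshold is an assumption to tune for your footage. `smooth_track` is a hypothetical helper operating on one joint's track.

```python
def smooth_track(samples, min_visibility=0.5, window=5):
    """Drop low-visibility samples and average the rest over a centered window.

    samples: list of (x, y, visibility) tuples for one joint, one per frame.
    Returns a list with (x, y) for kept frames and None for filtered frames.
    """
    half = window // 2
    smoothed = []
    for i, (_, _, visibility) in enumerate(samples):
        if visibility < min_visibility:
            smoothed.append(None)  # occluded or unreliable: leave a gap
            continue
        # Only visible neighbours contribute to the average.
        neighbours = [
            (x, y)
            for x, y, v in samples[max(0, i - half): i + half + 1]
            if v >= min_visibility
        ]
        count = len(neighbours)
        smoothed.append((
            sum(p[0] for p in neighbours) / count,
            sum(p[1] for p in neighbours) / count,
        ))
    return smoothed

track = [(0.0, 0.0, 0.9), (2.0, 2.0, 0.9), (100.0, 100.0, 0.1), (4.0, 4.0, 0.9)]
print(smooth_track(track, window=3))
# [(1.0, 1.0), (1.0, 1.0), None, (4.0, 4.0)]
```

The outlier in the third frame has low visibility, so it is dropped and does not pull its neighbours off course.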

Can I export this to BVH / FBX for Blender?

Not directly from AI Motion Capture today; the JSON is the raw keypoint data. You can convert offline using libraries like `aniposelib` or `pose-format`. We're considering shipping a "Mocap → BVH/FBX" follow-on tool. File an upvote at /contact/ if you want it.

Is the uploaded video stored?

The video is processed immediately, and once the keypoints are extracted the input video is deleted. The skeleton-overlay output and JSON are kept for the standard share-link expiry (24 h anonymous / 7 d paid). Your data is never used for training. See /privacy/ for the full policy.

Is there an API?

Yes. POST a multipart `video` file to /v1/video/motion-capture/. The response contains {video_url, json_url, duration_s, tokens, share_url}. Bearer auth (sk-free-…) gives you 10,000 tokens/month free. A curl example is at /api/.
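For callers who prefer Python over curl, the upload might look like the sketch below, which uses only the standard library. Several things here are assumptions rather than documented facts: the host `https://api.free.ai` is taken from the curl example elsewhere on this page, the part's content type is a generic default, and `build_request` and `submit` are hypothetical helper names. Only the path, the `video` field name, bearer auth, and the response keys come from the answer above.

```python
import json
import urllib.request
import uuid

API_BASE = "https://api.free.ai"            # assumption: host from the page's curl example
ENDPOINT = "/v1/video/motion-capture/"      # path stated in the FAQ answer

def build_request(video_bytes, filename, api_key):
    """Build a multipart/form-data POST carrying the clip in a `video` field."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="video"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        API_BASE + ENDPOINT,
        data=head + video_bytes + tail,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )

def submit(request):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)

# Usage (performs a real network call, so it is left commented out):
# with open("clip.mp4", "rb") as f:
#     result = submit(build_request(f.read(), "clip.mp4", "YOUR_API_KEY"))
# print(result["video_url"], result["json_url"], result["tokens"])
```

Building the request separately from sending it makes it easy to inspect the headers and body before spending any tokens.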

Sign up free for 10,000 tokens

Create a free account

No credit card required

AI Motion Capture — FAQ

What is AI Motion Capture?

What can I do with the keypoints JSON?

Which model powers it?

Is this real 3D motion capture?

How does the cost work?

How fast is it?

What input formats are supported?

Multiple people in the frame — what happens?

How accurate is the joint position?

Can I export this to BVH / FBX for Blender?

Is the uploaded video stored?

Is there an API?
