Wizper (Whisper v3)

Free.ai · stt · ~500 tokens per minute


Wizper (Whisper v3) is a speech-to-text model. Requests are routed to an external provider and cost about 500 tokens per minute of audio, which includes a 50% markup over the upstream cost.

Use via API

OpenAI-compatible REST API. Generate a key and call this model in seconds.

curl -X POST https://api.free.ai/v1/stt/ \
  -H "Authorization: Bearer sk-free-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"premium/wizper","audio_url":"https://..."}'
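For callers who would rather not shell out to curl, the same request can be sketched in Python using only the standard library. The endpoint, headers, and JSON fields are copied from the curl example; the function names are hypothetical, and the shape of the JSON response is an assumption, so check it against the API documentation.

```python
import json
import urllib.request

# Endpoint and default model name are taken from the curl example above.
API_URL = "https://api.free.ai/v1/stt/"
DEFAULT_MODEL = "premium/wizper"


def build_stt_request(api_key: str, audio_url: str, model: str = DEFAULT_MODEL) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    payload = json.dumps({"model": model, "audio_url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def transcribe(api_key: str, audio_url: str) -> dict:
    """Send the request and decode the JSON body.

    Assumption: the service answers with a JSON object on success.
    """
    request = build_stt_request(api_key, audio_url)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Splitting request construction from sending makes it possible to inspect the request, or test it, without spending tokens. A call would look like `transcribe("sk-free-...", "https://...")` with your own key and audio URL filled in.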


Similar models

ElevenLabs STT

Fal Speech-to-Text


Frequently Asked Questions

What does Wizper (Whisper v3) do?

Wizper (Whisper v3) transcribes spoken audio into text. Upload an MP3, WAV, M4A, or video file and Wizper (Whisper v3) returns the full transcript plus optional SRT/VTT subtitles with timestamps.

How many languages does Wizper (Whisper v3) support?

Wizper (Whisper v3) is a Whisper-family model, and Whisper-family models cover 90+ languages. Other engines vary; Parakeet, for example, covers about 25. Pick "auto-detect", or specify the language for the highest accuracy.

How accurate is Wizper (Whisper v3)?

Word-error rate is 5–10% on clean English audio and 10–20% on noisy or accented audio. Larger variants of the same architecture do meaningfully better on hard cases, so pick a larger model when the audio is rough.
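To make the percentages concrete: word-error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. The sketch below computes it with a standard word-level edit distance. It is a generic illustration, not the scoring method Free.ai uses, and the lowercase-and-split normalization is a simplifying assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from hypothesis
            insertion = current[j - 1] + 1  # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


reference = "the quick brown fox jumps over the lazy dog today"
hypothesis = "the quick brown fox jumped over the lazy dog today"
print(word_error_rate(reference, hypothesis))  # one wrong word out of ten
```

One wrong word in a ten-word sentence is a 10% word-error rate, which sits at the boundary between the clean-audio and noisy-audio ranges quoted above.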

Does Wizper (Whisper v3) include timestamps?

Yes. Every segment includes start and end timestamps. Export as SRT or VTT and the times map straight onto your video.
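If you work from the raw segments rather than the built-in export, converting them to SRT is straightforward. The sketch below assumes each segment is a dictionary with `start` and `end` in seconds and a `text` string; those field names are an assumption for illustration, not a documented response schema.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: list) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, segment in enumerate(segments, start=1):
        start = format_srt_timestamp(segment["start"])
        end = format_srt_timestamp(segment["end"])
        cues.append(f"{index}\n{start} --> {end}\n{segment['text'].strip()}\n")
    return "\n".join(cues)


# Hypothetical segments, shaped as assumed above.
example_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello and welcome. "},
    {"start": 2.5, "end": 4.0, "text": "Let's get started."},
]
print(segments_to_srt(example_segments))
```

VTT differs mainly in its `WEBVTT` header line and its use of a period instead of a comma before the milliseconds, so the same approach carries over with small changes.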

How much does Wizper (Whisper v3) cost per minute?

Wizper (Whisper v3) is a premium transcription engine. It costs roughly 500–1,500 tokens per minute of audio. $1 = 750,000 tokens.
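As a rough worked example, the token rate and the dollar conversion quoted above can be combined into a cost estimate. The function name is hypothetical, and actual billing may round or meter differently.

```python
TOKENS_PER_DOLLAR = 750_000  # "$1 = 750,000 tokens", as stated above


def estimate_cost_usd(audio_minutes: float, tokens_per_minute: float) -> float:
    """Estimate the dollar cost of transcribing audio at a given token rate."""
    return audio_minutes * tokens_per_minute / TOKENS_PER_DOLLAR


# A 60-minute recording at the low and high ends of the quoted range.
low = estimate_cost_usd(60, 500)     # 30,000 tokens
high = estimate_cost_usd(60, 1_500)  # 90,000 tokens
print(f"${low:.2f} to ${high:.2f}")
```

Under these figures, an hour of audio uses 30,000 to 90,000 tokens, or about $0.04 to $0.12.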

What audio formats can I upload to Wizper (Whisper v3)?

MP3, WAV, M4A, FLAC, and OGG, plus video (MP4, MOV, WebM), from which we extract the audio. The maximum is 500 MB per upload. For longer files, split them with /audio/cut/ or use /v1/stt/batch/.
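A client can check these limits before uploading. The helper below is a hypothetical pre-flight check built from the formats and size limit listed above. It assumes "500 MB" means 500 × 1024 × 1024 bytes; the service may count decimal megabytes instead.

```python
from pathlib import Path

# Extensions listed in the answer above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".mp4", ".mov", ".webm"}
# Assumption: binary megabytes; the real limit may be 500 * 1000 * 1000 bytes.
MAX_UPLOAD_BYTES = 500 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'no extension'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file exceeds 500 MB; split it or use the batch endpoint")
    return problems


print(check_upload("interview.MP3", 120 * 1024 * 1024))
print(check_upload("notes.txt", 1_000))
```

The check is case-insensitive on the extension and only inspects the file name, so it cannot detect a mislabeled file.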

Can Wizper (Whisper v3) identify different speakers?

Speaker diarization is a separate pass: toggle "diarize" on /transcribe/. Wizper (Whisper v3) handles the transcription; diarization labels each segment with Speaker 1, Speaker 2, and so on.

Can I batch transcribe with Wizper (Whisper v3)?

Yes. /batch/ accepts a folder of audio files. Each transcript lands in /account/?tab=history with the original filename. To preserve the folder tree, use the API.

Is there an API for Wizper (Whisper v3)?

Yes. POST your audio to /v1/stt/transcribe/ with model="Wizper (Whisper v3)". The response is JSON with the text, segments, and word-level timestamps. /api/ has the full reference.

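What about privacy when I transcribe with Wizper (Whisper v3)?

Self-hosted models keep audio on our GPUs; premium models pass it through to the upstream provider under a DPA. Audio is deleted after the share window (24 hours for anonymous users, 7 days for signed-in users). We do not train on your inputs.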

Is Wizper (Whisper v3) output safe for commercial use?

Yes. Free.ai grants commercial use of transcripts. You need rights to the audio you upload (your own recording, licensed material, or content with consent).

How long does Wizper (Whisper v3) take?

Real-time factor is roughly 0.05–0.2×, so a 60-minute podcast transcribes in 3–12 minutes. Premium models often finish faster. Use the queue button if you want to close the tab while the job runs.
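The real-time factor is processing time divided by audio duration, so the estimate scales linearly with file length. A minimal sketch, using the range quoted above as default bounds (the function name is hypothetical, and queueing delays are not modeled):

```python
def processing_range_minutes(audio_minutes: float,
                             fastest_rtf: float = 0.05,
                             slowest_rtf: float = 0.2) -> tuple:
    """Return (best case, worst case) processing time in minutes.

    Real-time factor = processing time / audio duration.
    """
    return (audio_minutes * fastest_rtf, audio_minutes * slowest_rtf)


best, worst = processing_range_minutes(60)
print(f"60 min of audio: about {best:.0f} to {worst:.0f} minutes")
```

By the same arithmetic, a 10-minute clip would take roughly half a minute to two minutes.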


