Fal Speech-to-Text

Free.ai · stt · ~500 tokens per minute


Fal Speech-to-Text is a speech-to-text model. It runs on an external model provider and costs ~500 tokens per minute (a 50% markup on the underlying price).

Use via API

OpenAI-compatible REST API. Generate a key and call this model in seconds.

curl -X POST https://api.free.ai/v1/stt/ \
  -H "Authorization: Bearer sk-free-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"premium/speech-to-text","audio_url":"https://..."}'

Frequently asked questions

Fal Speech-to-Text transcribes spoken audio into text. Upload an MP3, WAV, M4A, or video file and Fal Speech-to-Text returns a full transcript, plus optional SRT / VTT subtitles with timestamps.

Fal Speech-to-Text can handle dozens of languages: Whisper-family models cover 90+, Parakeet covers ~25, and others vary. Select "auto-detect" or specify the language for better accuracy.

Word error rate is 5-10% on clean English audio and 10-20% on noisy or accented audio. Larger variants of the same architecture do better on difficult cases, so choose the larger one when the audio is rough.
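For reference, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal illustration, assuming plain whitespace tokenization and lowercasing; real evaluations usually also normalize punctuation and numbers. The function name is illustrative and not part of any Free.ai API.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)
```

With this definition, one wrong word in a ten-word reference yields a rate of 0.1, which sits at the upper end of the range quoted for clean English audio.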

Yes. Each segment includes start and end timestamps. Export as SRT or VTT with timing mapped directly onto your video.
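To illustrate how segment timestamps map onto subtitles, here is a minimal sketch that turns a list of segments into SRT text. It assumes each segment is a dictionary with `start` and `end` in seconds and a `text` string; those key names are an assumption for illustration, so check the actual response format in the API reference.

```python
def format_srt_time(seconds: float) -> str:
    """Render seconds as the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments shaped like {"start", "end", "text"} (assumed keys)."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_time(seg["start"])
        end = format_srt_time(seg["end"])
        # Each cue is: sequence number, time range, text, then a blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

WebVTT cues look almost the same, except that the file starts with a `WEBVTT` header line and the millisecond separator is a period rather than a comma.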

Fal Speech-to-Text is a premium transcription engine. It costs ~500-1,500 tokens per minute of audio. $1 = 750,000 tokens.
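The pricing above reduces to simple arithmetic. The sketch below estimates the dollar cost of a recording from the two figures quoted here (tokens per minute and tokens per dollar); the function name is illustrative, and actual billing may round differently.

```python
TOKENS_PER_DOLLAR = 750_000  # from the pricing statement: $1 = 750,000 tokens


def estimate_cost_usd(audio_minutes: float, tokens_per_minute: float) -> float:
    """Estimated cost in dollars for a given audio length and token rate."""
    tokens = audio_minutes * tokens_per_minute
    return tokens / TOKENS_PER_DOLLAR


# A 60-minute recording at the low and high ends of the quoted range.
low = estimate_cost_usd(60, 500)     # 30,000 tokens
high = estimate_cost_usd(60, 1_500)  # 90,000 tokens
print(f"${low:.2f} to ${high:.2f}")
```

Under these figures, an hour of audio costs roughly 4 to 12 cents.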

MP3, WAV, M4A, FLAC, OGG, plus video (MP4, MOV, WebM), from which we extract the audio. Max 500 MB per upload. Longer files? Split them with /audio/cut/ or use /v1/stt/batch/.
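A client can check these limits before uploading. The sketch below is a hypothetical local pre-flight check, not a Free.ai function. It assumes the format is judged by file extension and that 500 MB means 500,000,000 bytes; the service may count megabytes differently or inspect file contents instead.

```python
from pathlib import Path

# Extensions taken from the list of supported formats above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".mp4", ".mov", ".webm"}
MAX_UPLOAD_BYTES = 500 * 1000 * 1000  # assumption: decimal megabytes


def upload_problems(filename: str, size_bytes: int) -> list[str]:
    """Return human-readable reasons the file would likely be rejected (empty if none)."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file exceeds 500 MB; split it or use the batch endpoint")
    return problems
```

For a file on disk, the size can be read with `Path(path).stat().st_size`.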

Speaker diarization is a separate pass: toggle "diarize" on /transcribe/. Fal Speech-to-Text handles the transcription, and diarization labels each part as speaker 1 / speaker 2 / etc.

Yes. /batch/ accepts a folder of audio files. Each finished transcript appears in /account/?tab=history under the original filename. To preserve the folder tree, use the API.

Yes. POST your audio to /v1/stt/transcribe/ with model="Fal Speech-to-Text". It returns JSON with text + segments + word-level timestamps. /api/ has the full reference.
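As a rough sketch of that call from Python, the example below builds the request with the standard library only. Several details are assumptions borrowed from the curl example earlier on this page rather than confirmed for this endpoint: the `api.free.ai` host, Bearer-token authentication, and a JSON body with an `audio_url` field. No response field names are assumed; consult /api/ for the authoritative request and response format.

```python
import json
import urllib.request

# Assumption: host taken from the curl example, path from the answer above.
API_URL = "https://api.free.ai/v1/stt/transcribe/"


def build_transcribe_request(api_key: str, audio_url: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for a transcription job."""
    # Assumption: the endpoint accepts a JSON body with "model" and "audio_url".
    payload = {"model": "Fal Speech-to-Text", "audio_url": audio_url}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def transcribe(api_key: str, audio_url: str) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_transcribe_request(api_key, audio_url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect the payload or swap in a different HTTP client.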

Self-hosted models keep audio on our GPUs; premium models pass audio through to the provider under a DPA. Audio is deleted after the sharing window (24h for anonymous users, 7d for signed-in sessions). We do not train on your uploads.

Yes — Free.ai grants commercial use of transcripts. You need rights to the audio you uploaded (your own recording, licensed material, or content with consent).

The real-time factor is roughly 0.05-0.2×, so a 60-minute podcast transcribes in 3-12 minutes. Premium models often finish faster.
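The real-time factor is the ratio of processing time to audio duration, so the estimate is a single multiplication. A small sketch, with an illustrative function name:

```python
def processing_minutes(audio_minutes: float, real_time_factor: float) -> float:
    """Expected processing time: audio duration multiplied by the real-time factor."""
    return audio_minutes * real_time_factor


# The 60-minute podcast from the text, at both ends of the quoted range.
fastest = processing_minutes(60, 0.05)
slowest = processing_minutes(60, 0.2)
print(f"{fastest:.0f} to {slowest:.0f} minutes")
```

A factor below 1 means transcription runs faster than playback; queueing and upload time are not included in this estimate.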
