Fal Speech-to-Text
Free.ai · stt · ~500 tokens per minute
Fal Speech-to-Text is a speech-to-text model. It runs on an external model at ~500 tokens per minute (a 50% markup on the underlying price).
Use via API
OpenAI-compatible REST API. Generate a key and call this model in seconds.
curl -X POST https://api.free.ai/v1/stt/ \
-H "Authorization: Bearer sk-free-..." \
-H "Content-Type: application/json" \
-d '{"model":"premium/speech-to-text","audio_url":"https://..."}'
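The same request can be sketched in Python using only the standard library. The endpoint, model name, and `audio_url` field are copied from the curl call above; the helper name `build_stt_request` is hypothetical, and nothing is assumed here about the shape of the response.

```python
import json
import urllib.request

API_URL = "https://api.free.ai/v1/stt/"  # endpoint from the curl example


def build_stt_request(api_key: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {"model": "premium/speech-to-text", "audio_url": audio_url}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` with a real key and audio URL, then decode the JSON body.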
API documentation
Get an API key
Frequently asked questions
Fal Speech-to-Text transcribes spoken audio into text. Upload an MP3, WAV, M4A, or video file and Fal Speech-to-Text returns a full transcript plus optional SRT / VTT subtitles with timestamps.
Fal Speech-to-Text can handle dozens of languages: Whisper-family models cover 90+, Parakeet covers ~25, and others vary. Turn on "auto-detect" or specify the language for better accuracy.
Word error rate is 5-10% on clean English audio and 10-20% on noisy or accented audio. Larger variants of the same architecture do better on difficult cases; choose the larger one when the audio is rough.
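Word error rate is conventionally the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. A minimal sketch of that calculation follows; the function name is illustrative, and real evaluations usually normalize punctuation and casing more carefully before comparing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

For instance, one substituted word in a ten-word reference gives a rate of 0.1, the upper end of the clean-audio range quoted above.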
Yes. Each segment includes start/end timestamps. Export as SRT or VTT with timings mapped directly onto your video.
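Because each segment carries start and end times, producing SRT is mostly a formatting step. The sketch below assumes segments arrive as dictionaries with `start` and `end` in seconds and a `text` field; those field names are an assumption, not a documented response schema.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(cues)
```

WebVTT output differs mainly in starting with a `WEBVTT` header line and using a period instead of a comma before the milliseconds.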
Fal Speech-to-Text is a premium transcription engine. It uses ~500-1,500 tokens per minute of audio. $1 = 750,000 tokens.
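These pricing figures convert to dollars with simple arithmetic. The sketch below uses only the numbers quoted here (500 to 1,500 tokens per minute, 750,000 tokens per dollar); the function name is illustrative, and actual billing may round differently.

```python
TOKENS_PER_DOLLAR = 750_000  # from "$1 = 750,000 tokens"


def estimate_cost_usd(audio_minutes: float, tokens_per_minute: int) -> float:
    """Estimated dollar cost for a given audio length and token rate."""
    return audio_minutes * tokens_per_minute / TOKENS_PER_DOLLAR


low = estimate_cost_usd(60, 500)     # 30,000 tokens
high = estimate_cost_usd(60, 1_500)  # 90,000 tokens
print(f"60 minutes: ${low:.2f} to ${high:.2f}")
```

Under these assumptions an hour of audio costs roughly 4 to 12 cents.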
MP3, WAV, M4A, FLAC, OGG, plus video (MP4, MOV, WebM); we extract the audio. Max 500 MB per upload. Longer files? Split them with /audio/cut/ or use /v1/stt/batch/.
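A client can check these limits before uploading. The sketch below encodes the listed extensions and the 500 MB cap; it assumes 1 MB means 1,000,000 bytes (the page does not say whether it means MB or MiB), and the function name is hypothetical.

```python
from pathlib import Path

AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}
VIDEO_EXTS = {".mp4", ".mov", ".webm"}
MAX_UPLOAD_BYTES = 500 * 1_000_000  # assumption: decimal megabytes


def check_upload(filename: str, size_bytes: int) -> str:
    """Return 'ok', 'unsupported', or 'too_large' for a candidate upload."""
    ext = Path(filename).suffix.lower()
    if ext not in AUDIO_EXTS | VIDEO_EXTS:
        return "unsupported"
    if size_bytes > MAX_UPLOAD_BYTES:
        return "too_large"  # split the file or use the batch endpoint
    return "ok"
```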
Speaker diarization is a separate pass: toggle "diarize" on /transcribe/. Fal Speech-to-Text handles the transcription; diarization labels each part as speaker 1 / speaker 2 / etc.
Yes. /batch/ accepts a folder of audio files. Each transcript ends up in /account/?tab=history under the original file name. To preserve the folder tree, use the API.
Yes. POST your audio to /v1/stt/transcribe/ with model="Fal Speech-to-Text". It returns JSON with text + segments + word-level timestamps. /api/ has the full reference.
Self-hosted models keep audio on our GPUs; premium models pass it through under a DPA. Audio is deleted after the sharing window (24h anonymous, 7d signed in). We do not train on your uploads.
Yes — Free.ai grants commercial use of transcripts. You need rights to the audio you uploaded (your own recording, licensed material, or content with consent).
The real-time factor is about 0.05–0.2×: a 60-minute podcast transcribes in 3–12 minutes. Premium models often finish faster.
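Real-time factor is processing time divided by audio duration, so the estimate above is a single multiplication. The sketch below reproduces it with the quoted range; the function name is illustrative, and actual times depend on load and audio quality.

```python
def processing_minutes(audio_minutes: float, real_time_factor: float) -> float:
    """Expected processing time for audio of the given length."""
    return audio_minutes * real_time_factor


fastest = processing_minutes(60, 0.05)
slowest = processing_minutes(60, 0.2)
print(f"{fastest:.0f} to {slowest:.0f} minutes")  # prints "3 to 12 minutes"
```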