Free.ai ~500 tokens/msg
Fal Speech-to-Text


Fal Speech-to-Text requires purchased tokens. Get tokens | Sign Up (10K free) | Use free mode instead
Model details

Provider: Free.ai
Category: Misc
Cost: ~500 tokens/msg

About

Fal Speech-to-Text is a model in the Misc category, provided by Free.ai. Each generation costs approximately 54,000 tokens. Compare it against our self-hosted models, which run free within your daily limit.

Use via API

curl https://api.free.ai/v1/chat/ \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"premium/speech-to-text"}'
API docs
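
For readers who prefer Python over curl, the same request can be sketched with the standard library. The endpoint, the bearer header and the model name come from the curl example; the `Content-Type` header and the helper name `build_request` are assumptions for illustration, and the page does not say what other fields the body accepts.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.free.ai/v1/chat/"


def build_request(api_key: str, model: str = "premium/speech-to-text") -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = json.dumps({"model": model}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API expects a JSON content type for a JSON body.
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_request("YOUR_KEY")
    print(req.get_method(), req.full_url)
    # To actually send it (requires a valid key and purchased tokens):
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect headers and body before spending tokens.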

FAQ

Free.ai offers Whisper-powered speech to text with excellent accuracy, 99 languages, subtitle export, speaker detection, and live mic capture — completely free.

Upload an audio or video file (MP3, WAV, MP4, M4A), click Transcribe, and get accurate speech to text in seconds. Or record live from your microphone.

Yes. Paste any YouTube URL in the URL tab and the speech to text tool extracts the audio and converts it. Works with Instagram, TikTok, Spotify, and 1,300+ platforms.

Yes. Auto-detect or select from 99 languages. Our speech to text handles accents, background noise, and mixed-language audio well.

Yes. Select multiple audio files at once — each is sent through speech to text with progress tracking and the results are downloadable separately or combined.

Yes. The speech to text API at /api/ is OpenAI-compatible. Upload audio programmatically and receive JSON with the transcript, language, and timestamps.
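
As a rough illustration of handling such a response, the sketch below parses a JSON body into typed objects. The field names (`text`, `language`, `segments`, `start`, `end`) are assumptions modelled on the verbose transcription format used by OpenAI-style APIs; the page only says the JSON carries the transcript, language and timestamps, so check the API docs for the real schema.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Segment:
    start: float  # seconds from the beginning of the audio
    end: float
    text: str


@dataclass
class Transcript:
    text: str
    language: str
    segments: list = field(default_factory=list)


def parse_transcript(raw: str) -> Transcript:
    """Parse a JSON response body; all key names here are assumed, not documented."""
    data = json.loads(raw)
    segments = [
        Segment(float(s["start"]), float(s["end"]), s["text"].strip())
        for s in data.get("segments", [])
    ]
    return Transcript(
        text=data.get("text", ""),
        language=data.get("language", ""),
        segments=segments,
    )


if __name__ == "__main__":
    sample = '{"text": "Hello world", "language": "en", "segments": [{"start": 0.0, "end": 1.2, "text": " Hello world"}]}'
    print(parse_transcript(sample))
```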

Yes. Toggle Speaker Detection before uploading and the speech to text output is labelled per speaker (Speaker 1, Speaker 2…). Adds 50% to token cost.
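
The 50% surcharge is simple to estimate up front. The helper below is a hypothetical sketch: `estimate_tokens` is not part of any Free.ai API, the base cost is whatever figure applies to your request, and rounding up to a whole token is an assumption.

```python
def estimate_tokens(base_tokens: int, speaker_detection: bool = False) -> int:
    """Estimate token cost, adding 50% when Speaker Detection is enabled."""
    if base_tokens < 0:
        raise ValueError("base_tokens must be non-negative")
    if speaker_detection:
        # base * 1.5 in integer arithmetic, rounded up (rounding is an assumption).
        return (base_tokens * 3 + 1) // 2
    return base_tokens


if __name__ == "__main__":
    print(estimate_tokens(500))                          # base cost unchanged
    print(estimate_tokens(500, speaker_detection=True))  # base cost plus 50%
```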

Speech to text accepts files up to 500MB per upload. For multi-hour content, split the audio into chunks first.
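
One way to split long audio before uploading is sketched below using only Python's standard `wave` module. It assumes uncompressed WAV input (MP3, MP4 or M4A would need an external tool), and the function name, output naming and default chunk length are illustrative choices, not something the service prescribes.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, chunk_seconds: int = 600):
    """Split a WAV file into consecutive chunks of at most chunk_seconds each."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        frames_per_chunk = reader.getframerate() * chunk_seconds
        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break
            path = out_dir / f"chunk_{index:03d}.wav"
            with wave.open(str(path), "wb") as writer:
                # Copy channels, sample width and rate; the frame count in
                # the header is corrected automatically when the file closes.
                writer.setparams(params)
                writer.writeframes(frames)
            paths.append(path)
            index += 1
    return paths


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 3:
        for p in split_wav(sys.argv[1], sys.argv[2]):
            print(p)
```

Cutting at fixed intervals can split a word across two chunks, so a small overlap or silence-based cut points may give cleaner transcripts.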

Very accurate for clear audio — typically 95%+ word accuracy in English with our Whisper large-v3 backend. Quality depends on audio clarity, accent, and background noise.
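
To check accuracy on your own recordings, a common approach is to compare the transcript against a hand-corrected reference and compute word accuracy as one minus the word error rate. The sketch below assumes that definition; the page does not state how its 95% figure is measured, and the lowercase, whitespace-split normalization is a simplification.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - WER, where WER is word-level edit distance / reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    wer = prev[-1] / len(ref)
    # Clamp at zero: many insertions can push WER above 1.
    return max(0.0, 1.0 - wer)


if __name__ == "__main__":
    print(word_accuracy("the quick brown fox", "the quick brown box"))
```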

Yes. The transcript is fully editable in-place. Fix errors, reformat, and copy/download as TXT, SRT, or VTT.
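
For reference, SRT is a plain-text format: a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the caption text, then a blank line. The sketch below builds it from `(start, end, text)` tuples; that input shape and the function names are assumptions for illustration, not the tool's internal format.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render an iterable of (start_seconds, end_seconds, text) as SRT."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Each block already ends with a newline; joining adds the blank separator line.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0.0, 2.5, "Hello"), (2.5, 4.0, "World")]))
```

WebVTT is similar but starts with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds.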

Yes. Audio is processed on our own GPUs and deleted after speech to text completes. Nothing is stored long-term, shared, or used for training.

Yes. Upload an audio or video file in /chat/ and ask the AI to transcribe it — combine speech to text with follow-up questions and summarization in one workflow.
