Bark
Free.ai (self-hosted)
·
tts
·
~500 tokens per clip
Bark — MIT licensed text-to-audio model that generates speech, music, sound effects, and non-speech sounds (laughter, sighs) in the same clip.
Use via API
curl -X POST https://api.free.ai/v1/tts/ \
-H "Authorization: Bearer sk-free-..." \
-H "Content-Type: application/json" \
-d '{"model":"bark","text":"hello world"}'
API Documentation
Get API Key
Frequently Asked Questions
Bark supports a wide range of languages. The exact list depends on the engine; the form on this page accepts any text and the engine will render in its supported languages. See /voice/ for the full multi-engine picker if you need a specific language.
Most engines render neutral-American English by default and a region-appropriate accent for non-English languages. Premium engines may expose accent variants — paste a sample to compare.
SSML support varies by engine. Pause, prosody, and emphasis tags are honored on most premium engines and on a few self-hosted ones. Plain text always works — no markup required.
Streaming TTS is available on premium engines via the /v1/tts/ API endpoint with stream=true. The web UI on this page returns the full clip once rendering finishes.
Bark runs on our own GPUs. Generation draws from your daily free pool first. Once depleted, paid tokens start at $5 → 200,000 tokens. Roughly ~5 tokens per character, minimum 100 per clip.
Up to 5,000 characters per request on the web UI. For longer pieces (audiobooks, full chapters), use /voice/audiobook/ which chunks and stitches automatically, or call the API in a loop.
Yes — POST a list of strings to /v1/tts/batch/, or use the workspace UI at /workspace/ to chain TTS into a longer pipeline (e.g., translate → speak → stitch).
Yes — POST text to /v1/tts/ with model="Bark" (or the slug on this page). Returns WAV or MP3. See /api/ for full reference + SDK snippets.
This page is text-to-speech, not voice cloning — the voice is the engine's default. For voice cloning (uploading a reference audio), see /voice/clone/, which requires you to either own the voice rights or have explicit written consent.
Self-hosted engines run on Free.ai-owned GPUs; nothing leaves our servers. Premium engines pass text to upstream model providers under our DPA. We do not train on your inputs and do not sell data.
Yes — Free.ai grants commercial use of generated audio. The engine's underlying license (Apache 2.0, MIT, or vendor terms) is shown above and on the model reference page; in practice this means voiceovers, ads, podcasts, and apps are all in-scope.
Yes — failed jobs auto-refund to the source (daily pool or paid tokens). If a refund does not show up the same day, email contact@free.ai.