AI Voice — Sesame CSM-1B

Kommersiell användning OK 380+ modeller Inget vattenmärke Ingen registrering behövs
Förlaga:
+ GPT-5, Claude, Gemini
TTS-motor Självvärdig Apache 2.0
Sesame CSM-1B — Sesame CSM-1B — Apache 2.0. Conversational Speech Model designed for low-latency, real-time voice. 24 kHz output, sounds best with a short reference-audio context turn. Self-hosted on Free.ai for the /voice/realtime/ tool.
0 tecken - 0 polletter.
Kostnadsskalor med teckenräkning
Skapar tal...

Vad gör Sesame CSM-1B Låter det som?

Sesame CSM-1B — Apache 2.0. Conversational Speech Model designed for low-latency, real-time voice. 24 kHz output, sounds best with a short reference-audio context turn. Self-hosted on Free.ai for the /voice/realtime/ tool.

Prova rutan ovan med: Hej, mitt namn är Sam, och jag läser detta prov för att visa rösten. — Det är den kanoniska TTS demo fras.

När du ska använda Sesame CSM-1B

Ljudböcker

Långformig berättande med konsekvent ton. Klistra in ett kapitel åt gången, ladda ner som WAV eller MP3 och sy externt.

Podcast-intros

Korta öppna stötfångare och ad-reads. Justera hastigheten för energi, format-brytare till MP3 för mindre filer.

IVR + röstbrevlåda

Telefon-system samtal. Studio-kvalitet utdata utan bokning, inspelning, eller NDAs med röst talang.

Tillgänglighet

Lägg till ljud tillsammans med skriftligt innehåll för låg-vision och dyslexiska läsare. Drop-in på någon sida.

Provfraser

"Welcome to the show, today we are exploring the future of AI."
"Your package has arrived. Please retrieve it from the front desk."
"Once upon a time, in a quiet village far away, lived a curious child."
"Press one for sales, two for support, or stay on the line for an agent."
"Breaking news: scientists have discovered a new species of deep-sea fish."
"Thank you for choosing us. We appreciate your business and look forward to serving you again."

Prissättning

Själv-värd på våra GPUs. Generation drar från din dagliga fri pool först; när det tar slut, betalas token förpackningar börjar på $ 5 → 200.000 polletter. Grovt ~5 polletter per karaktär, minst 100 per klipp.

Fullständig modellreferens → · Se alla TTS-röster → · Jämför 2 röster sida vid sida →

Avancerade alternativ
Resultat
Tokens börjar ta slut. Get More Tokens
Want better results? Premiemodeller (GPT-5, Claude, Gemini) deliver higher quality. View Plans

❤️ Love this tool? Share it!

< a href="/signup/" style="color:#16A34A">Registrera dig för att få en referenslänk och tjäna 25 000 polletter per vän.

Vill du ha mer? Sign up free for 10,000 tokens
Registrera dig gratis

Bearbetning av din begäran...

Sesame CSM-1B — Apache 2.0. Conversational Speech Model designed for low-latency, real-time voice. 24 kHz output, sounds best with a short reference-audio …

Hur du använder AI Voice — Sesame CSM-1B

1
Ange din inmatning

Skriv text, ladda upp en fil eller beskriv vad du vill. Inget konto behövs.

2
Klicka på generera

Vår AI behandlar din begäran på några sekunder med hjälp av de bästa open-source modellerna.

3
Ladda ner & resurs

Ladda ner, kopiera eller dela ditt resultat. Gratis för personligt och kommersiellt bruk.

Använd det här verktyget via API

Automatisera detta verktyg från din egen kod. OpenAI-kompatibel REST endpoint, Bearer-token auth, ingen extra SDK krävs. Token kostnader matchar webbgränssnittet.

curl -X POST https://api.free.ai/v1/tts/ \
  -H "Authorization: Bearer sk-free-..." \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello from Free.ai", "voice": "af_heart", "model": "kokoro"}'

AI Voice — Sesame CSM-1B — FAQ

Sesame CSM-1B supports a wide range of languages. The exact list depends on the engine; the form on this page accepts any text and the engine will render in its supported languages. See /voice/ for the full multi-engine picker if you need a specific language.

Most engines render neutral-American English by default and a region-appropriate accent for non-English languages. Premium engines may expose accent variants — paste a sample to compare.

SSML support varies by engine. Pause, prosody, and emphasis tags are honored on most premium engines and on a few self-hosted ones. Plain text always works — no markup required.

Streaming TTS is available on premium engines via the /v1/tts/ API endpoint with stream=true. The web UI on this page returns the full clip once rendering finishes.

Sesame CSM-1B runs on our own GPUs. Generation draws from your daily free pool first. Once depleted, paid tokens start at $5 → 200,000 tokens. Roughly ~5 tokens per character, minimum 100 per clip.

Up to 5,000 characters per request on the web UI. For longer pieces (audiobooks, full chapters), use /voice/audiobook/ which chunks and stitches automatically, or call the API in a loop.

Yes — POST a list of strings to /v1/tts/batch/, or use the workspace UI at /workspace/ to chain TTS into a longer pipeline (e.g., translate → speak → stitch).

Yes — POST text to /v1/tts/ with model="Sesame CSM-1B" (or the slug on this page). Returns WAV or MP3. See /api/ for full reference + SDK snippets.

This page is text-to-speech, not voice cloning — the voice is the engine's default. For voice cloning (uploading a reference audio), see /voice/clone/, which requires you to either own the voice rights or have explicit written consent.

Self-hosted engines run on Free.ai-owned GPUs; nothing leaves our servers. Premium engines pass text to upstream model providers under our DPA. We do not train on your inputs and do not sell data.

Yes — Free.ai grants commercial use of generated audio. The engine's underlying license (Apache 2.0, MIT, or vendor terms) is shown above and on the model reference page; in practice this means voiceovers, ads, podcasts, and apps are all in-scope.

Yes — failed jobs auto-refund to the source (daily pool or paid tokens). If a refund does not show up the same day, email contact@free.ai.

Registrera dig gratis för 10 000 polletter

Skapa gratis konto

Inget kreditkort krävs

Hur skulle du värdera det här verktyget?

4.3/5 from 3 ratings

Love this tool? Share it!