Swahili Transcription

Transcribe Swahili audio and video to text with AI. Fast, accurate, and free.

How It Works

  1. Go to the Free.ai Transcriber
  2. Upload your Swahili audio or video file
  3. Our AI automatically detects Swahili and transcribes it
  4. Download your transcript as plain text or as an SRT subtitle file

Swahili Transcription Features

  • Powered by faster-whisper (MIT licensed)
  • Automatic detection of Swahili
  • Supports MP3, WAV, MP4, M4A, FLAC, and many more formats
  • Timestamps and subtitle (SRT) export
  • No file size limit on paid plans
  • Privacy and security: files are deleted after processing

Language Details

Language: Swahili
ISO code: sw
AI model: faster-whisper
Price: Free


Frequently Asked Questions

Swahili is a low-resource language for Whisper: large-v3-turbo sits above a 25% word error rate, and sometimes much higher. The transcript is useful for search and for getting the gist, but it should not be treated as publication-ready. When a higher-quality engine becomes available for Swahili, we route to it automatically. (Tier D: over 25% word error rate in benchmark settings. We publish honest WER tiers rather than marketing claims.)
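For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between a reference transcript and the machine output, divided by the number of reference words. The sketch below illustrates that standard definition only. It is not Free.ai's benchmark code, and the function name and sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word and one dropped word out of five.
print(word_error_rate("habari ya asubuhi rafiki yangu", "habari za asubuhi rafiki"))
```

Under this definition a score of 0.25 means roughly one reference word in four needs correcting.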

Yes. Swahili transcription draws first from your free daily token pool. Audio costs 50 tokens per minute, so the anonymous daily pool covers a few hours of audio per day. Registered accounts get a larger pool plus a 10,000-token sign-up bonus. Beyond that, $1 buys 750,000 tokens (250 hours of audio).

Swahili transcripts are returned in standard UTF-8, using the language's standard orthography.

MP3, WAV, M4A, FLAC, OGG, OPUS, and WEBM are accepted directly. For video (MP4, MOV, MKV) we extract the audio server-side before sending it to Whisper, so you do not need to convert anything yourself. The pipeline is the same regardless of source language, including Swahili.

Anonymous uploads can be up to 500 MB per file. Registered accounts can upload up to 2 GB. Duration is not a hard limit: long files are split automatically into 30-minute windows and then stitched back into a single transcript with continuous timestamps. Multi-hour Swahili recordings (podcasts, full lectures, meetings) work fine.
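The split-and-stitch idea can be sketched as follows. This is a minimal illustration that assumes simple non-overlapping windows and a hypothetical segment shape (`start`, `end`, `text`, with times in seconds relative to each window). The actual pipeline is not described beyond the 30-minute window size, so the function names and details here are assumptions.

```python
WINDOW_SECONDS = 30 * 60  # 30-minute windows, as described above


def plan_windows(duration_s: float, window_s: float = WINDOW_SECONDS):
    """Return (start, end) pairs, in seconds, that cover the whole recording."""
    windows = []
    start = 0.0
    while start < duration_s:
        end = min(start + window_s, duration_s)
        windows.append((start, end))
        start = end
    return windows


def merge_segments(windows, per_window_segments):
    """Shift each window's local timestamps by the window start, then concatenate."""
    merged = []
    for (win_start, _), segments in zip(windows, per_window_segments):
        for seg in segments:
            merged.append({
                "start": seg["start"] + win_start,
                "end": seg["end"] + win_start,
                "text": seg["text"],
            })
    return merged


# A 75-minute recording becomes two full windows plus a 15-minute remainder.
print(plan_windows(75 * 60.0))
```

Offsetting by the window start is what keeps timestamps continuous across the stitched transcript.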

Yes. Speaker diarization is applied automatically to every Swahili transcript. The output is labeled Speaker 1 / Speaker 2 / Speaker 3 with timestamps, so interviews, panel discussions, and multi-party meetings come back labeled. Diarization runs on a separate model and works the same way across every language we support.

Yes. Paste the URL into /transcribe/youtube/ for YouTube or /transcribe/podcast/ for podcast feeds (Apple, Spotify, RSS). We download the audio, run it through Whisper with language=sw, and return a transcript with timestamps and speaker labels. The most common Swahili workloads are WhatsApp voice notes, YouTube vlogs, and short videos: paste the URL into /transcribe/youtube/ or upload the audio directly.

Whisper costs about 50 tokens per minute of audio, so a one-hour recording costs about 3,000 tokens. $1 buys 750,000 tokens, which works out to about 250 hours of audio per dollar. Most users spend nothing: the free daily pool covers short clips, voice notes, and similar podcast episodes.
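The arithmetic behind those figures can be checked with a few lines. This is only a sketch using the two rates quoted above. The function names are hypothetical, and the free daily pool and sign-up bonus are ignored.

```python
TOKENS_PER_MINUTE = 50       # audio rate quoted above
TOKENS_PER_DOLLAR = 750_000  # purchase rate quoted above


def tokens_for_audio(minutes: float) -> float:
    """Tokens consumed by a recording of the given length."""
    return minutes * TOKENS_PER_MINUTE


def dollars_for_audio(minutes: float) -> float:
    """Cost in dollars if every token had to be purchased."""
    return tokens_for_audio(minutes) / TOKENS_PER_DOLLAR


def audio_hours_per_dollar() -> float:
    """Hours of audio that one dollar's worth of tokens covers."""
    return TOKENS_PER_DOLLAR / TOKENS_PER_MINUTE / 60


print(tokens_for_audio(60))      # tokens for a one-hour recording
print(audio_hours_per_dollar())  # hours of audio per dollar
```

At these rates a one-hour recording comes to 3,000 tokens, or less than half a cent.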

Yes. Both segment-level timestamps (roughly every 10-30 seconds) and word-level timestamps are available. Word-level is the default for VTT/SRT subtitle export, so captions stay in sync line by line. In the API, set timestamps="word" in the request body.
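To show how timestamps map onto subtitles, here is a minimal sketch that turns timestamped segments into SRT cues. The segment shape (`start` and `end` in seconds, plus `text`) is an assumption, since the response schema is not spelled out here. The SRT layout itself (cue number, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) is the usual SubRip convention.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT time code, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render segments (hypothetical dicts with start, end, text) as SRT text."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


print(to_srt([
    {"start": 0.0, "end": 2.5, "text": "Habari ya asubuhi."},
    {"start": 2.5, "end": 5.0, "text": "Karibu sana."},
]))
```

The same function works for word-level data if each word is passed as its own short segment, though real captions usually group several words per cue.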

Yes. POST the audio (multipart/form-data, field name "file") to /v1/transcribe/ with language=sw, or omit the language parameter so that Whisper can auto-detect it. The endpoint returns JSON with the language, segments, timestamps, and speaker labels. Full documentation and SDK snippets are at /api/.
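A request of that shape might be assembled as in the sketch below, which uses only the Python standard library and builds the request without sending it. Several details are assumptions rather than documented facts: the host in `BASE_URL`, the form field names `file` and `language`, and the absence of any authentication header. Check the documentation at /api/ before relying on any of them.

```python
import urllib.request
import uuid
from typing import Optional

BASE_URL = "https://free.ai"  # assumption: the API host is not stated above


def build_transcribe_request(filename: str, audio: bytes,
                             language: Optional[str] = "sw") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST to /v1/transcribe/."""
    boundary = uuid.uuid4().hex
    parts = []
    if language is not None:  # pass None to leave language detection to the server
        parts.append((
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="language"\r\n\r\n'
            f"{language}\r\n"
        ).encode())
    # The audio itself goes in a form field assumed to be named "file".
    parts.append((
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + audio + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        BASE_URL + "/v1/transcribe/",
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


request = build_transcribe_request("clip.mp3", b"\x00\x01\x02")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to `urllib.request.urlopen` and decoding the JSON body. Any credentials the service requires would need to be added as headers first.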

Yes. Once transcription has finished, click Translate or paste the text into /translate/. Swahili pairs with any other language we support (200+). For a summary, pass the transcript to /summarize/. To get audio in the target language, send the translation to /voice/tts/.

Whisper's noise training helps less at this tier: the bottleneck is the amount of Swahili audio Whisper saw during training, not noise. Clean studio audio still beats noisy audio, but neither will reach the accuracy you would get on a high-resource language. If a transcript comes back unusable, email contact@free.ai with the file. We will refund the tokens and check whether a different engine handles your audio better.
