Shona Transcription
Transcribe Shona audio and video to text with AI. Fast, accurate, and free.
How it works
- Go to the Free.ai Transcriber
- Upload your Shona audio or video file
- Our AI automatically detects Shona and transcribes it
- Download your transcript as text or as SRT subtitles
Shona Transcription Features
- ✓ Powered by faster-whisper (MIT licensed)
- ✓ Automatic Shona language detection
- ✓ Supports MP3, WAV, MP4, M4A, FLAC, and many more
- ✓ Timestamps and subtitle (SRT) export
- ✓ No file size limit on paid plans
- ✓ Privacy and security: files are deleted after processing
Language Details
| Language | Shona |
| ISO code | sn |
| AI model | faster-whisper |
| Price | Free |
More Languages
See all languages
Frequently Asked Questions
Shona is a low-resource language for Whisper: large-v3-turbo sits above a 25% word error rate, and sometimes much higher. The transcript works for search and for getting the gist, but it should not be treated as publication-ready. When a higher-quality engine becomes available for Shona, we route to it automatically. (Tier D: over 25% word error rate in the benchmark setting. We publish honest WER tiers rather than marketing claims.)
Yes. Shona transcription draws first on your free daily token pool. Audio costs 50 tokens per minute, so the anonymous daily pool covers a few hours of audio per day. Registered accounts get a larger pool plus 10,000 signup tokens. Beyond that, $1 buys 750,000 tokens (250 hours of audio).
Shona transcripts are returned in standard UTF-8, using the language's standard orthography.
MP3, WAV, M4A, FLAC, OGG, OPUS, and WEBM are accepted directly. For video (MP4, MOV, MKV) we extract the audio server-side before sending it to Whisper, so you do not need to convert anything yourself. The pipeline is the same regardless of source language, including Shona.
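As a rough illustration of the split described above, the following sketch sorts a filename into "accepted directly" versus "audio extracted server-side". The extension sets come from the answer; the function name and the `unknown` fallback are assumptions for illustration, not the service's actual validation logic.

```python
from pathlib import Path

# Extensions listed in the answer above; the set names are illustrative.
DIRECT_AUDIO = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".opus", ".webm"}
VIDEO_NEEDS_EXTRACTION = {".mp4", ".mov", ".mkv"}


def classify_upload(filename: str) -> str:
    """Return 'audio', 'video', or 'unknown' based on the file extension."""
    ext = Path(filename).suffix.lower()
    if ext in DIRECT_AUDIO:
        return "audio"
    if ext in VIDEO_NEEDS_EXTRACTION:
        return "video"
    return "unknown"
```

A check like this is only a client-side convenience; the server may accept other containers that the page does not list.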
Anonymous uploads can be up to 500 MB per file. Registered accounts can upload up to 2 GB. Duration is not a hard limit: long files are split automatically into 30-minute windows and then stitched back into a single transcript with continuous timestamps. Multi-hour Shona recordings (podcasts, full lectures, meetings) work fine.
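The windowing and stitching idea can be sketched as follows. This is a naive fixed-window version written for illustration: the page does not say whether the real pipeline overlaps windows or cuts on silence, and the segment shape (`start`, `end`, `text`) is an assumed one.

```python
WINDOW_SECONDS = 30 * 60  # 30-minute windows, as stated in the answer above


def window_bounds(duration_s, window_s=WINDOW_SECONDS):
    """Return (start, end) pairs in seconds that cover the whole recording."""
    bounds = []
    start = 0.0
    while start < duration_s:
        end = min(start + window_s, duration_s)
        bounds.append((start, end))
        start = end
    return bounds


def stitch(windows):
    """Merge per-window segments into one list with continuous timestamps.

    `windows` is a list of (window_start_s, segments) pairs. Each segment is
    a dict with 'start' and 'end' (seconds relative to its own window) and
    'text'. The window start is added back so times refer to the full file.
    """
    merged = []
    for offset, segments in windows:
        for seg in segments:
            merged.append({
                "start": seg["start"] + offset,
                "end": seg["end"] + offset,
                "text": seg["text"],
            })
    return merged
```

For a 75-minute recording, `window_bounds(4500)` yields three windows, the last one 15 minutes long.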
Yes. Speaker diarization is applied automatically to all Shona transcripts. The output is labeled Speaker 1 / Speaker 2 / Speaker 3 with timestamps, so interviews, panel discussions, and multi-party meetings come back labeled. Diarization runs on a separate model and works the same way across all the languages we support.
Yes. Paste the URL at /transcribe/youtube/ for YouTube or at /transcribe/podcast/ for podcast feeds (Apple, Spotify, RSS). We download the audio, run it through Whisper with language=sn, and return a transcript with timestamps and speaker labels. Common Shona use cases all work: lectures, interviews, voice memos, and Shona YouTube videos. Paste the URL at /transcribe/youtube/ or upload the file directly.
Whisper costs about 50 tokens per minute of audio, so a one-hour recording costs about 3,000 tokens. $1 buys 750,000 tokens, which works out to roughly 250 hours of audio per dollar. Most users spend nothing: the free daily pool covers short clips, voice notes, and similarly short podcasts.
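The arithmetic behind those figures, as a small sketch. The two constants are the rates quoted above (the per-minute rate is stated as approximate); the function names are illustrative.

```python
TOKENS_PER_MINUTE = 50        # approximate rate quoted above
TOKENS_PER_DOLLAR = 750_000   # purchase rate quoted above


def tokens_for(minutes):
    """Tokens consumed by a recording of the given length in minutes."""
    return minutes * TOKENS_PER_MINUTE


def dollars_for(minutes):
    """Cost in dollars for a recording, ignoring any free daily pool."""
    return tokens_for(minutes) / TOKENS_PER_DOLLAR


def hours_per_dollar():
    """Hours of audio that one dollar of tokens covers."""
    return TOKENS_PER_DOLLAR / TOKENS_PER_MINUTE / 60
```

Here `tokens_for(60)` gives 3,000 tokens for an hour of audio, and `hours_per_dollar()` gives 250, matching the numbers in the answer.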
Yes. Both segment-level timestamps (each segment is roughly 10-30 seconds) and word-level timestamps are available. Word-level is the default for VTT/SRT subtitle export, so captions synchronize line by line. In the API, set timestamps="word" in the request body.
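To show what a subtitle export involves, here is a minimal sketch that renders timestamped segments as SRT. The segment shape (dicts with `start`, `end` in seconds, and `text`) is an assumption for illustration; the service's own export may group words into caption lines differently.

```python
def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm form used by SRT."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render segments (dicts with 'start', 'end', 'text') as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

SRT uses a comma before the milliseconds, whereas VTT uses a period, so a VTT exporter would need a slightly different timestamp formatter.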
Yes. POST the audio (multipart/form-data, field name "file") to /v1/transcribe/ with language=sn, or omit the language parameter so that Whisper can auto-detect it. The endpoint returns JSON with the language, segments, timestamps, and speaker labels. Full documentation and SDK snippets are at /api/.
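A minimal sketch of such a request using only the Python standard library. Several details are assumptions rather than documented facts: the host `https://free.ai` (inferred from the contact address), the form field names `file` and `language`, and the absence of an authentication header. The function only builds the request so that it can be inspected before anything is sent.

```python
import uuid
import urllib.request

BASE_URL = "https://free.ai"  # assumption: host inferred from the contact address


def build_transcribe_request(audio_bytes, filename, fields=None,
                             file_field="file", base_url=BASE_URL):
    """Build (but do not send) a multipart/form-data POST to /v1/transcribe/.

    `fields` holds the plain form fields; it defaults to language=sn.
    Pass an empty dict to omit the language and rely on auto-detection.
    The field names used here are assumptions, not confirmed API names.
    """
    if fields is None:
        fields = {"language": "sn"}
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append((
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8"))
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + audio_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        f"{base_url}/v1/transcribe/",
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

To send it, pass the result to `urllib.request.urlopen` and parse the response body with `json.load`. Extra body fields, such as word-level timestamps, can be added through `fields`, for example `fields={"language": "sn", "timestamps": "word"}`.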
Yes. Once the transcription is finished, click Translate or paste the text at /translate/. Shona pairs with any other language we support (200+). To summarize a conversation, pass the transcript to /summarize/; send the translation to /voice/tts/ to produce audio in the target language.
Whisper's noise training helps less at this tier: the bottleneck is the amount of Shona audio Whisper saw during training, not noise. Clean studio audio still beats noisy audio, but neither will reach the accuracy you would get on a high-resource language. If the transcript comes back unusable, email contact@free.ai with the file. We will refund the tokens and check whether a different engine handles your audio better.