English Transcription

Transcribe English audio and video to text with AI. Fast, accurate, and free.

How it works

  1. Go to the Free.ai Transcriber
  2. Upload your English audio or video file
  3. Our AI automatically detects English and transcribes it
  4. Download your transcript as plain text or as an SRT subtitle file

English Transcription Features

  • Powered by faster-whisper (MIT licensed)
  • Automatic English language detection
  • Supports MP3, WAV, MP4, M4A, FLAC, and many more
  • Timestamps and subtitle (SRT) export
  • No file size limit on paid plans
  • Privacy and security: files are deleted after processing

Language Details

Language: English
ISO code: en
AI model: faster-whisper
Price: Free

Frequently Asked Questions

Whisper large-v3-turbo falls in the top accuracy tier for English: under 7% word error rate (WER) on standard benchmarks. In practice, clean studio audio comes back nearly perfect, and conversational audio is usable with minor edits. (Tier A: under 7% word error rate in benchmark settings. We publish honest WER tiers rather than marketing claims.)
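
For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal illustration of that definition, not the benchmark harness used for the figure above; the function name and the lowercase-and-split normalization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, one wrong word in a six-word sentence gives a WER of about 16.7%, so a figure under 7% corresponds to fewer than 7 word errors per 100 reference words.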

Yes. English transcription draws first from your free daily token pool. Audio costs 50 tokens per minute, so the anonymous daily pool covers a few hours of audio per day. Registered accounts get a larger pool plus a 10,000-token signup bonus. Beyond that, $1 buys 750,000 tokens (250 hours of audio).

English transcription covers US, UK, Australian, Indian, and other major varieties of English in a single model. Whisper was trained on all of them, and the transcript comes out in standard written English regardless of the speaker's accent.

MP3, WAV, M4A, FLAC, OGG, OPUS, and WEBM are accepted directly. For video (MP4, MOV, MKV), we extract the audio server-side before sending it to Whisper, so you do not need to convert anything yourself. The pipeline is the same regardless of the source language, including English.

Anonymous uploads are capped at 500 MB per file. Registered accounts can upload up to 2 GB. Duration is not a hard limit: long files are split automatically into 30-minute windows and then stitched back into a single transcript with continuous timestamps. Multi-hour English recordings (podcasts, full lectures, meetings) work fine.
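
As a rough illustration of how per-window results can be joined into one transcript with continuous timestamps, the sketch below shifts each window's segment times by that window's start offset. It assumes fixed, non-overlapping 30-minute windows and a simple `Segment` record; both are assumptions for the example, and the actual pipeline may handle window boundaries differently.

```python
from dataclasses import dataclass
from typing import List

WINDOW_SECONDS = 30 * 60  # 30-minute windows, as described above


@dataclass
class Segment:
    start: float  # seconds, relative to the start of its window
    end: float
    text: str


def stitch(chunks: List[List[Segment]], window: float = WINDOW_SECONDS) -> List[Segment]:
    """Merge per-window segments into one list with file-relative timestamps."""
    merged: List[Segment] = []
    for index, chunk in enumerate(chunks):
        offset = index * window  # where this window begins in the full file
        for seg in chunk:
            merged.append(Segment(seg.start + offset, seg.end + offset, seg.text))
    return merged
```

With this scheme, a segment that starts 10 seconds into the second window ends up at 1,810 seconds in the stitched transcript.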

Yes. Speaker diarization is applied automatically to every English transcript. The output is labeled Speaker 1 / Speaker 2 / Speaker 3 with timestamps, so interviews, panel discussions, and multi-party meetings come back labeled. Diarization runs on a separate model and works the same way across every language we support.

Yes. Paste the URL at /transcribe/youtube/ for YouTube or at /transcribe/podcast/ for podcast feeds (Apple, Spotify, RSS). We download the audio, run it through Whisper with language=en, and return a transcript with timestamps and speaker labels. Common English use cases all work, including lectures, interviews, voice memos, and English YouTube videos: paste the URL at /transcribe/youtube/ or upload the file directly.

Whisper costs about 50 tokens per minute of audio, so a one-hour recording costs about 3,000 tokens. $1 buys 750,000 tokens, which works out to about 250 hours of audio per dollar. Most users never pay anything: the free daily pool covers short clips, voice notes, and similarly short podcasts.
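
The arithmetic above can be checked with a few lines of code. This is a sketch using only the two rates quoted in the text; the function names are made up for the example, and rounding partial tokens up is an assumption, since the text does not say how fractions of a minute are billed.

```python
import math

TOKENS_PER_MINUTE = 50       # quoted rate for audio
TOKENS_PER_DOLLAR = 750_000  # quoted purchase rate


def tokens_for(minutes: float) -> int:
    """Tokens needed for a recording; rounding up is an assumption."""
    return math.ceil(minutes * TOKENS_PER_MINUTE)


def dollars_for(minutes: float) -> float:
    """Cost in dollars if every token were purchased rather than free."""
    return tokens_for(minutes) / TOKENS_PER_DOLLAR


# Hours of audio that one dollar buys at the quoted rates.
HOURS_PER_DOLLAR = TOKENS_PER_DOLLAR / TOKENS_PER_MINUTE / 60
```

A one-hour recording comes to 3,000 tokens, or less than half a cent at the paid rate, and `HOURS_PER_DOLLAR` evaluates to 250, matching the figures quoted above.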

Yes. Both segment-level timestamps (roughly 10-30 seconds per segment) and word-level timestamps are available. Word-level is the default for VTT/SRT subtitle export, so captions stay in sync line by line. In the API, set timestamps="word" in the request body. English transcripts are returned as standard UTF-8 text in the language's standard orthography.
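
To show what a timestamped transcript looks like once exported as subtitles, the sketch below renders segments in the SRT layout: a sequence number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time line, then the caption text. The `(start, end, text)` tuple shape is a hypothetical stand-in for the example, not the service's documented response format.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start_seconds, end_seconds, text) tuples as SRT cues."""
    blocks = []
    for number, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

For example, `to_srt([(0.0, 2.5, "Hello")])` produces a single cue running from `00:00:00,000` to `00:00:02,500`.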

Yes. POST the audio (multipart/form-data, field name "file") to /v1/transcribe/ with language=en, or omit the language parameter so Whisper can detect the language automatically. The response is JSON containing the language, segments, timestamps, and speaker labels. Full documentation and SDK snippets are at /api/.
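
A request along these lines could be assembled with the Python standard library, as sketched below. Several details are assumptions rather than documented behavior: the host in `BASE_URL` is a placeholder, the field names are taken to be `file` and `language` (following the `language=en` form used elsewhere on this page), the language is sent as a form field rather than a query parameter, and authentication is omitted entirely. Check /api/ for the real contract before relying on any of it.

```python
import uuid
import urllib.request
from typing import Optional

BASE_URL = "https://example.invalid"  # placeholder host, not the real endpoint


def build_transcribe_request(
    audio_bytes: bytes,
    filename: str,
    language: Optional[str] = "en",
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST to /v1/transcribe/."""
    boundary = uuid.uuid4().hex
    parts = []
    if language is not None:
        # Assumed: the language travels as a plain form field.
        parts.append(
            (
                f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="language"\r\n\r\n'
                f"{language}\r\n"
            ).encode("utf-8")
        )
    # The audio itself, under the assumed field name "file".
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        + audio_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        BASE_URL + "/v1/transcribe/",
        data=b"".join(parts),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Passing `language=None` leaves the field out, which corresponds to letting Whisper detect the language. Sending the request would then be a matter of calling `urllib.request.urlopen(req)` and parsing the JSON body.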

Yes. Once the transcription is finished, click Translate or paste the text at /translate/. English pairs with any other language we support (200+). You can also pass the transcript through /summarize/, or send the translation to /voice/tts/ to produce audio in the target language.

Whisper was trained on 680K hours of real-world audio, so English transcription is very robust to background noise, music beds, and phone-quality recordings. Severe clipping or more than 100 speakers will still hurt accuracy. If a transcript comes back unusable, email contact@free.ai with the file. We will refund the tokens and check whether a different engine handles your audio better.
