arXiv PDF Extractor

Commercial use allowed · 380+ models · No watermark · No sign-up required
Drop in an arXiv preprint, journal paper, or thesis chapter, and the AI converts it to clean, LaTeX-flavored text. Mathematical equations stay equations, multi-column layouts are unrolled, and citations are preserved. Powered by Meta Nougat-base.


Drop in an arXiv preprint and get clean, LaTeX-flavored text with every equation kept inline. Multi-column layouts are unrolled and references are preserved. Free, AI-powered.

How to use arXiv PDF Extractor

1
Provide your input

Paste text, upload a file, or describe what you want. No account required.

2
Click Generate

Our AI processes your request in seconds using the best open-source models.

3
Download & share

Copy, download, or share your results. Suitable for collaborative and commercial work.

Call this tool via the API

Call this tool programmatically from your own code. OpenAI-compatible REST endpoint, Bearer-token auth, no extra SDK required. Token pricing matches the web interface.

curl -X POST https://api.free.ai/v1/chat/ \
  -H "Authorization: Bearer sk-free-..." \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen7b", "messages": [{"role": "user", "content": "Use the arXiv PDF Extractor tool on: ..."}]}'

arXiv PDF Extractor — FAQ

Drop in an arXiv preprint and the AI converts the whole PDF to clean, LaTeX-flavored text. Equations come back as proper LaTeX, multi-column layouts are unrolled, and references stay intact. It is built on Meta Nougat, which was trained specifically on millions of arXiv papers.

Nougat's training corpus was arXiv preprints — so it absolutely shines on the IEEE / ACM / NeurIPS / ICML / arXiv layout family. Other PDF extractors choke on multi-column mathematics; this one was designed for it.

Download the PDF from arXiv (for example, arxiv.org/pdf/2401.12345), upload it here, and you get a single .txt file containing the full paper as LaTeX-flavored text. No arXiv API key is needed; we only need the PDF.

Equations are preserved; that is the headline feature. Inline math is emitted as `$...$` and display math as `$$...$$`. Even raster-rendered equations in older papers can come out correctly, because the model processes every page as an image.
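As an illustration of the two delimiters, the following is a minimal sketch of separating inline and display math in the extracted text. The function name `split_math` and the sample string are hypothetical, and the regular expressions are a simplification that does not handle escaped dollar signs (`\$`).

```python
import re

# Display math is matched first so "$$...$$" is not misread as inline spans.
DISPLAY_MATH = re.compile(r"\$\$(.+?)\$\$", re.DOTALL)
# Inline math: a single pair of dollar signs on one line.
INLINE_MATH = re.compile(r"\$([^$\n]+)\$")


def split_math(text: str) -> dict[str, list[str]]:
    """Return the display and inline math spans found in extracted text."""
    display = [m.strip() for m in DISPLAY_MATH.findall(text)]
    # Blank out display blocks before scanning for inline math.
    remainder = DISPLAY_MATH.sub(" ", text)
    inline = [m.strip() for m in INLINE_MATH.findall(remainder)]
    return {"display": display, "inline": inline}


sample = (
    "Energy is $E = mc^2$. The loss is\n"
    "$$L = \\sum_i (y_i - f(x_i))^2$$\n"
    "where $f$ is the model."
)
result = split_math(sample)
print(result["inline"])   # ['E = mc^2', 'f']
print(result["display"])  # ['L = \\sum_i (y_i - f(x_i))^2']
```

A split like this can be handy when copying only the equations into another LaTeX project.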

Multi-column layouts are handled directly. The two-column IEEE format is the most common arXiv layout, and Nougat unrolls it into the correct reading order without any layout hints.

Citations are not lost: `[12]` / `[Smith2020]` markers stay where they are, and the full reference list at the end is extracted in full for use with BibTeX / Zotero.
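Because the markers stay in place, they can be collected from the output with a short script. This is a sketch only: the pattern below assumes markers look like `[12]`, `[12, 3]`, or `[Smith2020]`, and the function name `citation_counts` is hypothetical. Other citation styles would need a different pattern.

```python
import re
from collections import Counter

# Matches bracketed markers such as [12], [12, 3] or [Smith2020].
CITATION = re.compile(
    r"\[((?:\d+(?:\s*,\s*\d+)*)|(?:[A-Z][A-Za-z]+\d{4}[a-z]?))\]"
)


def citation_counts(text: str) -> Counter:
    """Count how often each citation key appears in the extracted text."""
    counts = Counter()
    for match in CITATION.finditer(text):
        # A marker like [12, 3] contributes one count to each key.
        for key in re.split(r"\s*,\s*", match.group(1)):
            counts[key] += 1
    return counts


sample = "Transformers [12] extend earlier work [Smith2020]. See also [12, 3]."
counts = citation_counts(sample)
print(counts["12"], counts["Smith2020"], counts["3"])  # 2 1 1
```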

About 8-15 seconds per page. A 12-page conference paper takes about 2-3 minutes; NeurIPS-style papers of 30+ pages with appendices take 8-12 minutes.

300 tokens per page, with a floor of 600. Most arXiv conference papers (8-15 pages) cost 2,400-4,500 tokens. The daily free pool covers about 1-2 papers per day on the free tier; paid accounts are unlimited.
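The pricing and timing figures above can be turned into a quick estimate. This is a minimal sketch that assumes cost and time scale linearly with page count; the function names are hypothetical and the service's actual billing may differ in details not stated here.

```python
TOKENS_PER_PAGE = 300   # per-page rate quoted above
MIN_TOKENS = 600        # floor quoted above
SECONDS_PER_PAGE = (8, 15)  # quoted processing range per page


def token_cost(pages: int) -> int:
    """Estimated token cost: per-page rate with a minimum charge."""
    if pages < 1:
        raise ValueError("pages must be at least 1")
    return max(MIN_TOKENS, TOKENS_PER_PAGE * pages)


def processing_seconds(pages: int) -> tuple[int, int]:
    """Estimated (low, high) processing time in seconds."""
    low, high = SECONDS_PER_PAGE
    return (low * pages, high * pages)


print(token_cost(1))           # 600 (the floor applies)
print(token_cost(8))           # 2400
print(token_cost(15))          # 4500
print(processing_seconds(12))  # (96, 180), roughly 2-3 minutes
```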

Feed the output to ChatGPT / Claude for "explain this paper", build your own personal RAG over saved papers, run full-text search across your reading list, copy equations into your own LaTeX project, or read the paper as plain text on your phone.

Scanned PDFs work too, since Nougat runs OCR internally. arXiv has been LaTeX-rendered for 25+ years, so most preprints already have clean digital text. Older scanned papers work, but math accuracy drops; rescan at 300+ DPI for best results.

PDFs are deleted immediately after extraction. The LaTeX output is kept for 24 hours (anonymous) or 7 days (paid share link). It is not used for training. arXiv PDFs are publicly available anyway (many under CC-BY), but we do not retain them either way.

An API is available: POST a multipart `file` to /v1/document/academic-pdf/. The JSON response contains `text_url`, `pages`, `preview`, `tokens`, and `share_url`. Bearer auth (sk-free-…) comes with 10K free tokens per month. See /api/ for a curl example.
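For readers who prefer Python to curl, the upload described above might be sketched as follows using only the standard library. Several things here are assumptions rather than documented facts: that the host `api.free.ai` from the curl example also serves this path, that the upload field is named `file`, and the helper name `build_request`. The sketch builds the request without sending it, and the API key stays the elided placeholder.

```python
import urllib.request
import uuid

# Assumption: same host as the curl example on this page.
ENDPOINT = "https://api.free.ai/v1/document/academic-pdf/"


def build_request(pdf_bytes: bytes, filename: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST carrying one PDF."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        # Assumption: the multipart field is named "file".
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(ENDPOINT, data=body, headers=headers, method="POST")


# Placeholder bytes stand in for a real PDF read from disk.
req = build_request(b"%PDF-1.4 placeholder", "paper.pdf", "sk-free-...")
print(req.get_method(), req.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(req)` followed by parsing the JSON body; the response fields should be checked against the live service before relying on them.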
