
PDF OCR

Turn a scanned PDF into selectable, searchable text. Rasterizes each page at 200 DPI and runs OCR — pick Tesseract (fast, free), GOT-OCR2 (better for multi-column + tables), or premium fal OCR (best for degraded / historic scans). Output as .txt, .docx, or page-labelled text. Got an image, not a PDF? Use /ocr/ →


PDF up to 50 MB. Works on scanned + image-only PDFs. Already text-selectable? You do not need OCR.

Tesseract for clean printed scans. GOT-OCR2 for multi-column papers + Asian scripts. Premium only when every pixel matters.
Language hint speeds Tesseract 2-3× and boosts accuracy 30-50% on non-Latin scripts.

When to use PDF OCR

Invoices + receipts

Turn a stack of scanned invoices into editable text for bookkeeping, reimbursements, or AP automation.

Old books + archives

Digitize out-of-print books, historic documents, or library archives into searchable text.

Legal + medical records

Unlock text inside scanned case files, medical reports, or faxed documents for discovery + eDiscovery workflows.

Before translation

Run OCR to extract text, then feed the result into /pdf/translator/ for any-to-any language conversion.

Free.ai PDF OCR vs the paid tools

                  Free.ai   Adobe Acrobat Pro   ABBYY FineReader   Smallpdf
Starting price    Free      $19.99/mo           $199 one-time      $9/mo
Languages         100+      40+                 200+               20+

Free.ai also lets you pick among 3 OCR engines, offers a public API, and works without signup.

Turn scanned PDFs into selectable, searchable text. Tesseract, GOT-OCR2, and premium OCR engines. 99 languages. Free.

How to use PDF OCR

1
Enter your input

Type text, upload a file, or describe what you want. No account needed.

2
Click Download

Our AI processes your request in seconds using the best open-source models.

3
Download and share

Download, copy, or share the results. Free for personal and business use.

Use this tool via API

Automate this tool from your own code. OpenAI-compatible REST endpoint, Bearer-token auth, no extra SDK required. Token costs match the web interface.

curl -X POST https://api.free.ai/v1/pdf/ocr/ \
  -H "Authorization: Bearer sk-free-..." \
  -F "file=@document.pdf" \
  -F "language=eng"

PDF OCR — FAQ

What does PDF OCR do?

It turns scanned PDFs (image-only, unselectable text) into searchable, copyable, AI-readable text. Upload a PDF of scanned pages (invoices, contracts, old books, handwritten notes) and get back plain text, .docx, or page-numbered text.

How much does it cost?

About 500 tokens per page on our self-hosted Tesseract OCR (free tier). GOT-OCR2 costs about 800 tokens per page and handles complex layouts (multi-column, tables, equations) better. Premium fal OCR runs about 2,500 tokens per page for the highest accuracy on degraded scans. All three engines have an 800-token minimum per upload, so a 1-page job still covers its fixed overhead.
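The per-page rates and the 800-token minimum reduce to a one-line formula. The sketch below is a rough estimator only: it assumes the approximate rates quoted above are exact and that the minimum applies once per upload. The engine keys (`tesseract`, `got-ocr2`, `premium`) are illustrative names, not confirmed API values. The exact cost is the one shown after you upload a PDF.

```python
# Approximate per-page token rates quoted in the FAQ (treated as exact here).
TOKENS_PER_PAGE = {
    "tesseract": 500,
    "got-ocr2": 800,
    "premium": 2500,
}
MIN_TOKENS = 800  # minimum charge, assumed to apply once per upload


def estimate_tokens(pages: int, engine: str = "tesseract") -> int:
    """Return a rough token estimate for running OCR on `pages` pages."""
    if pages < 1:
        raise ValueError("pages must be at least 1")
    return max(MIN_TOKENS, pages * TOKENS_PER_PAGE[engine])


print(estimate_tokens(1))               # 800 (the minimum applies)
print(estimate_tokens(50, "got-ocr2"))  # 40000
```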

Which OCR engine should I pick?

Tesseract (default, free) for clean printed scans. GOT-OCR2 (free, slower) for multi-column research papers, scientific equations, or Asian-language documents, because it has a much better layout model. Premium fal OCR only for degraded, low-resolution, or historic scans where every pixel matters.
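If you choose the engine from a script, the guidance above can be written as a small decision rule. This is a sketch of that rule of thumb, not the site's own selection logic. The flag names and the returned engine keys are hypothetical.

```python
def pick_engine(*, degraded: bool = False, complex_layout: bool = False,
                asian_script: bool = False) -> str:
    """Suggest an OCR engine from coarse document traits (hypothetical keys)."""
    if degraded:
        # Degraded, low-resolution, or historic scans: premium engine.
        return "premium"
    if complex_layout or asian_script:
        # Multi-column papers, equations, tables, or Asian-language documents.
        return "got-ocr2"
    # Clean printed scans: the default free engine.
    return "tesseract"


print(pick_engine())                     # tesseract
print(pick_engine(complex_layout=True))  # got-ocr2
```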

Which languages are supported?

Tesseract supports 100+ languages, including Chinese, Japanese, Korean, Arabic, Hebrew, Hindi, Thai, and Vietnamese, plus Latin-script and Cyrillic-script languages. Pass a language hint (eng/spa/fra/deu/jpn/chi_sim/ara/etc.) for 30-50% better accuracy. Auto-detect works but is slower.

What output formats can I get?

Three: .txt (plain, one file for the whole document), .docx (Word with page breaks preserved), and page-wise .txt (===== Page N ===== headers between pages so you can find a specific page in the output).
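The page-wise format is easy to split back into pages in your own code. The sketch below assumes each header sits on its own line and looks exactly like `===== Page N =====`, as described above. The real output may differ in spacing or the number of `=` characters, so the pattern is kept loose.

```python
import re

# One header per page, e.g. "===== Page 3 =====" on its own line (assumed format).
PAGE_HEADER = re.compile(r"^=+ Page (\d+) =+[ \t]*$", re.MULTILINE)


def split_pages(text: str) -> dict[int, str]:
    """Map page number -> page text for page-labelled OCR output."""
    pages = {}
    matches = list(PAGE_HEADER.finditer(text))
    for i, match in enumerate(matches):
        # A page's text runs from the end of its header to the next header.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        pages[int(match.group(1))] = text[match.end():end].strip()
    return pages


sample = "===== Page 1 =====\nInvoice 001\n===== Page 2 =====\nTotal due\n"
print(split_pages(sample)[2])  # Total due
```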

How accurate is the OCR?

Accuracy is 99%+ on printed Latin script in clean 300-DPI scans. It drops to 90-95% on 150-DPI scans, faded documents, or handwriting. Non-Latin scripts (Chinese, Arabic, Hindi) do much better with a language hint and GOT-OCR2 rather than Tesseract.

Does it work on handwriting?

Handwriting OCR is much harder than printed text: expect 70-85% accuracy even on the premium OCR. For clean hand-printed forms (not cursive) on white paper, Tesseract is usable. For cursive or messy notes, premium fal OCR with the handwriting mode is more reliable.

How long does it take?

About 1-3 seconds per page on Tesseract, 3-5 seconds on GOT-OCR2, 5-10 seconds on premium fal. A 50-page scanned book takes 1-3 minutes. The heavy-task banner lets you close the tab and get the result emailed when it finishes.
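The per-page timings scale linearly if pages are processed one after another. The sketch below multiplies the quoted ranges by the page count. It assumes sequential processing with no queueing or upload time, and the engine keys are illustrative names.

```python
# (fastest, slowest) seconds per page, as quoted above.
SECONDS_PER_PAGE = {
    "tesseract": (1, 3),
    "got-ocr2": (3, 5),
    "premium": (5, 10),
}


def estimate_seconds(pages: int, engine: str = "tesseract") -> tuple[int, int]:
    """Return a (low, high) processing-time estimate in seconds."""
    low, high = SECONDS_PER_PAGE[engine]
    return pages * low, pages * high


low, high = estimate_seconds(50)
print(f"{low / 60:.1f}-{high / 60:.1f} min")  # 0.8-2.5 min
```

For a 50-page book on Tesseract this gives 50-150 seconds, in line with the 1-3 minute figure above.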

Do you keep my uploaded PDFs?

No. Uploaded PDFs are rasterized, OCRed page by page, then deleted. The output file is kept on our CDN for 24 hours (7 days on a paid plan) at the share link.

Can I OCR a password-protected PDF?

Not directly. Unlock it at /pdf/unlock/ first (if you have the password), then come back. We detect encryption on upload and link you to the unlock tool.

How does Free.ai compare to the paid tools?

Free.ai is free, offers 3 OCR engines (pick by doc type), supports 99 languages, and ships a public API. The paid alternatives are Adobe Acrobat Pro ($19.99/mo, 1 engine), ABBYY FineReader ($199 one-time, no API), and Smallpdf ($9/mo, 1 engine). Our OCR quality on clean scans matches Adobe; on historic scans premium fal rivals ABBYY.

Is there an API?

Yes. POST multipart to /v1/pdf/ocr/ with file, language, model, and output_format fields. Quote endpoint: /v1/pdf/ocr-quote/?pages=N&model=M. Full curl recipe at /api/.
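The two endpoints above can be called from Python with the standard library alone. The sketch below only builds the requests and does not send them. The host `https://api.free.ai` is taken from the curl example earlier on this page. The values `tesseract` and `txt` for `model` and `output_format` are hypothetical placeholders, since accepted values are not listed here. The response format is also not documented here, so nothing is parsed.

```python
import urllib.request
import uuid
from urllib.parse import urlencode

API_BASE = "https://api.free.ai"  # assumed from the curl example on this page


def quote_url(pages: int, model: str) -> str:
    """Build the quote URL for a page count and engine."""
    return f"{API_BASE}/v1/pdf/ocr-quote/?{urlencode({'pages': pages, 'model': model})}"


def build_ocr_request(pdf_bytes: bytes, filename: str, token: str,
                      language: str = "eng", model: str = "tesseract",
                      output_format: str = "txt") -> urllib.request.Request:
    """Build (but do not send) a multipart POST to /v1/pdf/ocr/."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields named in the FAQ; the default values are placeholders.
    for name, value in (("language", language), ("model", model),
                        ("output_format", output_format)):
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    # The PDF itself goes in the "file" field.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: application/pdf\r\n\r\n'.encode()
        + pdf_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        f"{API_BASE}/v1/pdf/ocr/",
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


req = build_ocr_request(b"%PDF-1.4 ...", "document.pdf", "sk-free-...")
print(req.get_method(), req.full_url)
print(quote_url(12, "tesseract"))
```

To send the request, pass `req` to `urllib.request.urlopen` and read the response body. Checking the quote endpoint first avoids spending tokens on a larger job than intended.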
