PDF OCR

Turn a scanned PDF into selectable, searchable text. Each page is rasterized at 200 DPI and run through OCR. Pick Tesseract (fast, free), GOT-OCR2 (better for multi-column layouts and tables), or premium fal OCR (best for degraded or historic scans). Output is available as .txt, .docx, or page-labelled text. Got an image, not a PDF? Use /ocr/ instead.

PDF up to 50 MB. Works on scanned + image-only PDFs. Already text-selectable? You do not need OCR.

Use Tesseract for clean printed scans and GOT-OCR2 for multi-column papers and Asian scripts. Reserve Premium for scans where every pixel matters.
A language hint speeds up Tesseract 2-3× and improves accuracy by 30-50% on non-Latin scripts.

When to use PDF OCR

Invoices + receipts

Turn a stack of scanned invoices into editable text for bookkeeping, reimbursements, or AP automation.

Old books + archives

Digitize out-of-print books, historic documents, or library archives into searchable text.

Legal + medical records

Unlock text inside scanned case files, medical reports, or faxed documents for discovery + eDiscovery workflows.

Before translation

Run OCR to extract text, then feed the result into /pdf/translator/ for any-to-any language conversion.

Free.ai PDF OCR vs the paid tools

Feature               Free.ai   Adobe Acrobat Pro   ABBYY FineReader   Smallpdf
Starting price        Free      $19.99/mo           $199 one-time      $9/mo
Pick OCR engine (3)
Languages             100+      40+                 200+               20+
Public API
Works without signup

Turn scanned PDFs into selectable, searchable text. Tesseract, GOT-OCR2, and premium OCR engines. 99 languages. Free.

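How to use PDF OCR

1
Enter your input

Type text, upload a file, or describe what you want. No account required.

2
Click Generate

Our AI processes your request in seconds using the best open-source models.

3
Download or share

Download, copy, or share your result. Free for personal and commercial use.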

Use this tool via API

Automate this tool from your own code. OpenAI-compatible REST endpoint, Bearer-token auth, no extra SDK required. Token costs match the web interface.

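# Multipart upload to the OCR endpoint named in the FAQ; -F sends the file itself.
# The model and output_format fields are optional here; their accepted values are not listed on this page.
curl -X POST https://api.free.ai/v1/pdf/ocr/ \
  -H "Authorization: Bearer sk-free-..." \
  -F "file=@document.pdf" \
  -F "language=eng"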

PDF OCR — FAQ

Turn scanned PDFs (image-only, unselectable text) into searchable, copyable, AI-readable text. Upload a PDF of scanned pages — invoices, contracts, old books, handwritten notes — and get back plain text, .docx, or page-numbered text.

OCR costs about 500 tokens per page on our self-hosted Tesseract engine (free tier). GOT-OCR2 costs about 800 tokens per page and handles complex layouts (multi-column, tables, equations) better. Premium fal OCR runs about 2,500 tokens per page for the highest accuracy on degraded scans. All three engines have an 800-token floor, so 1-page uploads are not pure overhead.
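The per-page rates and the floor above can be turned into a rough estimate. The sketch below is illustrative only: the engine labels (`tesseract`, `got-ocr2`, `premium`) and the function name are made up for this example, and it assumes the 800-token floor is a minimum charge per upload. The exact figure comes from the tool itself.

```python
# Approximate per-page token rates quoted in this FAQ.
# The dictionary keys are hypothetical labels, not official model identifiers.
TOKENS_PER_PAGE = {"tesseract": 500, "got-ocr2": 800, "premium": 2500}

# Assumption: the 800-token floor is a minimum charge per upload.
MIN_TOKENS = 800


def estimate_tokens(pages: int, engine: str = "tesseract") -> int:
    """Return a rough token estimate for running OCR on `pages` pages."""
    if pages < 1:
        raise ValueError("pages must be at least 1")
    return max(MIN_TOKENS, pages * TOKENS_PER_PAGE[engine])


if __name__ == "__main__":
    for engine in TOKENS_PER_PAGE:
        print(engine, estimate_tokens(1, engine), estimate_tokens(50, engine))
```

Under these assumptions the floor only changes the result for a single-page Tesseract job, where 500 tokens is raised to 800.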

Choose Tesseract (default, free) for clean printed scans. Choose GOT-OCR2 (free, slower) for multi-column research papers, scientific equations, or Asian-language documents, since it has a much better layout model. Use premium fal OCR only for degraded, low-resolution, or historic scans where every pixel matters.
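The guidance above amounts to a small decision rule. Here is one way to sketch it, with hypothetical trait flags and engine labels. The priority order when several traits apply (degraded scans first) is an assumption, not something this page states.

```python
def choose_engine(degraded: bool = False,
                  complex_layout: bool = False,
                  asian_script: bool = False) -> str:
    """Pick an engine label from coarse document traits.

    The labels and the precedence of the checks are assumptions for illustration.
    """
    if degraded:
        # Degraded, low-resolution, or historic scans: premium engine.
        return "premium"
    if complex_layout or asian_script:
        # Multi-column papers, equations, or Asian-language documents.
        return "got-ocr2"
    # Clean printed scans: the free default.
    return "tesseract"
```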

Tesseract supports 100+ languages, covering Latin-script and Cyrillic-script languages as well as Chinese, Japanese, Korean, Arabic, Hebrew, Hindi, Thai, and Vietnamese. Pass a language hint (eng, spa, fra, deu, jpn, chi_sim, ara, etc.) for 30-50% better accuracy; auto-detect works but is slower.
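The hint codes listed above follow Tesseract's language-code naming. A minimal lookup sketch, covering only the seven codes mentioned here, might look like this. The helper name is hypothetical, and whether the service accepts codes beyond these is not stated on this page.

```python
# Language name -> hint code, limited to the codes listed in the text above.
LANGUAGE_HINTS = {
    "english": "eng",
    "spanish": "spa",
    "french": "fra",
    "german": "deu",
    "japanese": "jpn",
    "simplified chinese": "chi_sim",
    "arabic": "ara",
}


def hint_for(language: str):
    """Return the hint code for a language name, or None if it is not in the table."""
    return LANGUAGE_HINTS.get(language.strip().lower())
```

Returning `None` for an unknown language leaves the caller free to fall back to auto-detect.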

Three output formats are available: .txt (plain, one file for the whole document), .docx (Word with page breaks preserved), and page-wise .txt (===== Page N ===== headers between pages so you can find a specific page in the output).
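The page-wise format is easy to post-process. The sketch below splits such a file into a dictionary keyed by page number. It assumes each header sits on its own line exactly as `===== Page N =====`; the function name and sample text are illustrative.

```python
import re

# Assumption: every page starts with a header line of exactly this shape.
PAGE_HEADER = re.compile(r"^===== Page (\d+) =====[ \t]*$", re.MULTILINE)


def split_pages(text: str) -> dict:
    """Map page number -> page text for page-wise OCR output."""
    pages = {}
    matches = list(PAGE_HEADER.finditer(text))
    for i, match in enumerate(matches):
        start = match.end()
        # A page runs until the next header, or to the end of the file.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        pages[int(match.group(1))] = text[start:end].strip()
    return pages


if __name__ == "__main__":
    sample = "===== Page 1 =====\nInvoice 001\n\n===== Page 2 =====\nTotal: 42\n"
    print(split_pages(sample))
```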

Accuracy is 99%+ on printed Latin script in clean 300-DPI scans. It drops to 90-95% on 150-DPI scans, faded documents, or handwriting. Non-Latin scripts (Chinese, Arabic, Hindi) do much better with a language hint and GOT-OCR2 rather than Tesseract.

Handwriting OCR is much harder than printed text — expect 70-85% accuracy even on the premium OCR. For clean hand-printed forms (not cursive) on white paper, Tesseract is usable. For cursive / messy notes, premium fal OCR with the handwriting mode is more reliable.

Processing takes about 1-3 seconds per page on Tesseract, 3-5 seconds on GOT-OCR2, and 5-10 seconds on premium fal. A 50-page scanned book takes 1-3 minutes; the heavy-task banner lets you close the tab and get the result emailed when it finishes.
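As a worked example of the per-page timings above, the following sketch multiplies them out for a whole document. The engine labels and function name are hypothetical, and it assumes pages are processed one after another with no queueing or upload time.

```python
# Per-page processing time in seconds as (low, high), from the figures above.
# The keys are hypothetical labels, not official model identifiers.
SECONDS_PER_PAGE = {"tesseract": (1, 3), "got-ocr2": (3, 5), "premium": (5, 10)}


def estimate_seconds(pages: int, engine: str = "tesseract") -> tuple:
    """Return a (low, high) estimate of total processing time in seconds."""
    low, high = SECONDS_PER_PAGE[engine]
    return pages * low, pages * high


if __name__ == "__main__":
    for engine in SECONDS_PER_PAGE:
        print(engine, estimate_seconds(50, engine))
```

For 50 pages on Tesseract this gives 50 to 150 seconds, in line with the 1-3 minute figure above; the slower engines take proportionally longer.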

Uploaded PDFs are not kept: they are rasterized, OCRed page by page, then deleted. The output file is kept on our CDN for 24 hours (7 days on a paid plan) at the share link.

Password-protected PDFs cannot be processed directly. Unlock the file at /pdf/unlock/ first (if you have the password), then come back. We detect encryption on upload and link you to the unlock tool.

Free.ai is free, offers 3 OCR engines (pick by document type), supports 99 languages, and ships a public API. The paid alternatives are Adobe Acrobat Pro ($19.99/mo, 1 engine), ABBYY FineReader ($199 one-time, no API), and Smallpdf ($9/mo, 1 engine). Our OCR quality on clean scans matches Adobe; on historic scans premium fal rivals ABBYY.

An API is available: POST multipart to /v1/pdf/ocr/ with file, language, model, and output_format fields. The quote endpoint is /v1/pdf/ocr-quote/?pages=N&model=M. A full curl recipe is at /api/.
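To show how the quote endpoint's query string fits together, here is a small sketch that only builds the URL and makes no network call. It assumes the quote endpoint lives on the same host as the curl example on this page (`api.free.ai`); the model value passed in is a placeholder, since accepted model identifiers are not listed here.

```python
from urllib.parse import urlencode

# Assumption: same host as the curl example on this page.
API_BASE = "https://api.free.ai"


def build_quote_url(pages: int, model: str) -> str:
    """Build the quote URL for a page count and a model identifier."""
    query = urlencode({"pages": pages, "model": model})
    return f"{API_BASE}/v1/pdf/ocr-quote/?{query}"


if __name__ == "__main__":
    # "tesseract" is a hypothetical model identifier used for illustration.
    print(build_quote_url(12, "tesseract"))
```

Using `urlencode` rather than string concatenation keeps the query valid if a model identifier ever contains characters that need escaping.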
