PDF OCR

Turn a scanned PDF into selectable, searchable text. The tool rasterizes each page at 200 DPI and runs OCR with the engine you pick: Tesseract (fast, free), GOT-OCR2 (better for multi-column layouts and tables), or premium fal OCR (best for degraded or historic scans). Output is available as .txt, .docx, or page-labelled text. Got an image rather than a PDF? Use /ocr/ instead.

PDF up to 50 MB. Works on scanned + image-only PDFs. Already text-selectable? You do not need OCR.

Tesseract for clean printed scans. GOT-OCR2 for multi-column papers + Asian scripts. Premium only when every pixel matters.
Language hint speeds Tesseract 2-3× and boosts accuracy 30-50% on non-Latin scripts.

When to use PDF OCR

Invoices + receipts

Turn a stack of scanned invoices into editable text for bookkeeping, reimbursements, or AP automation.

Old books + archives

Digitize out-of-print books, historic documents, or library archives into searchable text.

Legal + medical records

Unlock text inside scanned case files, medical reports, or faxed documents for discovery + eDiscovery workflows.

Before translation

Run OCR to extract text, then feed the result into /pdf/translator/ for any-to-any language conversion.

Free.ai PDF OCR vs the paid tools

                  Free.ai   Adobe Acrobat Pro   ABBYY FineReader   Smallpdf
Starting price    Free      $19.99/mo           $199 one-time      $9/mo
Languages         100+      40+                 200+               20+

Free.ai also lets you pick among 3 OCR engines, offers a public API, and works without signup.

Turn scanned PDFs into selectable, searchable text. Tesseract, GOT-OCR2, and premium OCR engines. 99 languages. Free.

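How to use PDF OCR

1. Enter your input

Type text, upload a file, or describe what you want. No account required.

2. Click Generate

Our AI processes your request in seconds using the best open-source models.

3. Download and share

Download, copy, or share the result. Free for personal and commercial use.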

Use this tool via API

Automate this tool from your own code. OpenAI-compatible REST endpoint, Bearer-token auth, no extra SDK required. Token costs match the web interface.

curl -X POST https://api.free.ai/v1/pdf/ocr/ \
  -H "Authorization: Bearer sk-free-..." \
  -F "file=@document.pdf"

PDF OCR — FAQ

What does PDF OCR do?

It turns scanned PDFs (image-only, unselectable text) into searchable, copyable, AI-readable text. Upload a PDF of scanned pages, such as invoices, contracts, old books, or handwritten notes, and get back plain text, .docx, or page-numbered text.

How many tokens does it cost?

About 500 tokens per page on our self-hosted Tesseract OCR (free tier). GOT-OCR2 costs about 800 tokens per page and handles complex layouts (multi-column, tables, equations) better. Premium fal OCR runs about 2,500 tokens per page for the highest accuracy on degraded scans. All three engines have an 800-token minimum so that 1-page uploads are not pure overhead.
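The per-page rates and the 800-token minimum can be turned into a quick estimate. This is a minimal sketch, not the site's billing code: the model keys (`tesseract`, `got-ocr2`, `fal`) are hypothetical labels, the rates are the approximate figures quoted above, and the minimum is assumed to apply once per upload.

```python
# Approximate per-page token rates quoted in the FAQ; real quotes may differ.
TOKENS_PER_PAGE = {"tesseract": 500, "got-ocr2": 800, "fal": 2500}

# Minimum charge, assumed here to apply once per upload.
MIN_TOKENS = 800


def estimate_tokens(pages: int, model: str = "tesseract") -> int:
    """Return a rough token estimate for OCRing `pages` pages with `model`."""
    if pages < 1:
        raise ValueError("pages must be at least 1")
    return max(MIN_TOKENS, pages * TOKENS_PER_PAGE[model])
```

Under these assumptions a 1-page Tesseract job is charged the 800-token minimum rather than 500, while a 2-page job costs 1,000. The upload screen or the quote endpoint remains the source for exact costs.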

Which OCR engine should I pick?

Tesseract (default, free) for clean printed scans. GOT-OCR2 (free, slower) for multi-column research papers, scientific equations, or Asian-language documents, because it has a much better layout model. Premium fal OCR only for degraded, low-resolution, or historic scans where every pixel matters.

Which languages are supported?

Tesseract supports 100+ languages, including Latin-script languages, Chinese, Japanese, Korean, Arabic, Cyrillic-script languages, Hebrew, Hindi, Thai, and Vietnamese. Pass a language hint (eng, spa, fra, deu, jpn, chi_sim, ara, etc.) for 30-50% better accuracy. Auto-detect works but is slower.

What output formats are available?

Three: .txt (plain, one file for the whole document), .docx (Word with page breaks preserved), and page-wise .txt (with ===== Page N ===== headers between pages so you can find a specific page in the output).
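The page-wise format is easy to split back into individual pages. The sketch below assumes each header sits on its own line and looks like `===== Page N =====`; the exact number of `=` characters and the surrounding whitespace are assumptions, and `split_pages` is a hypothetical helper name.

```python
import re

# Matches a header line such as "===== Page 3 =====" (exact layout assumed).
PAGE_HEADER = re.compile(r"^=+ Page (\d+) =+[ \t]*$", re.MULTILINE)


def split_pages(text: str) -> dict[int, str]:
    """Map each page number to the text that follows its header."""
    pages = {}
    matches = list(PAGE_HEADER.finditer(text))
    for i, match in enumerate(matches):
        # A page runs until the next header, or to the end of the file.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        pages[int(match.group(1))] = text[match.end():end].strip()
    return pages
```

For example, `split_pages(output)[12]` would return only the text of page 12, which is handy when checking one page of a long scan.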

How accurate is it?

On clean 300-DPI scans, accuracy is 99%+ on printed Latin script. It drops to 90-95% on 150-DPI scans, faded documents, or handwriting. Non-Latin scripts (Chinese, Arabic, Hindi) do much better with a language hint and GOT-OCR2 rather than Tesseract.

Does it work on handwriting?

Handwriting OCR is much harder than printed text, so expect 70-85% accuracy even on the premium OCR. For clean hand-printed forms (not cursive) on white paper, Tesseract is usable. For cursive or messy notes, premium fal OCR with the handwriting mode is more reliable.

How long does it take?

About 1-3 seconds per page on Tesseract, 3-5 seconds on GOT-OCR2, and 5-10 seconds on premium fal. A 50-page scanned book takes 1-3 minutes on Tesseract. The heavy-task banner lets you close the tab and get the result emailed when it finishes.
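The per-page timings scale linearly with page count, which gives a rough range for any document. This is a back-of-the-envelope sketch using the figures above; the model keys are hypothetical labels, and queueing or upload time is not included.

```python
# (low, high) seconds per page, taken from the approximate figures above.
SECONDS_PER_PAGE = {"tesseract": (1, 3), "got-ocr2": (3, 5), "fal": (5, 10)}


def estimate_seconds(pages: int, model: str = "tesseract") -> tuple[int, int]:
    """Return a (low, high) processing-time estimate in seconds."""
    low, high = SECONDS_PER_PAGE[model]
    return pages * low, pages * high
```

For a 50-page book on Tesseract this gives 50 to 150 seconds, which lines up with the 1-3 minute figure.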

Do you keep my uploaded PDFs?

No. Uploaded PDFs are rasterized, OCRed page by page, then deleted. The output file is kept on our CDN for 24 hours (7 days on a paid plan) at the share link.

Can I OCR a password-protected PDF?

Not directly. Unlock it at /pdf/unlock/ first (if you have the password), then come back. We detect encryption on upload and link you to the unlock tool.

How does Free.ai compare with the paid tools?

Free.ai is free, offers 3 OCR engines (pick by document type), supports 99 languages, and ships a public API. The paid alternatives are Adobe Acrobat Pro ($19.99/mo, 1 engine), ABBYY FineReader ($199 one-time, no API), and Smallpdf ($9/mo, 1 engine). Our OCR quality on clean scans matches Adobe; on historic scans premium fal rivals ABBYY.

Is there an API?

Yes. POST a multipart request to /v1/pdf/ocr/ with the file, language, model, and output_format fields. For a cost quote, call /v1/pdf/ocr-quote/?pages=N&model=M. The full curl recipe is at /api/.
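As an illustration of the multipart request described above, here is a minimal sketch that builds it with the Python standard library only. The four field names come from the answer above and the host comes from the curl example on this page. The field values (`eng`, `tesseract`, `txt`) and the helper name `build_ocr_request` are assumptions, and the response format is not documented here, so the sketch stops at building the request.

```python
import uuid
import urllib.request

# Host taken from the curl example on this page, path from the FAQ answer.
API_URL = "https://api.free.ai/v1/pdf/ocr/"


def build_ocr_request(pdf_bytes, api_key, language="eng", model="tesseract",
                      output_format="txt", filename="document.pdf"):
    """Build (but do not send) a multipart/form-data POST for the OCR endpoint.

    The default field values are illustrative guesses, not documented constants.
    """
    boundary = uuid.uuid4().hex
    fields = {"language": language, "model": model,
              "output_format": output_format}

    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    # The PDF itself goes in the "file" field as raw bytes.
    parts.append(
        (f'--{boundary}\r\n'
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         f'Content-Type: application/pdf\r\n\r\n').encode()
        + pdf_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and inspect the response body. Keeping construction separate from sending makes the request easy to check without spending tokens.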
