OCR — Image to Text Extractor
Drop an image and pull the text out of it — English, Portuguese, or Spanish — without uploading anywhere.
What this tool does
Optical Character Recognition (OCR) turns text inside an image — a screenshot, a photographed contract, a whiteboard snap, a scanned receipt, a book page — back into selectable, copyable, searchable text. Drop the image, pick the language, and the recognized text appears in seconds, ready to paste into your document or notes.

The image and the recognized text never leave your device — there is no upload, no copy of your file held on a third-party server, no logging. That privacy guarantee matters because the documents people most often run through OCR are exactly the ones you should not paste into a random online tool: IDs, passports, contracts, medical forms, payslips, tax letters, screenshots of internal apps.

Pick the language that matches your image (English, Portuguese, or Spanish) — recognition accuracy drops sharply when the wrong model is used. The output is editable in place, so you can correct the classic OCR confusions (0 vs O, 1 vs l vs I, m vs rn) before copying or downloading. Optionally enable per-word confidence, which tags each word with how certain the engine is — handy for quickly spotting which parts of a low-quality scan still need a human eye.
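The per-word confidence idea can be sketched in a few lines. This is a minimal illustration, not this tool's actual code: the `words` structure (text plus a 0-100 confidence score) and the threshold are assumptions about what a Tesseract-style engine reports.

```python
LOW_CONFIDENCE = 60  # assumed threshold below which a word deserves a human look

def flag_uncertain(words):
    """Join the recognized text and collect words worth reviewing."""
    text = " ".join(w["text"] for w in words)
    review = [w["text"] for w in words if w["conf"] < LOW_CONFIDENCE]
    return text, review

# Hypothetical per-word results from an OCR engine.
words = [
    {"text": "Invoice", "conf": 96},
    {"text": "t0tal:", "conf": 41},   # classic 0-vs-O confusion, low score
    {"text": "$120.00", "conf": 88},
]
text, review = flag_uncertain(words)
# review -> ["t0tal:"]
```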
How to use it
- Drop the image — Screenshots and clean scans work best. Photos of documents work too if the lighting is even and the camera is held straight.
- Pick the language — Match the language of the text in the image. Each model is downloaded once and cached. Mismatched models give nonsense.
- Extract — Click Extract text. First run downloads the engine and the language model — subsequent runs of the same language are fast.
- Edit, copy, download — The output box is editable. Fix any errors, then copy or download as a .txt file.
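The "downloaded once and cached" behavior in the language step follows a common pattern, sketched below. This is illustrative only: `fetch_model`, the cache directory, and the `.traineddata` file name are assumptions, not this tool's real storage layout.

```python
import os
import tempfile

CACHE_DIR = os.path.join(tempfile.gettempdir(), "ocr-models")

def fetch_model(lang: str) -> bytes:
    # Stand-in for the real network download of a language model.
    return f"model-bytes-for-{lang}".encode()

def load_model(lang: str) -> bytes:
    """First call for a language downloads; later calls hit the cache."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, f"{lang}.traineddata")
    if not os.path.exists(path):          # first run: slow (download)
        with open(path, "wb") as f:
            f.write(fetch_model(lang))
    with open(path, "rb") as f:           # subsequent runs: fast (cache hit)
        return f.read()
```

This is why the first extraction for a given language is slow and the rest are fast: only the cache-miss branch pays the download cost.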
How OCR works (in 200 words)
Modern OCR works in five steps. First the image is binarized — turned into pure black-and-white so the engine can tell ink from background regardless of paper color or shadow. Second, connected pixels are grouped into shapes, then into words and lines following the natural reading flow of the page. Third, each word is segmented into individual character candidates. Fourth, those candidates are fed through a neural network trained specifically on the chosen language, which is why picking the right language matters so much: the same letterform can be the most likely match in English and a different letter entirely in Portuguese or Spanish. Fifth, a language model looks at the whole word in context and picks the most plausible reading from a dictionary of common forms — that is what catches confusions like ofice being silently corrected to office. The per-word confidence score is the engine's own self-reported certainty for each word; very high scores are almost always correct, low scores are where you should glance at the original.
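The fifth step, snapping an uncertain reading to a dictionary of common forms, can be sketched with the standard library's fuzzy matcher. Real engines use far richer language models; `difflib` and this tiny word list stand in purely for illustration.

```python
import difflib

# Toy stand-in for the engine's dictionary of common word forms.
DICTIONARY = ["office", "officer", "of", "ice", "offence"]

def snap_to_dictionary(word: str) -> str:
    """Return the closest dictionary word, or the input if nothing is close."""
    matches = difflib.get_close_matches(word, DICTIONARY, n=1, cutoff=0.8)
    return matches[0] if matches else word

snap_to_dictionary("ofice")  # the raw classifier output -> "office"
```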
What works well, what doesn't
- Great: clean PDF screenshots, well-lit scans of typed pages, screen captures of articles, printed book pages photographed straight on.
- OK: photographed printed pages with even lighting, slightly skewed scans (under 5°), receipts in good shape, signage shot at moderate angles.
- Poor: handwriting (the engine is trained on print, not cursive), heavily rotated or warped pages, low-light photos, very compressed JPEGs full of noise, decorative or stylized fonts, very small text (under about 10 pixels tall).

For tough images, increase the resolution before running OCR — sharp, well-lit pixels matter much more than file size, and a 1500-pixel-wide crop usually beats a blurry 4K original.
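The resolution advice can be made concrete with a toy upscaler. Upscaling cannot invent detail, but OCR engines tend to do better once glyphs are above a minimum pixel height, so enlarging small-but-sharp text often helps. The list-of-lists "image" below stands in for a real bitmap; any real pipeline would use an image library instead.

```python
def upscale2x(pixels):
    """Nearest-neighbour 2x upscale of a grayscale pixel grid."""
    out = []
    for row in pixels:
        wide = [p for p in row for _ in (0, 1)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                   # duplicate each row
    return out

tiny = [[0, 255],
        [255, 0]]
upscale2x(tiny)
# -> [[0, 0, 255, 255], [0, 0, 255, 255], [255, 255, 0, 0], [255, 255, 0, 0]]
```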