<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title>Tesseract - Tag - vo.rs</title><link>https://vo.rs/tags/tesseract/</link><description>Tesseract - Tag - vo.rs</description><generator>Hugo -- gohugo.io</generator><language>en</language><copyright>This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.</copyright><lastBuildDate>Mon, 05 Jan 2026 11:00:00 +0000</lastBuildDate><atom:link href="https://vo.rs/tags/tesseract/" rel="self" type="application/rss+xml"/><item><title>OCR Pipelines: Tesseract, PaddleOCR, and When to Use Which</title><link>https://vo.rs/story/ocr-pipelines-tesseract-paddleocr-and-when-to-use-which/</link><description>&lt;p&gt;I have a filing cabinet&amp;rsquo;s worth of scanned documents — receipts, manuals, the occasional important letter from an institution that still believes in paper — and a stubborn refusal to feed them through a cloud OCR service that charges per page and keeps a copy. So I run optical character recognition locally. The good news is that self-hosted OCR has quietly become excellent. The bad news is there are two serious contenders, they&amp;rsquo;re good at different things, and the internet is full of people insisting their favourite is universally best. It isn&amp;rsquo;t. Let me save you the afternoon I lost finding out.&lt;/p&gt;</description><pubDate>Mon, 05 Jan 2026 11:00:00 +0000</pubDate></item></channel></rss>