<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Leptonica Library on Producthunt daily</title>
        <link>https://producthunt.programnotes.cn/en/tags/leptonica-library/</link>
        <description>Recent content in Leptonica Library on Producthunt daily</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <lastBuildDate>Thu, 11 Sep 2025 15:31:10 +0800</lastBuildDate><atom:link href="https://producthunt.programnotes.cn/en/tags/leptonica-library/index.xml" rel="self" type="application/rss+xml" /><item>
        <title>tesseract</title>
        <link>https://producthunt.programnotes.cn/en/p/tesseract/</link>
        <pubDate>Thu, 11 Sep 2025 15:31:10 +0800</pubDate>
        
        <guid>https://producthunt.programnotes.cn/en/p/tesseract/</guid>
        <description>&lt;img src="https://images.unsplash.com/photo-1636114673156-052a83459fc1?ixid=M3w0NjAwMjJ8MHwxfHJhbmRvbXx8fHx8fHx8fDE3NTc1NzU3Mzd8&amp;ixlib=rb-4.1.0" alt="Featured image of post tesseract" /&gt;&lt;h1 id=&#34;tesseract-ocrtesseract&#34;&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;tesseract-ocr/tesseract&lt;/a&gt;
&lt;/h1&gt;&lt;h1 id=&#34;tesseract-ocr&#34;&gt;Tesseract OCR
&lt;/h1&gt;&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://scan.coverity.com/projects/tesseract-ocr&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://scan.coverity.com/projects/tesseract-ocr/badge.svg&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;Coverity Scan Build Status&#34;
	
	
&gt;&lt;/a&gt;
&lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/security/code-scanning&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://github.com/tesseract-ocr/tesseract/workflows/CodeQL/badge.svg&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;CodeQL&#34;
	
	
&gt;&lt;/a&gt;
&lt;a class=&#34;link&#34; href=&#34;https://issues.oss-fuzz.com/issues?q=is:open%20title:tesseract-ocr&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/badge/oss--fuzz-fuzzing-brightgreen&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;OSS-Fuzz&#34;
	
	
&gt;&lt;/a&gt;
&lt;br&gt;
&lt;a class=&#34;link&#34; href=&#34;https://raw.githubusercontent.com/tesseract-ocr/tesseract/main/LICENSE&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/badge/license-Apache--2.0-blue.svg&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;GitHub license&#34;
	
	
&gt;&lt;/a&gt;
&lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/releases/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;img src=&#34;https://img.shields.io/badge/download-all%20releases-brightgreen.svg&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;Downloads&#34;
	
	
&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;table-of-contents&#34;&gt;Table of Contents
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;#tesseract-ocr&#34; &gt;Tesseract OCR&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;#about&#34; &gt;About&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;#brief-history&#34; &gt;Brief history&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;#installing-tesseract&#34; &gt;Installing Tesseract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;#running-tesseract&#34; &gt;Running Tesseract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;#for-developers&#34; &gt;For developers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;#support&#34; &gt;Support&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;#license&#34; &gt;License&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;#dependencies&#34; &gt;Dependencies&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;#latest-version-of-readme&#34; &gt;Latest Version of README&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;about&#34;&gt;About
&lt;/h2&gt;&lt;p&gt;This package contains an &lt;strong&gt;OCR engine&lt;/strong&gt; - &lt;code&gt;libtesseract&lt;/code&gt; and a &lt;strong&gt;command line program&lt;/strong&gt; - &lt;code&gt;tesseract&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Tesseract 4 adds a new neural net (LSTM) based &lt;a class=&#34;link&#34; href=&#34;https://en.wikipedia.org/wiki/Optical_character_recognition&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;OCR engine&lt;/a&gt; which is focused on line recognition, but also still supports the legacy Tesseract OCR engine of Tesseract 3, which works by recognizing character patterns. Compatibility with Tesseract 3 is enabled by using the Legacy OCR Engine mode (&lt;code&gt;--oem 0&lt;/code&gt;).
Legacy mode also needs &lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/Data-Files.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;traineddata&lt;/a&gt; files which support the legacy engine, for example those from the &lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tessdata&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;tessdata&lt;/a&gt; repository.&lt;/p&gt;
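&lt;p&gt;As a minimal sketch of selecting an engine mode, assume a hypothetical input file &lt;code&gt;scan.png&lt;/code&gt;, a hypothetical directory &lt;code&gt;/path/to/tessdata&lt;/code&gt; holding files from the tessdata repository, and English as the language. None of these names come from the original text.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Legacy engine only (Tesseract 3 behaviour).
# Assumes /path/to/tessdata contains eng.traineddata with legacy models.
tesseract scan.png out_legacy -l eng --oem 0 --tessdata-dir /path/to/tessdata

# LSTM engine only, using the default tessdata location.
tesseract scan.png out_lstm -l eng --oem 1
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Each command writes a plain text file named after the output base. If the selected traineddata file lacks legacy models, the first command is expected to fail instead of falling back to the LSTM engine.&lt;/p&gt;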
&lt;p&gt;Stefan Weil is the current lead developer. Ray Smith was the lead developer until 2018. The maintainer is Zdenko Podobny. For a list of contributors see &lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/blob/main/AUTHORS&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;AUTHORS&lt;/a&gt;
and GitHub&amp;rsquo;s log of &lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/graphs/contributors&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;contributors&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Tesseract has &lt;strong&gt;Unicode (UTF-8) support&lt;/strong&gt;, and can &lt;strong&gt;recognize &lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;more than 100 languages&lt;/a&gt;&lt;/strong&gt; &amp;ldquo;out of the box&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;Tesseract supports &lt;strong&gt;&lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/InputFormats&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;various image formats&lt;/a&gt;&lt;/strong&gt; including PNG, JPEG and TIFF.&lt;/p&gt;
&lt;p&gt;Tesseract supports &lt;strong&gt;various output formats&lt;/strong&gt;: plain text, hOCR (HTML), PDF, invisible-text-only PDF, TSV, ALTO and PAGE.&lt;/p&gt;
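&lt;p&gt;The output format is typically chosen by naming one or more config files after the output base. The following is a sketch under assumptions: &lt;code&gt;scan.png&lt;/code&gt; and the output base names are hypothetical, and the config files shown are the ones shipped with a standard installation.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Plain text is the default and writes out.txt.
tesseract scan.png out

# Several formats in one run: hOCR, searchable PDF, TSV and ALTO.
tesseract scan.png out hocr pdf tsv alto

# PDF that contains only the invisible text layer, without the page image.
tesseract scan.png out_textonly -c textonly_pdf=1 pdf
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Requesting several formats in one invocation runs recognition once and writes one file per format, which is usually cheaper than repeating the command for each format.&lt;/p&gt;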
&lt;p&gt;To get better OCR results, you will often need to &lt;strong&gt;&lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;improve the quality&lt;/a&gt; of the image&lt;/strong&gt; you are giving Tesseract.&lt;/p&gt;
&lt;p&gt;This project &lt;strong&gt;does not include a GUI application&lt;/strong&gt;. If you need one, please see the &lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/User-Projects-%E2%80%93-3rdParty.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;3rdParty&lt;/a&gt; documentation.&lt;/p&gt;
&lt;p&gt;Tesseract &lt;strong&gt;can be trained to recognize other languages&lt;/strong&gt;.
See &lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/Training-Tesseract.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Tesseract Training&lt;/a&gt; for more information.&lt;/p&gt;
&lt;h2 id=&#34;brief-history&#34;&gt;Brief history
&lt;/h2&gt;&lt;p&gt;Tesseract was originally developed at Hewlett-Packard Laboratories, Bristol, UK, and at Hewlett-Packard Co., Greeley, Colorado, USA, between 1985 and 1994, with some more changes made in 1996 to port it to Windows, and some C++izing in 1998. In 2005, Tesseract was open-sourced by HP. From 2006 until November 2018, it was developed by Google.&lt;/p&gt;
&lt;p&gt;Major version 5 is the current stable version and started with release
&lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/releases/tag/5.0.0&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;5.0.0&lt;/a&gt; on November 30, 2021. Newer minor versions and bugfix versions are available from
&lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/releases/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The latest source code is available from the &lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/tree/main&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;main branch on GitHub&lt;/a&gt;.
Open issues can be found in the &lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/issues&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;issue tracker&lt;/a&gt;
and the &lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/Planning.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;planning documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;See &lt;strong&gt;&lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/ReleaseNotes.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Release Notes&lt;/a&gt;&lt;/strong&gt;
and &lt;strong&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/blob/main/ChangeLog&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Change Log&lt;/a&gt;&lt;/strong&gt; for more details of the releases.&lt;/p&gt;
&lt;h2 id=&#34;installing-tesseract&#34;&gt;Installing Tesseract
&lt;/h2&gt;&lt;p&gt;You can either &lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/Installation.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;install Tesseract via a pre-built binary package&lt;/a&gt;
or &lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/Compiling.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;build it from source&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Before building Tesseract from source, please check that your system has one of the &lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/supported-compilers.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;supported compilers&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;running-tesseract&#34;&gt;Running Tesseract
&lt;/h2&gt;&lt;p&gt;Basic &lt;strong&gt;&lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/Command-Line-Usage.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;command line usage&lt;/a&gt;&lt;/strong&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;tesseract imagename outputbase [-l lang] [--oem ocrenginemode] [--psm pagesegmode] [configfiles...]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For more information about the various command line options use &lt;code&gt;tesseract --help&lt;/code&gt; or &lt;code&gt;man tesseract&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Examples can be found in the &lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/Command-Line-Usage.html#simplest-invocation-to-ocr-an-image&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;documentation&lt;/a&gt;.&lt;/p&gt;
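&lt;p&gt;A few common invocations are sketched here to make the synopsis above concrete. The file name &lt;code&gt;image.png&lt;/code&gt; and the output base &lt;code&gt;image_out&lt;/code&gt; are hypothetical, and the language examples assume the matching traineddata files are installed.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Simplest invocation: recognize English text and write image_out.txt.
tesseract image.png image_out

# Select a language, or combine several with a plus sign.
tesseract image.png image_out -l deu
tesseract image.png image_out -l eng+deu

# Page segmentation mode 6: assume a single uniform block of text.
tesseract image.png image_out --psm 6

# Print the recognized text to standard output instead of a file.
tesseract image.png stdout

# List the languages available to this installation.
tesseract --list-langs
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Options follow the image name and output base, and any config files come last, matching the order in the synopsis.&lt;/p&gt;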
&lt;h2 id=&#34;for-developers&#34;&gt;For developers
&lt;/h2&gt;&lt;p&gt;Developers can use the &lt;code&gt;libtesseract&lt;/code&gt; &lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/capi.h&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;C&lt;/a&gt; or
&lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/blob/main/include/tesseract/baseapi.h&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;C++&lt;/a&gt; API to build their own applications. If you need bindings to &lt;code&gt;libtesseract&lt;/code&gt; for other programming languages, please see the
&lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/AddOns.html#tesseract-wrappers&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;wrapper&lt;/a&gt; section in the AddOns documentation.&lt;/p&gt;
&lt;p&gt;Documentation of Tesseract, generated from the source code by Doxygen, can be found on &lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;tesseract-ocr.github.io&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;support&#34;&gt;Support
&lt;/h2&gt;&lt;p&gt;Before you submit an issue, please review &lt;strong&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/blob/main/CONTRIBUTING.md&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;the guidelines for this repository&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;For support, first read the &lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;documentation&lt;/a&gt;,
particularly the &lt;a class=&#34;link&#34; href=&#34;https://tesseract-ocr.github.io/tessdoc/FAQ.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;FAQ&lt;/a&gt; to see if your problem is addressed there.
If not, search the &lt;a class=&#34;link&#34; href=&#34;https://groups.google.com/g/tesseract-ocr&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Tesseract user forum&lt;/a&gt;, the &lt;a class=&#34;link&#34; href=&#34;https://groups.google.com/g/tesseract-dev&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Tesseract developer forum&lt;/a&gt; and &lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/issues&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;past issues&lt;/a&gt;. If you still can&amp;rsquo;t find what you need, ask for support on the mailing lists.&lt;/p&gt;
&lt;p&gt;Mailing lists:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://groups.google.com/g/tesseract-ocr&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;tesseract-ocr&lt;/a&gt; - For Tesseract users.&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://groups.google.com/g/tesseract-dev&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;tesseract-dev&lt;/a&gt; - For Tesseract developers.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Please open an issue only to report a &lt;strong&gt;bug&lt;/strong&gt;, not to ask questions.&lt;/p&gt;
&lt;h2 id=&#34;license&#34;&gt;License
&lt;/h2&gt;&lt;pre&gt;&lt;code&gt;The code in this repository is licensed under the Apache License, Version 2.0 (the &amp;quot;License&amp;quot;);
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an &amp;quot;AS IS&amp;quot; BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt;: This software depends on other packages that may be licensed under different open source licenses.&lt;/p&gt;
&lt;p&gt;Tesseract uses the &lt;a class=&#34;link&#34; href=&#34;http://leptonica.com/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Leptonica library&lt;/a&gt;, which essentially
uses a &lt;a class=&#34;link&#34; href=&#34;http://leptonica.com/about-the-license.html&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;BSD 2-clause license&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;dependencies&#34;&gt;Dependencies
&lt;/h2&gt;&lt;p&gt;Tesseract uses the &lt;a class=&#34;link&#34; href=&#34;https://github.com/DanBloomberg/leptonica&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Leptonica library&lt;/a&gt;
to open input images (that is, image files, not documents such as PDF).
Building Leptonica with built-in support for &lt;a class=&#34;link&#34; href=&#34;https://zlib.net&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;zlib&lt;/a&gt;,
&lt;a class=&#34;link&#34; href=&#34;https://sourceforge.net/projects/libpng&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;png&lt;/a&gt; and
&lt;a class=&#34;link&#34; href=&#34;http://www.simplesystems.org/libtiff&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;tiff&lt;/a&gt; (for multipage tiff).&lt;/p&gt;
&lt;h2 id=&#34;latest-version-of-readme&#34;&gt;Latest Version of README
&lt;/h2&gt;&lt;p&gt;For the latest online version of the README.md see:&lt;/p&gt;
&lt;p&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/tesseract-ocr/tesseract/blob/main/README.md&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://github.com/tesseract-ocr/tesseract/blob/main/README.md&lt;/a&gt;&lt;/p&gt;
</description>
        </item>
        
    </channel>
</rss>
