Wikisource:OCR - Wikisource
See also: Wikisource:ProofreadPage#Text layer extraction from djvu/pdf file
This observatory of OCR systems lists known optical character recognition (OCR) systems that could be useful to Wikimedians. All systems, whether open, free, or paid, are relevant and may be listed and documented below. If you have used an effective OCR system, please list it below (optionally with some comments).
Commons.js
Wikisource:Google OCR (old)
Wikisource:Tesseract OCR (new)
Extension
Section to expand.
mw:Help:Extension:Wikisource/Wikimedia OCR. Based on Wikimedia's Google OCR and Tesseract OCR cited above.
Free
Online and free
Section to expand.
Wikimedia. Based on Wikimedia's Google OCR and Tesseract OCR cited above. Accepts image input only (no PDF).
Other free systems
Kraken ("optimized for historical and non-Latin script material")
models for 17th-century French, see [1]
catalog of several training sets for various languages and types of documents:
Paid systems
Section to expand.