Optical Character Recognition for .NET

OCR.NET
OCR.NET Demo
Use the OCR.NET component to extract text from images, for example from scanned paper documents.
  • uses the Tesseract OCR engine and the Leptonica image processing library
  • available for .NET Framework 4.8 and .NET 8
  • source code included in the registered version
  • royalty-free distribution in applications

Download and order

Download Tesseract language data and place it in the tessdata folder.
Order OCR.NET component $100 USD (single developer license)
Order OCR.NET multi-license $300 USD (license for all developers in company)
Order OCR.NET year upgrades $50 USD (registered users only)
Order OCR.NET year upgrades multi-license $150 USD (registered multi-license users only)

FAQ

An unhandled exception of type 'System.BadImageFormatException' occurred in Winsoft.Ocr.dll
This is caused by loading an ocr.dll library whose bitness does not match your application, i.e. a 32-bit library in a 64-bit process or vice versa.
Select the x86 or x64 platform in Visual Studio to match the ocr.dll library, then rebuild your application.
Alternatively, place the matching ocr.dll in the folder that contains your application's exe file.
The 32-bit ocr.dll is located in the DLL\32bit subfolder.
The 64-bit ocr.dll is located in the DLL\64bit subfolder.
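
If it is unclear which copy of ocr.dll ended up next to the exe file, its bitness can be read from the file header. The following is a minimal sketch, not part of OCR.NET: a standalone Python script that reads the Machine field of the PE header. The function name dll_bitness and the example path are hypothetical.

```python
import struct

# Machine field values defined by the PE/COFF file format.
MACHINE_NAMES = {
    0x014C: "32-bit (x86)",
    0x8664: "64-bit (x64)",
    0xAA64: "64-bit (ARM64)",
}


def dll_bitness(path):
    """Return a readable architecture name for a PE file such as ocr.dll."""
    with open(path, "rb") as f:
        dos_header = f.read(0x40)
        if len(dos_header) < 0x40 or dos_header[:2] != b"MZ":
            raise ValueError("not a PE file: missing MZ signature")
        # The 4-byte value at offset 0x3C is the file offset of the PE signature.
        (pe_offset,) = struct.unpack_from("<I", dos_header, 0x3C)
        f.seek(pe_offset)
        pe_start = f.read(6)
    if len(pe_start) < 6 or pe_start[:4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 2-byte Machine field follows the 4-byte PE signature.
    (machine,) = struct.unpack_from("<H", pe_start, 4)
    return MACHINE_NAMES.get(machine, "unknown (0x%04X)" % machine)


# Example call (hypothetical path):
# print(dll_bitness(r"C:\MyApp\bin\ocr.dll"))
```

The reported bitness should match the platform the application is built for: x86 needs the 32-bit library, x64 needs the 64-bit one.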

How can I solve "Cannot initialize Tesseract library" error?
Set the ocr.DataPath property to the folder containing the Tesseract language data.
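
Before changing code, it can help to confirm that the folder really contains the language files. Tesseract language data files are named after the language code, for example eng.traineddata. The sketch below is a standalone Python check, not part of OCR.NET; the function name missing_languages and the example path are hypothetical.

```python
from pathlib import Path


def missing_languages(data_path, languages=("eng",)):
    """Return the language codes that have no <code>.traineddata file in data_path."""
    folder = Path(data_path)
    return [
        lang
        for lang in languages
        if not (folder / (lang + ".traineddata")).is_file()
    ]


# Example call (hypothetical path):
# print(missing_languages(r"C:\MyApp\tessdata", ["eng", "deu"]))
```

An empty list means every requested language file is present in the folder that ocr.DataPath should point to.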

How can I increase OCR speed?
Use Tesseract language data from the tessdata_fast repository.

How can I increase OCR accuracy?
Use Tesseract language data from the tessdata_best repository.
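
Switching between the fast and best data only means replacing the .traineddata files in the tessdata folder. The sketch below shows one possible way to fetch a file with a standalone Python script. It assumes both repositories are hosted under github.com/tesseract-ocr and use main as the default branch; the function names are hypothetical.

```python
import urllib.request
from pathlib import Path

# Assumption: repository location and branch name, see the note above.
BASE_URL = "https://github.com/tesseract-ocr/{repo}/raw/main/{lang}.traineddata"


def traineddata_url(lang, variant="fast"):
    """Build the download URL for one language file; variant is 'fast' or 'best'."""
    if variant not in ("fast", "best"):
        raise ValueError("variant must be 'fast' or 'best'")
    return BASE_URL.format(repo="tessdata_" + variant, lang=lang)


def download_traineddata(lang, tessdata_dir, variant="fast"):
    """Download <lang>.traineddata into tessdata_dir and return the local path."""
    target_dir = Path(tessdata_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / (lang + ".traineddata")
    urllib.request.urlretrieve(traineddata_url(lang, variant), str(target))
    return target


# Example call (hypothetical folder):
# download_traineddata("eng", "tessdata", variant="best")
```

Both variants use the same file names, so keep them in separate folders if you want to compare speed and accuracy side by side.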

How can I improve OCR output?
See "Improving the quality of the output" in the Tesseract documentation.

Related links