Optical Character Recognition in C# using Tesseract

In this post, I’ll demonstrate how to use Tesseract to build an Optical Character Recognition (OCR) application in C#.

In my recent post about OCR in C#, I used Puma.NET to create the OCR application.

Optical Character Recognition (OCR) in C# - Mishel
OCR is the process of converting printed or handwritten text to machine-encoded text. This post will help you create an OCR application in C#.

The main drawbacks of using Puma.NET were:

  • It is less accurate.
  • Puma.NET must be installed on the machine.
  • It requires older versions of .NET.

Creating an OCR application in C# using Tesseract

  • Open Visual Studio and create a new C# Console application.
  • Open the Package Manager Console and install the Tesseract NuGet package.

Install-Package Tesseract

If you prefer not to type commands, right-click the project in Solution Explorer and select Manage NuGet Packages…, open the Online tab, search for Tesseract, and click Install.

This will add Tesseract and other binaries to the project.

  • Next, add the language files. You can get the English language files from here. Create a folder named tessdata in the Debug folder of your project (the folder the program is built into) and copy the language files to it.
  • Finally, add the C# code and run the project.

using System;
using Tesseract;

namespace TesserectOCR
{
    class Program
    {
        static void Main(string[] args)
        {
            // Load the English language data from the tessdata folder (path is relative to the working directory).
            using (var ocrengine = new TesseractEngine(@".\tessdata", "eng", EngineMode.Default))
            // Load the image to recognize; change this path to point to your own image.
            using (var img = Pix.LoadFromFile(@"E:\Capture.png"))
            // Run recognition; the returned page holds the result.
            using (var res = ocrengine.Process(img))
            {
                Console.WriteLine(res.GetText());
            }

            // Keep the console window open until a key is pressed.
            Console.ReadKey();
        }
    }
}
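
Since accuracy was one of the reasons for moving away from Puma.NET, it can be useful to see how confident the engine is about a result. The following is a minimal sketch, not a definitive implementation: it assumes the same tessdata folder and image path as the example above, and the class name ConfidenceExample and the command-line argument handling are made up for illustration. It prints the mean confidence reported by the page before printing the text.

```csharp
using System;
using Tesseract;

class ConfidenceExample
{
    static void Main(string[] args)
    {
        // Hypothetical convenience: take the image path from the first argument,
        // falling back to the path used in the example above.
        string imagePath = args.Length > 0 ? args[0] : @"E:\Capture.png";

        // The engine, image, and page are disposed when the using block ends.
        using (var engine = new TesseractEngine(@".\tessdata", "eng", EngineMode.Default))
        using (var img = Pix.LoadFromFile(imagePath))
        using (var page = engine.Process(img))
        {
            // GetMeanConfidence returns a value between 0 and 1.
            Console.WriteLine("Mean confidence: {0:P0}", page.GetMeanConfidence());
            Console.WriteLine(page.GetText());
        }
    }
}
```

A low confidence value is usually a hint that the input image needs attention (for example, higher resolution or better contrast) rather than a problem with the code.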

Possible errors

You may get the following error when building the project.

The type ‘System.Drawing.Bitmap’ is defined in an assembly that is not referenced. You must add a reference to assembly ‘System.Drawing, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a’.

To fix this, go to Solution Explorer -> right-click References -> Add Reference -> search for Drawing -> select System.Drawing in the results (a checkmark appears on the left when it is selected) and click OK.
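
Another thing that can go wrong is the tessdata folder from the setup steps not being where the program looks for it. Here is a small sketch of a check you could run before creating the engine. It relies only on System.IO; the class name TessdataCheck and the method HasLanguageData are hypothetical helpers, and the eng.traineddata file name is an assumption based on Tesseract's usual naming for language data, so adjust it if your language files are named differently.

```csharp
using System;
using System.IO;

static class TessdataCheck
{
    // Returns true when <dataPath>/<language>.traineddata exists.
    // The ".traineddata" suffix is an assumption about how the language files are named.
    public static bool HasLanguageData(string dataPath, string language)
    {
        return File.Exists(Path.Combine(dataPath, language + ".traineddata"));
    }

    static void Main()
    {
        // Resolve the folder relative to the current working directory,
        // which is the Debug folder when the project is run from Visual Studio.
        string dataPath = Path.GetFullPath("tessdata");

        if (HasLanguageData(dataPath, "eng"))
            Console.WriteLine("Language data found in " + dataPath);
        else
            Console.WriteLine("English language data not found in " + dataPath);
    }
}
```

Printing the full path makes it easier to see which folder the program is actually looking in.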
