Tesseract

Installing Tesseract requires several other packages, the main one being Leptonica. The steps below set it up on CentOS 5.x and openSUSE 11.x.

On openSUSE you can use zypper instead of yum; the instructions and package names remain the same.

Leptonica Installation

1. Install the following packages using yum (run as root):

$ yum install libjpeg-devel libpng-devel libtiff-devel zlib-devel gcc gcc-c++ make

2. Download the Leptonica 1.67 source from http://www.leptonica.com/source/leptonlib-1.67.tar.gz, extract it, and compile it from the extracted directory using the following commands:

$ ./configure
$ make
$ make install

*If make fails in the step above with errors about functions such as sqrt, cos, sin, or sincos (typically reported as undefined references), you may have to add the -lm option to the link flags in the Makefile in the src folder of the Leptonica source code and run make again.
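
The Leptonica steps above can be sketched as one shell function. This is a minimal sketch under stated assumptions, not a tested recipe: the name build_leptonica and the scratch directory are hypothetical, the tarball is assumed to be downloaded already, GNU tar is assumed (for --strip-components), and instead of editing the Makefile it passes LIBS=-lm to configure, which autoconf-generated scripts generally accept. Whether the 1.67 configure script honours that variable is an assumption.

```shell
# Hypothetical helper: unpack a Leptonica tarball and build it.
# Usage: build_leptonica /path/to/leptonlib-1.67.tar.gz [work_dir]
build_leptonica() {
    tarball="$1"
    work_dir="${2:-./leptonica-build}"   # assumed scratch directory

    mkdir -p "$work_dir" || return 1
    # Drop the archive's top-level directory so the sources land in work_dir
    tar -xzf "$tarball" -C "$work_dir" --strip-components=1 || return 1

    (
        cd "$work_dir" || exit 1
        # First try the plain build described in the steps above
        if ./configure && make; then
            make install
        else
            # Fallback for errors about sqrt, cos, sin, ...: link libm
            make clean
            ./configure LIBS="-lm" && make && make install
        fi
    )
}
```

Run it as root (or point configure at a writable prefix) so that make install can write to /usr/local.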

Tesseract Installation

1. Download the Tesseract source code from http://tesseract-ocr.googlecode.com/files/tesseract-3.00.tar.gz

2. Extract the source code into a directory and, from that directory, compile it with the standard commands shown below:

$ ./configure
$ make
$ make install
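
Because the Tesseract build depends on Leptonica, it can help to confirm that Leptonica was installed before running configure. The check below is only a sketch: the function name have_leptonica_headers is hypothetical, and the header path include/leptonica/allheaders.h under the install prefix is an assumption about where the Leptonica make install places its headers.

```shell
# Hypothetical pre-flight check: is Leptonica's main header installed?
# Usage: have_leptonica_headers [prefix]
have_leptonica_headers() {
    prefix="${1:-/usr/local}"
    # Assumed install location of the header under the given prefix
    [ -f "$prefix/include/leptonica/allheaders.h" ]
}

if have_leptonica_headers /usr/local; then
    echo "Leptonica headers found under /usr/local"
else
    echo "Leptonica headers not found under /usr/local; build Leptonica first" >&2
fi
```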

Post Installation Steps

Some environments may need the following environment variable to be exported so that the libraries installed in /usr/local/lib are found at run time:

export LD_LIBRARY_PATH=/usr/local/lib
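
The export above replaces any value LD_LIBRARY_PATH already had. If other software on the machine relies on that variable, a variant that keeps the existing entries may be safer. The following is a minimal sketch; the helper name prepend_lib_path is hypothetical.

```shell
# Hypothetical helper: put a directory at the front of LD_LIBRARY_PATH
# unless it is already listed there.
prepend_lib_path() {
    dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;   # already present, leave the variable alone
        *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

prepend_lib_path /usr/local/lib
```

Placing these lines in a shell startup file such as ~/.bashrc would make the setting persist across logins.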

The English language training data can be downloaded from http://tesseract-ocr.googlecode.com/files/eng.traineddata.gz

After extracting the language bundle, copy it into the /usr/local/share/tessdata folder.
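
The extract-and-copy step can be sketched as a small shell function. It assumes the download is a gzip of a single traineddata file and that the archive is already on disk; the function name install_traineddata is hypothetical, and the default target directory is the one named above.

```shell
# Hypothetical helper: unpack a gzipped traineddata file into tessdata.
# Usage: install_traineddata eng.traineddata.gz [tessdata_dir]
install_traineddata() {
    archive="$1"
    tessdata_dir="${2:-/usr/local/share/tessdata}"
    target="$tessdata_dir/$(basename "$archive" .gz)"

    mkdir -p "$tessdata_dir" || return 1
    # Decompress to the target without touching the downloaded archive
    gunzip -c "$archive" > "$target" || return 1
    echo "$target"   # report where the file was written
}
```

For example, running install_traineddata eng.traineddata.gz as root would write /usr/local/share/tessdata/eng.traineddata.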

This completes the Tesseract installation; you should now be able to run tesseract on Linux.
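
A quick way to try the installation is to run tesseract on a sample image. The sketch below wraps the usual invocation (image, output base name, -l language) in a function; the name ocr_to_text, the sample file name, and the default output base are hypothetical, and it assumes the English data has been installed as described above.

```shell
# Hypothetical helper: OCR one image with the English data and print the text.
# Usage: ocr_to_text image.tif [output_base]
ocr_to_text() {
    image="$1"
    out_base="${2:-ocr_output}"   # tesseract appends .txt to this name

    if ! command -v tesseract >/dev/null 2>&1; then
        echo "tesseract not found on PATH" >&2
        return 127
    fi
    # Recognise the image, then show the resulting text file
    tesseract "$image" "$out_base" -l eng && cat "$out_base.txt"
}
```

If the command fails with a shared library error, the LD_LIBRARY_PATH setting from the post installation steps is the first thing to check.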