Notes | Tesseract | P. Envall

I decided to check out the OCR software Tesseract (and its source code).

Building it from source didn’t exactly “just work”. These are some of the things I had to do:

$ export TESSDATA_PREFIX=/usr/local/share

$ ./configure LDFLAGS=-L/opt/local/lib/ CFLAGS=-I/opt/local/include/

Download and extract the English language pack (it was supposed to be included with the standard source, but hey), then copy its data into a tessdata directory under the “prefix” directory:

$ sudo mkdir /usr/local/share/tessdata  
$ sudo cp tesseract-ocr/tessdata/*.* /usr/local/share/tessdata/
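A quick way to confirm the copy landed where Tesseract will look is a small check like the one below. This is only a sketch: `check_tessdata` is a made-up helper name, and it assumes the English data files are named `eng.*`, which may not match every version of the language pack.

```shell
# Hypothetical helper (not part of Tesseract): succeeds if <prefix>/tessdata
# contains at least one file named eng.* (assumed naming of the English data).
check_tessdata() {
    # Use the argument if given, else TESSDATA_PREFIX, else the path used above.
    prefix="${1:-${TESSDATA_PREFIX:-/usr/local/share}}"
    # Drop a trailing slash so the messages read cleanly.
    prefix="${prefix%/}"
    for f in "$prefix"/tessdata/eng.*; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        if [ -e "$f" ]; then
            echo "found English data in $prefix/tessdata"
            return 0
        fi
    done
    echo "no eng.* files in $prefix/tessdata" >&2
    return 1
}
```

Running `check_tessdata` with no argument checks whatever `TESSDATA_PREFIX` points at; `check_tessdata /usr/local/share` checks the directory used here explicitly.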

Then, the following worked:

$ tesseract foo.png result
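The second argument is an output base name, and tesseract appends `.txt` to it, so the recognized text from the command above ends up in `result.txt`. If you would rather see the text directly on the terminal, a small wrapper along these lines might do. It is a rough sketch, and `ocr_print` is a hypothetical name of my own, not something that ships with Tesseract.

```shell
# Hypothetical wrapper: OCR one image and print the recognized text to stdout.
ocr_print() {
    image="$1"
    # Work in a throwaway directory so no result.txt is left behind.
    tmpdir="$(mktemp -d)" || return 1
    status=0
    # tesseract writes its output to "<outputbase>.txt"; its own messages are silenced.
    if tesseract "$image" "$tmpdir/result" >/dev/null 2>&1; then
        cat "$tmpdir/result.txt"
    else
        status=1
    fi
    rm -rf "$tmpdir"
    return "$status"
}
```

With that defined, `ocr_print foo.png` prints the text instead of writing a file, and returns a non-zero status if tesseract itself fails.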