Tesseract
- tesseract
- ocr
- development
2015-03-01
I decided to check out the OCR software Tesseract (and its source code).
Building it from source didn’t exactly “just work”. These are some of the things I had to do:
- Add a “tesseract prefix” path to my .profile:
export TESSDATA_PREFIX=/usr/local/share
- Install the dependency Leptonica
- To fix “missing image support”, configure Leptonica like this when building it:
$ ./configure LDFLAGS=-L/opt/local/lib/ CFLAGS=-I/opt/local/include/
- Download and extract the English language pack (it was supposed to be included with the standard source, but hey), and copy its data to the “prefix” directory:
$ sudo mkdir /usr/local/share/tessdata
$ sudo cp tesseract-ocr/tessdata/*.* /usr/local/share/tessdata/
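The prefix and language data steps above can be sanity-checked with a small shell function. This is only a sketch: `check_tessdata` is a made-up helper name, and it assumes the layout used in this post, where TESSDATA_PREFIX is the parent of the tessdata directory and the language data is a file named like eng.traineddata. Other Tesseract versions may expect a different layout.

```shell
# Hypothetical helper (not part of tesseract): check that the trained data
# for a language sits in $TESSDATA_PREFIX/tessdata, as set up above.
check_tessdata() {
    lang=${1:-eng}                 # language code, defaults to English
    prefix=${TESSDATA_PREFIX:-}
    if [ -z "$prefix" ]; then
        echo "TESSDATA_PREFIX is not set" >&2
        return 1
    fi
    # Assumed layout: <prefix>/tessdata/<lang>.traineddata
    file="${prefix%/}/tessdata/$lang.traineddata"
    if [ -f "$file" ]; then
        echo "found $file"
    else
        echo "missing $file" >&2
        return 1
    fi
}
```

With the .profile line in place, `check_tessdata eng` should report the file under /usr/local/share/tessdata if the copy step succeeded.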
Then, the following worked:
$ tesseract foo.png result
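The second argument is an output base name, so the recognized text ends up in result.txt. A minimal sketch of wrapping that in a function follows; `ocr_image` is a hypothetical name, and it assumes the image file name has an extension and that writing the .txt file next to the image is acceptable.

```shell
# Hypothetical wrapper: OCR an image and print the recognized text.
ocr_image() {
    image=$1
    base=${image%.*}               # foo.png -> foo (assumes an extension)
    if ! command -v tesseract >/dev/null 2>&1; then
        echo "tesseract not found on PATH" >&2
        return 127
    fi
    # tesseract appends .txt to the output base name it is given
    tesseract "$image" "$base" >/dev/null 2>&1 || return 1
    cat "$base.txt"
}

# Usage (not run here): ocr_image foo.png
```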