I decided to try out the OCR software Tesseract (https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-3.02.02.tar.gz).

Building it from source didn’t exactly “just work”. These are some of the things I had to do:

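  • Add a “tesseract prefix” path to my .profile. It has to be the directory that contains tessdata, not the tessdata directory itself:

export TESSDATA_PREFIX=/usr/local/share

  • Point configure at the headers and libraries under /opt/local (the default MacPorts prefix):

./configure LDFLAGS=-L/opt/local/lib/ CFLAGS=-I/opt/local/include/

  • Copy the tessdata files into place:

sudo mkdir /usr/local/share/tessdata
sudo cp tesseract-ocr/tessdata/*.* /usr/local/share/tessdata/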

Then, the following worked, writing the recognized text to result.txt:

tesseract foo.png result
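
If tesseract cannot open its data file, the prefix is the first thing worth checking. Here is a minimal sketch of a sanity check, assuming the layout used above (a tessdata directory directly under the prefix, holding files such as eng.traineddata). The name check_tessdata is a hypothetical helper of my own, not something that ships with tesseract:

```shell
#!/bin/sh
# Hypothetical helper (not part of tesseract): verify that PREFIX/tessdata
# exists and holds at least one .traineddata file.
# Usage: check_tessdata [PREFIX]   (defaults to $TESSDATA_PREFIX)
check_tessdata() {
    ct_prefix="${1:-${TESSDATA_PREFIX:-}}"
    if [ -z "$ct_prefix" ]; then
        echo "TESSDATA_PREFIX is not set"
        return 1
    fi

    ct_dir="$ct_prefix/tessdata"
    if [ ! -d "$ct_dir" ]; then
        echo "missing directory: $ct_dir"
        return 1
    fi

    # An unmatched glob stays literal, so test each candidate with -e.
    for ct_file in "$ct_dir"/*.traineddata; do
        if [ -e "$ct_file" ]; then
            echo "ok: $ct_dir"
            return 0
        fi
    done

    echo "no .traineddata files in $ct_dir"
    return 1
}
```

With the function loaded into the shell, running check_tessdata with no argument checks whatever TESSDATA_PREFIX currently points at, and check_tessdata /usr/local/share checks the location used in this post.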