Using the ocr4gamera script
Last modified: April 29, 2010

The ocr4gamera script takes a picture and already trained data and segments the picture into single glyphs. The training data is then used to classify those glyphs and convert them into strings. The final text is written to standard-out or can optionally be stored in a text file. A word-by-word correction can also be performed on the recognized text.
The option -x, which names the trained xml-file, and the picture are required parameters. The picture always has to be the last argument, so the minimal call looks like this:
ocr4gamera -x classifier_glyphs.xml picture.png
The optional parameters are:
-v --verbose for further information printed to standard-out.
-h --help for short information about usage.
-o --output file.txt for printing the recognized text into the given file.
-a --automatic-group for autogrouping the glyphs with the classifier.
-i --information for dumping information about the segmentation process in png files.
-d --deskew for a skew-correction on the image.
-f --filter for a filter-correction on the image. Eliminates noise.
-D --dictionary-correction for correcting the recognized text word by word. By default the program 'aspell' is used. If it is not installed, 'ispell' is used.
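
The options above can be combined in a single call. The following is a minimal sketch, not part of ocr4gamera itself, of how such a call might be assembled and run from Python. The helper names build_command and recognize are hypothetical, and the sketch assumes Python 3.7 or later and that ocr4gamera is on the PATH. It uses only the short flags listed above and always places the picture last.

```python
import subprocess


def build_command(classifier_xml, picture, output=None, deskew=False,
                  noise_filter=False, dictionary_correction=False,
                  auto_group=False, verbose=False):
    """Assemble an ocr4gamera argument list (hypothetical helper).

    Only flags named in the option list are used; the picture is
    appended last because it has to be the final argument.
    """
    cmd = ["ocr4gamera", "-x", classifier_xml]
    if verbose:
        cmd.append("-v")
    if auto_group:
        cmd.append("-a")
    if deskew:
        cmd.append("-d")
    if noise_filter:
        cmd.append("-f")
    if dictionary_correction:
        cmd.append("-D")
    if output is not None:
        cmd += ["-o", output]
    cmd.append(picture)
    return cmd


def recognize(classifier_xml, picture, **options):
    """Run ocr4gamera and return whatever it wrote to standard-out.

    Assumes the ocr4gamera script is installed and on the PATH.
    """
    result = subprocess.run(
        build_command(classifier_xml, picture, **options),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, build_command("classifier_glyphs.xml", "picture.png", deskew=True, output="file.txt") produces the argument list for a deskewed run that writes its text to file.txt. Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.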