




hocr2djvused - hOCR to djvused script converter


hocr2djvused [option...] [hocr-file...]


hocr2djvused reads one or more hOCR[1] files (as produced by OCRopus[2], Cuneiform[3] or
Tesseract[4]) and converts them to a djvused script.

Unless a filename is explicitly provided on the command line, hOCR is read from the
standard input.
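
The two input modes described above can be sketched from Python as follows. This is a minimal sketch, not part of hocr2djvused itself: the helper names build_argv and convert are hypothetical, and it assumes that the program is on the PATH and writes the resulting djvused script to standard output.

```python
import subprocess


def build_argv(hocr_files=()):
    """Build the command line: the program name followed by any hOCR file names."""
    return ["hocr2djvused", *hocr_files]


def convert(hocr_files=(), hocr_text=None):
    """Run hocr2djvused and return whatever it prints on standard output.

    If no file names are given, hocr_text is passed on standard input,
    matching the behaviour described in the text above.
    Assumption: the djvused script is emitted on standard output.
    """
    result = subprocess.run(
        build_argv(hocr_files),
        input=hocr_text if not hocr_files else None,
        capture_output=True,
        encoding="utf-8",
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Calling convert(["page1.html", "page2.html"]) passes the files on the command line, while convert(hocr_text=markup) feeds the markup through standard input instead.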


Text segmentation options
-t lines, --details=lines
Record location of every line. Don't record locations of particular words or characters.

-t words, --details=words
Record location of every line and every word. Don't record locations of particular characters.

This is the default.

-t chars, --details=chars
Record location of every line, every word and every character.

Consider each non-empty sequence of non-whitespace characters a single word.

This is the default, despite being linguistically incorrect.

Use the Unicode Text Segmentation[5] algorithm to break lines into words.

This option breaks the assumption of some DjVu tools that words are separated by spaces,
and is therefore not recommended.
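
To illustrate why the simple rule (each non-empty sequence of non-whitespace characters is a single word) is called linguistically incorrect, here is a small Python sketch. It only mimics the rule as stated above and is not taken from the hocr2djvused source; the function name simple_words is hypothetical.

```python
def simple_words(line):
    """Split a line into words: every maximal run of non-whitespace characters."""
    # str.split() with no arguments splits on runs of whitespace
    # and never yields empty strings.
    return line.split()


# Punctuation stays attached to the neighbouring word.
english = simple_words("Hello, world!")      # ['Hello,', 'world!']

# Text written without spaces becomes one single "word".
japanese = simple_words("日本語のテキスト")    # ['日本語のテキスト']
```

A Unicode-aware segmenter would generally split such input differently, which is the trade-off the two segmentation modes above describe.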

Other options
Assume that DjVu pages are rotated by n degrees.

Specifies that page size is width pixels × height pixels.

This option is required for hOCR generated by Cuneiform (< 0.8) and superfluous otherwise.

Use an HTML5 parser[6], which is more robust but slower than the default parser.

Attempt to fix UTF-8 encoding issues and eliminate unwanted control characters.

This option might be needed for hOCR generated by Cuneiform[7] or Tesseract[8].

Output version information and exit.

-h, --help
Display help and exit.

