
NAME


djvu2hocr - DjVu to hOCR converter

SYNOPSIS


djvu2hocr [option...] djvu-file

djvu2hocr {--version | --help | -h}

DESCRIPTION


djvu2hocr converts hidden text from a DjVu file to the hOCR[1] format.

OPTIONS


Input selection options
-p, --pages=page-range
Specifies pages to convert. page-range is a comma-separated list of sub-ranges. Each
sub-range is either a single page (e.g. 17) or a contiguous range of pages
(e.g. 37-42). Pages are numbered from 1.

The default is to convert all pages.
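
To illustrate the page-range syntax described above, here is a minimal Python sketch that expands a page-range string into individual page numbers. It is not djvu2hocr's own parser: the function name parse_page_range is hypothetical, and rejecting reversed ranges (such as 42-37) or pages below 1 is an assumption of this sketch, since the text above does not say how the tool treats them.

```python
def parse_page_range(spec):
    """Expand a page-range string such as "17,37-42" into a list of page numbers."""
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # Contiguous sub-range, e.g. "37-42".
            first, last = (int(x) for x in part.split("-", 1))
        else:
            # Single page, e.g. "17".
            first = last = int(part)
        # Assumption of this sketch: pages start at 1 and ranges must ascend.
        if first < 1 or last < first:
            raise ValueError(f"invalid sub-range: {part!r}")
        pages.extend(range(first, last + 1))
    return pages


print(parse_page_range("17,37-42"))  # [17, 37, 38, 39, 40, 41, 42]
```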

Text segmentation options
--word-segmentation=simple
Use the same word segmentation as found in the DjVu file.

This is the default.

--word-segmentation=uax29
Use the Unicode Text Segmentation[2] algorithm to break lines into words, possibly
fixing word segmentation found in the DjVu file.

HTML output options
--title=title
Specifies the document title.

The default is “DjVu hidden text layer”.

--css=style
Add the specified CSS style to the document.

For example, --css='.ocrx_line { display: block; }' can be used to visually preserve
line breaks.

Other options
--version
Output version information and exit.

-h, --help
Display help and exit.
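
As a rough sketch of how the options above combine, the following Python helper assembles a djvu2hocr command line from the documented options only. The names build_command and convert are hypothetical. The convert function also assumes that djvu2hocr writes the hOCR document to standard output, which this page does not state explicitly.

```python
import subprocess


def build_command(djvu_file, pages=None, word_segmentation=None, title=None, css=None):
    """Build an argument list for djvu2hocr using only the options documented above."""
    cmd = ["djvu2hocr"]
    if pages is not None:
        cmd.append(f"--pages={pages}")
    if word_segmentation is not None:
        cmd.append(f"--word-segmentation={word_segmentation}")
    if title is not None:
        cmd.append(f"--title={title}")
    if css is not None:
        cmd.append(f"--css={css}")
    cmd.append(djvu_file)
    return cmd


def convert(djvu_file, **options):
    """Run djvu2hocr and return its output as bytes.

    Assumption: the hOCR document is written to standard output.
    """
    result = subprocess.run(
        build_command(djvu_file, **options), capture_output=True, check=True
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting, so a CSS value containing spaces and braces, such as the --css example above, needs no extra escaping.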

PORTABILITY


djvu2hocr uses a custom extension to hOCR to retain characters which cannot be directly
represented in an HTML/XML document. For example, the control character BEL (^G, U+0007) is
converted into the following HTML chunk:

<span class="djvu_char" title="#x07"> </span>
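
A consumer of the output may want to map these spans back to the original characters. The sketch below does so with a regular expression in Python. The function name restore_djvu_chars is hypothetical, and the pattern assumes that every such span is serialized exactly like the BEL example: class attribute first, then a title of the form #x followed by hexadecimal digits. An HTML parser would be more robust against variations in the markup.

```python
import re

# Assumption: spans look exactly like the documented BEL example, with the
# code point given in hexadecimal after "#x" in the title attribute.
DJVU_CHAR = re.compile(
    r'<span class="djvu_char" title="#x([0-9A-Fa-f]+)">[^<]*</span>'
)


def restore_djvu_chars(text):
    """Replace each djvu_char span with the character its title encodes."""
    return DJVU_CHAR.sub(lambda m: chr(int(m.group(1), 16)), text)


sample = 'a<span class="djvu_char" title="#x07"> </span>b'
print(repr(restore_djvu_chars(sample)))  # 'a\x07b'
```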
