pocketsphinx_batch - Online in the Cloud

NAME

pocketsphinx_batch - Run speech recognition in batch mode

SYNOPSIS

pocketsphinx_batch -hmm hmmdir -dict dictfile [ options ]...

DESCRIPTION

Run speech recognition over a list of utterances in batch mode. A list of arguments
follows:

-adchdr
Size of audio file header in bytes (headers are ignored)

-adcin Input is raw audio data

-agc Automatic gain control for c0 ('max', 'emax', 'noise', or 'none')

-agcthresh
Initial threshold for automatic gain control

-allphone
Perform phoneme decoding with a phonetic language model

-allphone_ci
Perform phoneme decoding with a phonetic language model and context-independent units only

-alpha Preemphasis parameter

-argfile
Argument file giving extra arguments.

-ascale
Inverse of acoustic model scale for confidence score calculation

-aw Inverse weight applied to acoustic scores.

-backtrace
Print results and backtraces to log file.

-beam Beam width applied to every frame in Viterbi search (smaller values mean wider
beam)

-bestpath
Run bestpath (Dijkstra) search over word lattice (3rd pass)

-bestpathlw
Language model probability weight for bestpath search

-build_outdirs
Create missing subdirectories in output directory

-cepdir
Input files directory (prefixed to filespecs in control file)

-cepext
Input files extension (suffixed to filespecs in control file)

-ceplen
Number of components in the input feature vector

-cmn Cepstral mean normalization scheme ('current', 'prior', or 'none')

-cmninit
Initial values (comma-separated) for cepstral mean when 'prior' is used

-compallsen
Compute all senone scores in every frame (can be faster when there are many
senones)

-ctl Control file listing utterances to be processed

-ctlcount
No. of utterances to be processed (after skipping -ctloffset entries)

-ctlincr
Do every Nth line in the control file

-ctloffset
No. of utterances at the beginning of -ctl file to be skipped

-ctm Recognition output in CTM file format (may require post-sorting)

-debug Verbosity level for debugging messages

-dict Main pronunciation dictionary (lexicon) input file

-dictcase
Dictionary is case sensitive (NOTE: case insensitivity applies to ASCII characters
only)

-dither
Add 1/2-bit noise

-doublebw
Use double bandwidth filters (same center freq)

-ds Frame GMM computation downsampling ratio

-fdict Noise word pronunciation dictionary input file

-feat Feature stream type, depends on the acoustic model

-featparams
File containing feature extraction parameters.

-fillprob
Filler word transition probability

-frate Frame rate

-fsg Sphinx format finite state grammar file

-fsgctl
Control file listing FSG file to use for each utterance

-fsgdir
Base directory for FSG files

-fsgext
File extension for FSG files (including leading dot)

-fsgusealtpron
Add alternate pronunciations to FSG

-fsgusefiller
Insert filler words at each state.

-fwdflat
Run forward flat-lexicon search over word lattice (2nd pass)

-fwdflatbeam
Beam width applied to every frame in second-pass flat search

-fwdflatefwid
Minimum number of end frames for a word to be searched in fwdflat search

-fwdflatlw
Language model probability weight for flat lexicon (2nd pass) decoding

-fwdflatsfwin
Window of frames in lattice to search for successor words in fwdflat search

-fwdflatwbeam
Beam width applied to word exits in second-pass flat search

-fwdtree
Run forward lexicon-tree search (1st pass)

-hmm Directory containing acoustic model files.

-hyp Recognition output file name

-hypseg
Recognition output with segmentation file name

-input_endian
Endianness of input data, big or little, ignored if NIST or MS Wav

-jsgf JSGF grammar file

-keyphrase
Keyphrase to spot

-kws File with keyphrases to spot, one per line

-kws_delay
Delay to wait for best detection score

-kws_plp
Phone loop probability for keyword spotting

-kws_threshold
Threshold for p(hyp)/p(alternatives) ratio

-latsize
Initial backpointer table size

-lda File containing transformation matrix to be applied to features (single-stream features
only)

-ldadim
Dimensionality of output of feature transformation (0 to use entire matrix)

-lifter
Length of sin-curve for liftering, or 0 for no liftering.

-lm Word trigram language model input file

-lmctl Specify a set of language models

The -hmm and -dict arguments are always required. Either -lm or -fsg is required,
depending on whether you are using a statistical language model or a finite-state grammar.
To do batch mode recognition, you will need to specify a control file using -ctl. This is a
simple text file containing one entry per line. Each entry is the name of an input file
relative to the -cepdir directory, and without the filename extension (which is given in
the -cepext argument).
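
As a rough illustration of the rules above, the following Python sketch assembles an
argument list and maps control-file entries to input paths. The helper names
(build_argv, resolve_inputs) and all example paths are hypothetical; only the option
names come from this page. Requiring exactly one of -lm or -fsg, skipping blank lines,
and treating the -cepext value as a suffix that already contains its leading dot are
assumptions of the sketch, not documented behavior.

```python
import os


def build_argv(hmmdir, dictfile, ctlfile, lm=None, fsg=None, extra=()):
    """Assemble an argument list for pocketsphinx_batch (hypothetical helper)."""
    # -hmm and -dict are always required. One of -lm or -fsg selects the
    # language model type; insisting on exactly one is this sketch's choice.
    if (lm is None) == (fsg is None):
        raise ValueError("give exactly one of lm or fsg")
    argv = ["pocketsphinx_batch", "-hmm", hmmdir, "-dict", dictfile]
    argv += ["-lm", lm] if lm is not None else ["-fsg", fsg]
    argv += ["-ctl", ctlfile]
    argv += list(extra)
    return argv


def resolve_inputs(ctl_lines, cepdir, cepext):
    """Map one-name-per-line control entries to input file paths."""
    paths = []
    for line in ctl_lines:
        entry = line.strip()
        if not entry:
            continue  # assumption: blank lines carry no entry
        # -cepdir is prefixed and -cepext is suffixed to each entry.
        paths.append(os.path.join(cepdir, entry + cepext))
    return paths


if __name__ == "__main__":
    # Example paths are placeholders, not files shipped with PocketSphinx.
    argv = build_argv(
        "model/hmm", "model/words.dict", "test.ctl", lm="model/words.lm",
        extra=["-cepdir", "feat", "-cepext", ".mfc", "-hyp", "test.hyp"],
    )
    print(" ".join(argv))
    print(resolve_inputs(["utt01\n", "utt02\n"], "feat", ".mfc"))
```

The list returned by build_argv can be passed to subprocess.run if pocketsphinx_batch
is installed; the sketch itself only builds strings and does not run the decoder.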

If you are using acoustic feature files as input (see sphinx_fe(1) for information on how
to generate these), you can also specify a subpart of a file, using the following format:

FILENAME START-FRAME END-FRAME UTTERANCE-ID
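
To make the entry format concrete, here is a small Python sketch that parses one
control-file line into its fields. The CtlEntry type and parse_ctl_line function are
hypothetical names. The sketch assumes fields are separated by whitespace and accepts
only the two shapes described on this page (a bare file name, or all four fields);
pocketsphinx_batch itself may be more lenient. Whether END-FRAME is inclusive is not
stated here, so the sketch does not interpret it.

```python
from typing import NamedTuple, Optional


class CtlEntry(NamedTuple):
    filename: str
    start_frame: Optional[int]  # None when the whole file is used
    end_frame: Optional[int]
    utt_id: Optional[str]


def parse_ctl_line(line):
    """Parse 'FILENAME' or 'FILENAME START-FRAME END-FRAME UTTERANCE-ID'."""
    fields = line.split()
    if len(fields) == 1:
        return CtlEntry(fields[0], None, None, None)
    if len(fields) == 4:
        start, end = int(fields[1]), int(fields[2])
        # Sanity check chosen for this sketch, not taken from the tool.
        if start < 0 or end < start:
            raise ValueError("bad frame range: %r" % line)
        return CtlEntry(fields[0], start, end, fields[3])
    raise ValueError("expected 1 or 4 fields: %r" % line)


if __name__ == "__main__":
    print(parse_ctl_line("meeting01"))
    print(parse_ctl_line("meeting01 0 250 meeting01_seg1"))
```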
