mmseg - Online in the Cloud

This is the mmseg command, which can be run on the OnWorks free hosting provider using one of its free online workstations, such as Ubuntu Online, Fedora Online, the Windows online emulator, or the Mac OS online emulator.


PROGRAM:

NAME

mmseg - segment Chinese text using maximum matching.

SYNOPSIS

mmseg -d dict_file [option]... [corpus_file]...

DESCRIPTION

mmseg is a tool for segmenting Chinese text into words using the maximum matching
algorithm. mmseg segments corpus_file, or standard input if no filename is specified, and
writes the segmented result to standard output.
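
The following is a minimal sketch of greedy forward maximum matching, the general technique
named above. It is not taken from the mmseg source code; the function name max_match, the
in-memory set used as the lexicon, and the rule of emitting an unmatched character on its
own are assumptions made for illustration.

```python
def max_match(text, lexicon):
    """Greedy forward maximum matching (illustrative sketch, not mmseg's code).

    At each position, take the longest lexicon word that starts there.
    If no lexicon word matches, emit one character as an unknown word.
    """
    max_len = max((len(w) for w in lexicon), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the lexicon matches.
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in lexicon:
                break
        else:
            n = 1  # nothing matched: fall back to a single character
        words.append(text[i:i + n])
        i += n
    return words


# Hypothetical four-word lexicon, for illustration only.
lexicon = {"研究", "研究生", "生命", "起源"}
print(max_match("研究生命起源", lexicon))  # ['研究生', '命', '起源']
```

The example also shows the known weakness of greedy matching: the longest first word
("研究生") wins even though "研究 生命 起源" is the more natural reading.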

OPTIONS

-d dict_file
Use dict_file as the lexicon. A default lexicon can be found at
/usr/share/sunpinyin-slm/dict.utf8.

-f, --format (text|bin)
Output format, either 'text' or 'bin'. Default is 'bin'. In text mode, the text of each
word is output; in binary mode, the word ids are written to stdout as binary short
integers.
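
A minimal sketch of decoding the binary output in Python. The text above only says that the
ids are "binary short integers"; the integer width, signedness, and byte order used here
(unsigned 16-bit, native byte order) are assumptions, so the struct format is left as a
parameter. The function name read_word_ids is hypothetical.

```python
import struct


def read_word_ids(data, fmt="H"):
    """Decode a byte string of fixed-width word ids into a list of ints.

    fmt is a struct format character; "H" (unsigned 16-bit) is an assumption
    based on the phrase "binary short integer", not a confirmed layout.
    """
    code = "=" + fmt  # "=" selects native byte order with standard sizes
    size = struct.calcsize(code)
    usable = len(data) - len(data) % size  # ignore a trailing partial record
    return [value for (value,) in struct.iter_unpack(code, data[:usable])]


# Synthetic bytes standing in for captured mmseg output.
sample = struct.pack("=4H", 100, 0, 57, 10)
print(read_word_ids(sample))  # [100, 0, 57, 10]
```

If the decoded ids look implausible for a given build, trying a different format character
such as "I" (unsigned 32-bit) is a reasonable next step.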

-s, --stok STOK_ID
Sentence token id. Default is 10. In binary mode, it is written to the output after
every sentence.
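
Because the sentence token follows every sentence in binary mode, a decoded id stream can be
split back into sentences. The sketch below assumes the ids are already available as a
Python list; the function name split_sentences is hypothetical, and the default of 10
mirrors the documented default.

```python
def split_sentences(ids, stok=10):
    """Split a flat list of word ids into sentences at each sentence token."""
    sentences = []
    current = []
    for wid in ids:
        if wid == stok:
            sentences.append(current)  # the token closes the current sentence
            current = []
        else:
            current.append(wid)
    if current:
        sentences.append(current)  # ids after the last token, if any
    return sentences


print(split_sentences([100, 57, 10, 23, 0, 10]))  # [[100, 57], [23, 0]]
```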

-i, --show-id
Show id info. In text mode, attach the id after each known word. In binary mode, print
the id(s) as text.

-a, --ambiguious-id AMBI-ID
Ambiguous means that a sequence ABC could be segmented as either A BC or AB C. If
specified (AMBI-ID != 0), the sequence ABC is not segmented: in binary mode, the AMBI-ID
is written out; in text mode, "<ambi>ABC</ambi>" is output. Default is 0.
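
One way to picture the "A BC or AB C" case is as an overlap: a lexicon word that starts
inside the matched word and extends past its end. The sketch below marks such spans in text
form. The detection rule, the function name segment_with_ambiguity, and the sample lexicon
are assumptions for illustration; how mmseg itself detects ambiguity is not described here.

```python
def segment_with_ambiguity(text, lexicon):
    """Forward maximum matching that wraps overlapping spans in <ambi> tags.

    Assumed rule: a span is ambiguous when a lexicon word of two or more
    characters starts inside it and reaches beyond its right edge.
    """
    max_len = max((len(w) for w in lexicon), default=1)

    def longest(i):
        # Length of the longest multi-character lexicon word at i, else 1.
        for n in range(min(max_len, len(text) - i), 1, -1):
            if text[i:i + n] in lexicon:
                return n
        return 1

    out = []
    i = 0
    while i < len(text):
        end = i + longest(i)
        ambiguous = False
        k = i + 1
        while k < end:
            reach = k + longest(k)
            if reach > end:  # a word crosses the current right edge
                end = reach
                ambiguous = True
            k += 1
        chunk = text[i:end]
        out.append("<ambi>" + chunk + "</ambi>" if ambiguous else chunk)
        i = end
    return out


lexicon = {"研究", "研究生", "生命", "起源"}
print(segment_with_ambiguity("研究生命起源", lexicon))
# ['<ambi>研究生命</ambi>', '起源']
```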

NOTES

Under binary mode, consecutive ids of 0 are merged into a single 0. Under text mode, no
spaces are inserted between unknown words.
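
Both rules can be sketched in a few lines. The sketch assumes that id 0 stands for an
unknown word, which the notes imply but do not state, and that a single space separates
words in text mode. The function names merge_unknown_ids and join_text, and the
(word, known) pair representation, are hypothetical.

```python
def merge_unknown_ids(ids):
    """Collapse each run of consecutive 0 ids into a single 0."""
    merged = []
    for wid in ids:
        if wid == 0 and merged and merged[-1] == 0:
            continue  # skip a 0 that directly follows another 0
        merged.append(wid)
    return merged


def join_text(tokens):
    """Join (word, known) pairs, omitting the space between adjacent unknown words."""
    parts = []
    prev_known = True
    for idx, (word, known) in enumerate(tokens):
        if idx > 0 and (known or prev_known):
            parts.append(" ")
        parts.append(word)
        prev_known = known
    return "".join(parts)


print(merge_unknown_ids([5, 0, 0, 0, 7, 0, 10, 0, 0]))  # [5, 0, 7, 0, 10, 0]
print(join_text([("研究生", True), ("命", False), ("运", False), ("起源", True)]))
# 研究生 命运 起源
```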
