
ids2ngram - Online in the Cloud


PROGRAM:

NAME


ids2ngram - generate n-gram data file from ids file

SYNOPSIS


ids2ngram [option]... ids_file...

DESCRIPTION


ids2ngram generates an idngram file, which is a sorted array of [id1,..,idN,freq] records,
from binary id stream files. The id stream files are generated by mmseg or slmseg.
It finds all occurrences of N-word tuples (i.e. tuples of the form (id1,..,idN)), sorts
them in lexicographic order of the ids that make up each tuple, and writes them to the
specified output file.
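
The counting and sorting described above can be sketched in a few lines of Python. This is
only an illustration working on an in-memory list of ids; the function name count_ngrams is
hypothetical, and the sketch ignores any special handling the real tool may apply to
particular ids.

```python
from collections import Counter


def count_ngrams(ids, n):
    """Return a sorted list of (id1, ..., idN, freq) tuples for a sequence of ids."""
    # Slide a window of length n over the ids and count each distinct tuple.
    counts = Counter(tuple(ids[i:i + n]) for i in range(len(ids) - n + 1))
    # Python compares tuples lexicographically, which matches the ordering
    # described in the text.
    return [key + (freq,) for key, freq in sorted(counts.items())]


if __name__ == "__main__":
    for row in count_ngrams([5, 7, 5, 7, 9], 2):
        print(row)
```

For the id sequence 5, 7, 5, 7, 9 and N = 2, the bigram (5, 7) is counted twice and sorts
first.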

INPUT


The input file is presented as a binary id stream, which looks like:
[id0,...,idX]
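
A minimal sketch of reading such a stream follows. The page does not state the width or byte
order of an id, so the 32-bit little-endian unsigned layout below is an assumption, as is
the helper name read_ids; adjust ID_FORMAT if your files differ.

```python
import struct

# ASSUMPTION: each id is a 32-bit little-endian unsigned integer.
# The source text does not specify the on-disk layout.
ID_FORMAT = "<I"
ID_SIZE = struct.calcsize(ID_FORMAT)


def read_ids(path):
    """Yield ids one at a time from a binary id stream file."""
    with open(path, "rb") as stream:
        while True:
            chunk = stream.read(ID_SIZE)
            if len(chunk) < ID_SIZE:
                # End of file (or a truncated trailing record): stop reading.
                break
            yield struct.unpack(ID_FORMAT, chunk)[0]
```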

OPTIONS


All the following options are mandatory.

-n, --NMax N
Generates N-gram results. ids2ngram supports only unigrams, bigrams, and trigrams,
so any value outside the range 1..3 is invalid.

-s, --swap swap-file
Specify the temporary intermediate file.

-o, --out output-file
Specify the resulting idngram file, i.e. the array of [id1, ..., idN, freq]

-p, --para N
Specify the maximum number of n-gram items per paragraph. ids2ngram writes to the
temporary file one paragraph at a time. Each time it writes a paragraph out, it frees
the memory allocated for that paragraph. If your system has enough memory, a higher N
is recommended: fewer paragraphs mean less I/O and therefore faster processing.
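
The paragraph mechanism resembles an external merge: count in memory until the limit is
reached, write the sorted partial result out, free the memory, and merge all partial
results at the end. The sketch below illustrates that idea only. It is not the tool's
actual implementation: it uses one temporary file per paragraph rather than a single swap
file, stores records with pickle, and all function names are hypothetical.

```python
import heapq
import pickle
import tempfile
from collections import Counter
from itertools import groupby


def _write_run(counts):
    """Write one paragraph to a temporary file as sorted (ngram, freq) records."""
    run = tempfile.TemporaryFile()
    for item in sorted(counts.items()):
        pickle.dump(item, run)
    run.seek(0)
    return run


def _read_run(run):
    """Stream the (ngram, freq) records back from a paragraph file."""
    while True:
        try:
            yield pickle.load(run)
        except EOFError:
            return


def count_with_paragraphs(ids, n, para):
    """Count n-grams while holding at most `para` distinct items in memory."""
    runs, counts = [], Counter()
    for i in range(len(ids) - n + 1):
        counts[tuple(ids[i:i + n])] += 1
        if len(counts) >= para:
            runs.append(_write_run(counts))
            counts = Counter()  # drop the paragraph's in-memory table
    if counts:
        runs.append(_write_run(counts))
    # Merge the sorted paragraphs and add up the frequencies of equal n-grams.
    merged = heapq.merge(*(_read_run(run) for run in runs))
    result = [key + (sum(freq for _, freq in group),)
              for key, group in groupby(merged, key=lambda item: item[0])]
    for run in runs:
        run.close()
    return result
```

A small paragraph size forces many spills and a larger merge, while a large one keeps most
of the work in memory, which is the trade-off the -p option controls.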

EXAMPLE


The following example uses three input id stream files, idsfile[1,2,3], to generate the
idngram file all.id3gram. Each paragraph (internal map size or hash size) holds up to
1024000 items, and a swap file stores the temporary results. All temporary paragraph
results are eventually merged to produce the final result.

ids2ngram -n 3 -s /tmp/swap -o all.id3gram -p 1024000 idsfile1 idsfile2 idsfile3
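
If the command is driven from a script, the argument list can be assembled and checked
before the tool is launched. The sketch below uses only the options documented on this
page; the helper name build_command is hypothetical, and the actual call is left commented
out because it assumes ids2ngram is installed and on the PATH.

```python
import subprocess


def build_command(n, swap, out, para, ids_files):
    """Build the ids2ngram argument list from the four mandatory options."""
    if n not in (1, 2, 3):
        # The page states that only values in the range 1..3 are valid.
        raise ValueError("N must be 1, 2, or 3")
    if not ids_files:
        raise ValueError("at least one ids file is required")
    return ["ids2ngram", "-n", str(n), "-s", swap, "-o", out,
            "-p", str(para), *ids_files]


if __name__ == "__main__":
    cmd = build_command(3, "/tmp/swap", "all.id3gram", 1024000,
                        ["idsfile1", "idsfile2", "idsfile3"])
    print(" ".join(cmd))
    # Uncomment on a system where ids2ngram is installed:
    # subprocess.run(cmd, check=True)
```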
