NAME


meryl - in- and out-of-core kmer counting and utilities

SYNOPSIS


Estimating memory requirements
meryl -P -m kmersize [-c #] [-p] -s seq.fasta

meryl -P -m kmersize [-c #] [-p] -n mercount

Building a table
meryl -B -m kmersize [-c #] [-p] [-v] [-f|-r|-C] [-L minoccurrence] [-U maxoccurrence]
[-threads n | {-segments segments | -memory megabytes} [-configbatch [-sge jobname]]]
-s seq.fasta -o tblprefix

meryl -countbatch number [-sgebuild "qsuboptionstring"] -o tblprefix

meryl -mergebatch number [-sgemerge "qsuboptionstring"] -o tblprefix

Performing operations on a table
meryl -M operation [-v] -s tblprefix [-s tblprefix2 ...] -o output

Dumping a table
meryl -Dh -s tblprefix

meryl -Dt -n mincount -s tblprefix

DESCRIPTION


meryl computes the kmer content of genomic sequences. Kmer content is represented as a
list of kmers and the number of times each occurs in the input sequences. The kmer can be
restricted to only the forward kmer, only the reverse kmer, or the canonical kmer
(lexicographically smaller of the forward and reverse kmer at each location). Meryl can
report the histogram of counts, the list of kmers and their counts, or can perform
mathematical and set operations on the processed data files.
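
The three strand modes can be illustrated with a minimal Python sketch. It is not meryl's implementation: it assumes that "reverse kmer" means the reverse complement, that sequences use the ACGT alphabet, and that windows containing any other character are skipped. The names count_kmers and reverse_complement are hypothetical.

```python
from collections import Counter

# Assumption: "reverse kmer" is taken to mean the reverse complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Return the reverse complement of an ACGT string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def count_kmers(seq: str, k: int, mode: str = "canonical") -> Counter:
    """Count k-mers of seq in 'forward', 'reverse' or 'canonical' mode."""
    if mode not in ("forward", "reverse", "canonical"):
        raise ValueError(f"unknown mode: {mode}")
    counts = Counter()
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        fwd = seq[i:i + k]
        if set(fwd) - set("ACGT"):
            continue  # assumption: skip windows with ambiguous bases
        rev = reverse_complement(fwd)
        if mode == "forward":
            counts[fwd] += 1
        elif mode == "reverse":
            counts[rev] += 1
        else:
            # canonical: lexicographically smaller of the two
            counts[min(fwd, rev)] += 1
    return counts
```

In canonical mode a kmer and its reverse complement are counted under one key, so the table is strand-independent.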

The output of meryl is two binary files, called a meryl database, which can be quickly
dumped to provide a histogram of counts, or the actual counts. A C++ library is supplied
for direct access to the files.

OPTIONS


-P Estimate memory requirements. Given a sequence file (-s) or an upper limit on the
number of mers in the file (-n), compute the table size (-t in build) to minimize
the memory usage. This mode recognizes the following options:

-m # size of a mer (required)

-c # homopolymer compression (optional)

-p enable positions

-s seq.fasta
Sequence file to be scanned to determine the number of mers

-n # compute params assuming file with this many mers in it

Only one of -s and -n needs to be specified. If both are given, -s takes priority.

-B Compute the mer-count tables given a sequence file (-s) and lots of parameters. By
default, both strands are processed.

-f only build for the forward strand

-r only build for the reverse strand

-C use canonical mers (assumes both strands)

-L # DON'T save mers that occur fewer than # times

-U # DON'T save mers that occur more than # times

-m # size of a mer (required)

-c # homopolymer compression (optional)

-p enable positions

-s seq.fasta
sequence to build the table for

-o tblprefix
output table prefix

-v entertain the user

The meryl process can run in one large memory batch, in many small memory batches,
or under SGE control, all with or without using multiple CPU cores. By default,
the computation is done as one large sequential process. Multi-threaded operation
is possible, at additional memory expense, as is segmented operation, at additional
I/O expense.

Threaded operation
Split the counting into n almost equally sized pieces. This uses an extra
h MB (from -P) per thread.

-threads n
use n threads to build

Segmented, sequential operation
Split the counting into pieces that will fit into no more than m MB of
memory, or into n equal-sized pieces. Each piece is computed sequentially,
and the results are merged at the end. Only one of -memory and -segments is
needed.

-memory m
use at most m MB of memory per segment

-segments n
use n segments
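
The count-then-merge idea behind segmented operation can be sketched in Python. This page does not say how meryl partitions the work, so the sketch makes an assumption: the kmer start positions are split into contiguous ranges of nearly equal size, each range is counted on its own, and the partial tables are summed. The function names are hypothetical.

```python
from collections import Counter


def count_segment(seq: str, k: int, start: int, stop: int) -> Counter:
    """Count the k-mers whose start position lies in [start, stop)."""
    return Counter(seq[i:i + k] for i in range(start, stop))


def segmented_count(seq: str, k: int, segments: int) -> Counter:
    """Count k-mers one segment at a time, then merge the partial tables."""
    total_mers = max(len(seq) - k + 1, 0)
    merged = Counter()
    for s in range(segments):
        # Assumption: contiguous, nearly equal ranges of start positions.
        start = s * total_mers // segments
        stop = (s + 1) * total_mers // segments
        merged.update(count_segment(seq, k, start, stop))  # sums counts
    return merged
```

Only one partial table needs to be held alongside the merged result at a time, which is the memory-for-I/O trade described above.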

Segmented, batched operation
Same as sequential, except this allows each segment to be manually executed
in parallel. Only one of -memory and -segments is needed. Also see the
EXAMPLE section on this page.

-memory m
use at most m MB of memory per segment

-segments n
use n segments

-configbatch
create the batches

-countbatch n
run batch number n

-mergebatch
merge the batches

Batched mode can run on the grid.

-sge jobname
unique job name for this execution. Meryl will submit jobs named
mpjobname, ncjobname, and nmjobname for the prepare, count, and merge
phases, respectively.

-sgebuild "options"

-sgemerge "options"
any additional options to qsub(1) (e.g., "-p -153 -pe thread 2 -A
merylaccount").
N.B. - -N will be ignored.
N.B. - be sure to quote the options.

-M Given a list of tables, perform a math, logical or threshold operation. Unless
otherwise specified, all operations take any number of databases. Math operations are:

min count is the minimum count for all databases. If the mer does NOT exist in
all databases, the mer has a zero count, and is NOT in the output.

minexist
count is the minimum count for all databases that contain the mer

max count is the maximum count for all databases

add count is the sum of the counts for all databases

sub count is the first minus the second (binary only)

abs count is the absolute value of the first minus the second (binary only)
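
The math operations can be modelled on plain Python dictionaries that map a mer to its count. This is a sketch of the semantics listed above, not meryl's code. Two points are assumptions: a missing mer is treated as having count zero, and any result that is not positive is left out of the output (this page states that only for min). The name merge_counts is hypothetical.

```python
def merge_counts(op: str, tables: list) -> dict:
    """Apply one of min, minexist, max, add, sub, abs to mer->count dicts."""
    if op in ("sub", "abs") and len(tables) != 2:
        raise ValueError(f"{op} is binary only")
    out = {}
    for mer in set().union(*tables):
        present = [t[mer] for t in tables if mer in t]
        every = [t.get(mer, 0) for t in tables]  # missing mer counts as 0
        if op == "min":
            value = min(every)
        elif op == "minexist":
            value = min(present)
        elif op == "max":
            value = max(every)
        elif op == "add":
            value = sum(every)
        elif op == "sub":
            value = every[0] - every[1]
        elif op == "abs":
            value = abs(every[0] - every[1])
        else:
            raise ValueError(f"unknown operation: {op}")
        if value > 0:  # assumption: non-positive results are not stored
            out[mer] = value
    return out
```

The difference between min and minexist shows up for a mer that is absent from one table: min drops it, while minexist keeps the smallest count among the tables that do contain it.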

Logical operations are:

and outputs mer iff it exists in all databases

nand outputs mer iff it exists in at least one, but not all, databases

or outputs mer iff it exists in at least one database

xor outputs mer iff it exists in an odd number of databases
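
The four logical operations depend only on how many databases contain a mer. A minimal Python sketch, with tables modelled as mer-to-count dictionaries: it returns just the set of selected mers, because this page does not say which count accompanies a mer in the output. The name logical_mers is hypothetical.

```python
def logical_mers(op: str, tables: list) -> set:
    """Select mers by the number of tables (mer->count dicts) containing them."""
    n = len(tables)
    rules = {
        "and": lambda hits: hits == n,        # in all databases
        "nand": lambda hits: 1 <= hits < n,   # in at least one, but not all
        "or": lambda hits: hits >= 1,         # in at least one
        "xor": lambda hits: hits % 2 == 1,    # in an odd number
    }
    if op not in rules:
        raise ValueError(f"unknown operation: {op}")
    keep = rules[op]
    return {
        mer
        for mer in set().union(*tables)
        if keep(sum(mer in t for t in tables))
    }
```

With exactly two databases, nand and xor select the same mers; they differ only when three or more databases are given.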

Threshold operations are:

lessthan x
outputs mer iff it has count < x

lessthanorequal x
outputs mer iff it has count <= x

greaterthan x
outputs mer iff it has count > x

greaterthanorequal x
outputs mer iff it has count >= x

equal x
outputs mer iff it has count == x

Threshold operations work on exactly one database.
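
As an illustration, the threshold operations amount to filtering a single table by a comparison on the count. The sketch below models a database as a Python dictionary from mer to count; the function name threshold is hypothetical.

```python
import operator

# Operation names as listed above, mapped to the comparison they apply.
_THRESHOLDS = {
    "lessthan": operator.lt,
    "lessthanorequal": operator.le,
    "greaterthan": operator.gt,
    "greaterthanorequal": operator.ge,
    "equal": operator.eq,
}


def threshold(table: dict, op: str, x: int) -> dict:
    """Keep the mers of one table whose count satisfies 'count <op> x'."""
    if op not in _THRESHOLDS:
        raise ValueError(f"unknown operation: {op}")
    compare = _THRESHOLDS[op]
    return {mer: count for mer, count in table.items() if compare(count, x)}
```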

-s tblprefix
use tblprefix as a database

-o tblprefix
create this output

-v entertain the user

-D Dump table (not all of these work)

-Dd Dump a histogram of the distance between the same mers.

-Dt Dump mers >= a threshold. Use -n to specify the threshold.

-Dc Count the number of mers, distinct mers and unique mers.

-Dh Dump (to stdout) a histogram of mer counts.

-s Read the count table from here (leave off the .mcdat or .mcidx).
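
What the -Dh, -Dc and -Dt dumps report can be sketched on a Python dictionary from mer to count. The sketch does not reproduce meryl's output format. It also assumes particular meanings for the -Dc terms, which this page does not define: "mers" is the total number of occurrences, "distinct" is the number of different mers, and "unique" is the number of mers that occur exactly once. All function names are hypothetical.

```python
from collections import Counter


def count_histogram(table: dict) -> list:
    """Like -Dh: (count, number of mers with that count), sorted by count."""
    return sorted(Counter(table.values()).items())


def summarize(table: dict) -> dict:
    """Like -Dc, under the assumed definitions given in the lead-in."""
    return {
        "mers": sum(table.values()),
        "distinct": len(table),
        "unique": sum(1 for count in table.values() if count == 1),
    }


def dump_at_least(table: dict, mincount: int) -> dict:
    """Like -Dt with -n mincount: mers whose count is >= the threshold."""
    return {mer: count for mer, count in table.items() if count >= mincount}
```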

EXAMPLE


Batch creation of a table
Initialize the computation with -configbatch, which needs all the build options. Execute
all -countbatch jobs, then run -mergebatch to complete.

meryl -configbatch -B [options] -o file
meryl -countbatch 0 -o file
meryl -countbatch 1 -o file
...
meryl -countbatch N -o file
meryl -mergebatch N -o file
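
The sequence above can be generated by a small Python helper. It only builds the command lines and does not run meryl. It mirrors the example literally: batches 0 through N are counted and N is passed to -mergebatch. How N relates to -segments or -memory is not stated on this page, so last_batch must be supplied by the caller. The build options shown (-m 22, -memory 4096) are placeholder values, and batch_commands is a hypothetical name.

```python
import shlex


def batch_commands(build_options: list, prefix: str, last_batch: int) -> list:
    """Return the argv lists for the configbatch/countbatch/mergebatch steps."""
    commands = [["meryl", "-configbatch", "-B", *build_options, "-o", prefix]]
    # One -countbatch job per batch number, 0..last_batch as in the example.
    commands += [
        ["meryl", "-countbatch", str(i), "-o", prefix]
        for i in range(last_batch + 1)
    ]
    commands.append(["meryl", "-mergebatch", str(last_batch), "-o", prefix])
    return commands


# Placeholder build options; print the commands instead of executing them.
for argv in batch_commands(
    ["-m", "22", "-memory", "4096", "-s", "seq.fasta"], "tbl", 2
):
    print(shlex.join(argv))
```

The -countbatch commands are independent of each other, so they are the ones that can be run in parallel; the merge step has to wait for all of them.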
