kmer-mask - Online in the Cloud



NAME


kmer-mask - mask and filter a set of nucleotide sequences by kmer content

SYNOPSIS


kmer-mask {-novel|-confirmed} [-mdb mer-database] [-ms mer-size] [-edb exist-database] [-m
min-size] [-e extend-size] [-lowthreshold l] [-highthreshold h] [-t threads] [-v] [-h
histogram] [-promote|-demote|-discard] -1 in.1.fastq [-2 in.2.fastq] -o output-prefix

DESCRIPTION


Mask and filter a set of sequences (presumed to be reads) by kmer content. Masking can
either retain novel sequence not in the database, or retain confirmed sequence present
in the database. Filtering segregates sequences into fully masked, partially masked and
not masked.

OPTIONS


-mdb mer-database
load masking kmers from meryl(1) mer-database

-ms mer-size

-edb exist-database
save masking kmers to an existDB(1) file exist-database for faster restarts

-1 in.1.fastq

-2 in.2.fastq
input read files in fastq, fastq.gz, fastq.bz2 or fastq.xz format. The second file is
optional, but the output classification is not correct if it is absent.

-o out
prefix for output reads

out.fullymasked.[12].fastq
reads with less than 'lowthreshold' bases retained

out.partiallymasked.[12].fastq
reads in between

out.retained.[12].fastq
reads with more than 'highthreshold' bases retained

out.discarded.[12].fastq
reads with conflicting status

-m min-size
ignore database hits below this many consecutive kmers (0)

-e extend-size
extend database hits across this many missing kmers (0)
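
The effect of -m and -e can be sketched in Python as follows. This is an illustration only, not the program's code: the function name filter_hits, the one-boolean-per-kmer representation, and the order of the two steps (extend first, then drop short runs) are all assumptions made for the example.

```python
def filter_hits(hits, min_size=0, extend_size=0):
    """Hypothetical sketch: hits is a list of booleans, one per kmer
    position along a read, True where the kmer is in the database."""
    hits = list(hits)
    n = len(hits)

    # Step 1 (assumed order): bridge gaps of at most extend_size missing
    # kmers, but only gaps that lie between two database hits.
    i = 0
    while i < n:
        if not hits[i]:
            j = i
            while j < n and not hits[j]:
                j += 1
            interior = i > 0 and j < n
            if interior and (j - i) <= extend_size:
                for k in range(i, j):
                    hits[k] = True
            i = j
        else:
            i += 1

    # Step 2: ignore runs of hits shorter than min_size consecutive kmers.
    i = 0
    while i < n:
        if hits[i]:
            j = i
            while j < n and hits[j]:
                j += 1
            if (j - i) < min_size:
                for k in range(i, j):
                    hits[k] = False
            i = j
        else:
            i += 1
    return hits
```

With both values left at 0, the hits are returned unchanged in this sketch.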

-novel
RETAIN novel sequence not present in the database

-confirmed
RETAIN confirmed sequence present in the database

-promote
promote the lesser RETAINED read to the status of the more RETAINED read
read1=fullymasked and read2=partiallymasked -> both are partiallymasked

-demote
demote the more RETAINED read to the status of the lesser RETAINED read
read1=fullymasked and read2=partiallymasked -> both are fullymasked

-discard
discard pairs with conflicting status (DEFAULT)
read1=fullymasked and read2=partiallymasked -> both are discarded
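
The three pair-handling modes can be summarized with a short Python sketch. The function resolve_pair and the ORDER list are hypothetical names for this example; the ranking of 'retained' above 'partiallymasked' is inferred from the output file descriptions, not stated by the program.

```python
# Statuses ordered from least to most sequence RETAINED (assumed ranking).
ORDER = ["fullymasked", "partiallymasked", "retained"]


def resolve_pair(status1, status2, mode="discard"):
    """Hypothetical sketch: return the final status of each mate of a pair."""
    if status1 == status2:
        # No conflict, nothing to resolve.
        return status1, status2

    rank1, rank2 = ORDER.index(status1), ORDER.index(status2)

    if mode == "promote":
        # Both mates take the status of the more RETAINED read.
        best = ORDER[max(rank1, rank2)]
        return best, best
    if mode == "demote":
        # Both mates take the status of the lesser RETAINED read.
        worst = ORDER[min(rank1, rank2)]
        return worst, worst
    if mode == "discard":
        # Conflicting pairs go to the discarded output.
        return "discarded", "discarded"
    raise ValueError("mode must be 'promote', 'demote' or 'discard'")
```

For example, resolve_pair("fullymasked", "partiallymasked", "promote") gives both mates the status "partiallymasked", matching the -promote example above.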

stats on stderr, number of sequences with amount RETAINED:
-lowthreshold t
(0.3333)

-highthreshold t
(0.6667)
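
As a rough illustration of how the two thresholds map a read to an output file, consider the Python sketch below. It assumes the thresholds are fractions of the read's bases that were RETAINED (the values 0.3333 and 0.6667 suggest this) and that a read exactly on a threshold counts as partially masked; neither detail is confirmed by this page. The names classify and output_name are hypothetical.

```python
def classify(fraction_retained, low=0.3333, high=0.6667):
    """Hypothetical sketch: pick a status from the fraction of bases RETAINED."""
    if fraction_retained < low:
        return "fullymasked"
    if fraction_retained > high:
        return "retained"
    # Boundary values are treated as "in between" (an assumption).
    return "partiallymasked"


def output_name(prefix, status, mate):
    """Build a file name of the form out.<status>.<1|2>.fastq."""
    return f"{prefix}.{status}.{mate}.fastq"
```

For instance, a first-mate read with a quarter of its bases retained would be written to out.fullymasked.1.fastq under these assumptions.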

-h histogram
write a histogram of the amount of sequence RETAINED

-t t
use t compute threads

-v
show progress
