NAME

fastq-mcf - ea-utils: detect levels of adapter presence, compute likelihoods and locations
of the adapters

SYNOPSIS

fastq-mcf [options] <adapters.fa> <reads.fq> [mates1.fq ...]

DESCRIPTION

Version: 1.04.676

Detects levels of adapter presence, computes likelihoods and locations (start, end) of the
adapters. Removes the adapter sequences from the fastq file(s).

Stats go to stderr unless -o is specified, in which case they go to stdout.

Specify -0 to turn off all default settings.

If you specify multiple 'paired-end' inputs, then a -o option is required for each, e.g.:
-o read1.clip.fq -o read2.clip.fq
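
The one-output-per-input rule can be checked before the tool is launched. The following is a minimal sketch, assuming a hypothetical helper named build_mcf_command (not part of ea-utils) and illustrative file names; it only assembles the argument list following the SYNOPSIS and does not run fastq-mcf.

```python
from typing import List, Sequence


def build_mcf_command(adapters: str,
                      reads: Sequence[str],
                      outputs: Sequence[str]) -> List[str]:
    """Assemble an argv list for fastq-mcf (hypothetical helper)."""
    # With several paired-end inputs, every input needs its own -o option.
    if len(reads) > 1 and len(outputs) != len(reads):
        raise ValueError("paired-end input needs one -o per input file")
    cmd = ["fastq-mcf"]
    for out in outputs:
        cmd += ["-o", out]
    # Positional arguments follow the SYNOPSIS: adapters first, then reads.
    cmd.append(adapters)
    cmd.extend(reads)
    return cmd


# File names below are illustrative only.
cmd = build_mcf_command(
    "adapters.fa",
    ["read1.fq", "read2.fq"],
    ["read1.clip.fq", "read2.clip.fq"],
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where fastq-mcf is installed.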

OPTIONS

-h This help

-o FIL Output file (stats to stdout)

-s N.N Log scale for adapter minimum-length-match (2.2)

-t N % occurrence threshold before adapter clipping (0.25)

-m N Minimum clip length, overrides scaled auto (1)

-p N Maximum adapter difference percentage (10)

-l N Minimum remaining sequence length (19)

-L N Maximum remaining sequence length (none)

-D N Remove duplicate reads: Read_1 has an identical N bases (0)

-k N sKew percentage-less-than causing cycle removal (2)

-x N 'N' (Bad read) percentage causing cycle removal (20)

-q N quality threshold causing base removal (10)

-w N window-size for quality trimming (1)

-H remove >95% homopolymer reads (no)

-X remove low complexity reads (no)

-0 Set all default parameters to zero/do nothing

-U|u Force disable/enable Illumina PF filtering (auto)

-P N Phred-scale (auto)

-R Don't remove N's from the fronts/ends of reads

-n Don't clip, just output what would be done

-C N Number of reads to use for subsampling (300k)

-S Save all discarded reads to '.skip' files

-d Output lots of random debugging stuff

Quality adjustment options:

--cycle-adjust CYC,AMT Adjust cycle CYC (negative = offset from end) by amount AMT

--phred-adjust SCORE,AMT Adjust score SCORE by amount AMT

--phred-adjust-max SCORE Adjust scores > SCORE to SCORE

Filtering options*:

--[mate-]qual-mean NUM Minimum mean quality score

--[mate-]qual-gt NUM,THR At least NUM quals > THR

--[mate-]max-ns NUM Maximum N-calls in a read (can be a %)

--[mate-]min-len NUM Minimum remaining length (same as -l)

--homopolymer-pct PCT Homopolymer filter percent (95)

--lowcomplex-pct PCT Complexity filter percent (95)

If the mate- prefix is used, the filter applies to the second non-barcode read only.
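
To make the filter semantics concrete, here is a minimal sketch of how the qual-mean, qual-gt, max-ns and min-len checks could be applied to a single read. It is not fastq-mcf's implementation: the function names are hypothetical, qualities are assumed to be Phred+33 encoded, and the percentage form of max-ns is left out.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string (Phred+33 is an assumption here)."""
    return [ord(ch) - offset for ch in quality]


def passes_filters(seq, quality, qual_mean=None, qual_gt=None,
                   max_ns=None, min_len=None):
    """Return True if the read survives every filter that is set."""
    scores = phred_scores(quality)
    if min_len is not None and len(seq) < min_len:
        return False
    if qual_mean is not None:
        # Minimum mean quality score.
        if not scores or sum(scores) / len(scores) < qual_mean:
            return False
    if qual_gt is not None:
        # At least NUM quality values strictly greater than THR.
        num, thr = qual_gt
        if sum(1 for s in scores if s > thr) < num:
            return False
    if max_ns is not None and seq.upper().count("N") > max_ns:
        return False
    return True


# 'I' decodes to 40 and '#' to 2 under Phred+33.
print(passes_filters("ACGTN", "IIII#", qual_mean=30, max_ns=1))
```

Because quality filters are evaluated after clipping and trimming, a check like this would see the already-shortened read.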

Adapter files are 'fasta' formatted.

Specify n/a to turn off adapter clipping, and just use filters.

Increasing the scale makes recognition-lengths longer, a scale of 100 will force
full-length recognition of adapters.

Adapter sequences with _5p in their label will match 'end's, and sequences with _3p in
their label will match 'start's, otherwise the 'end' is auto-determined.
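
An adapter file with labelled records can be generated as in the sketch below. The labels and sequences are placeholders invented for illustration, not real adapters, and format_adapters is a hypothetical helper; only the _5p/_3p suffix convention comes from the text above.

```python
def format_adapters(adapters):
    """Render a {label: sequence} mapping as FASTA text."""
    lines = []
    for label, seq in adapters.items():
        lines.append(">" + label)
        lines.append(seq)
    return "\n".join(lines) + "\n"


# Placeholder labels and sequences, for illustration only.
adapters = {
    "demo_3p": "ACGTACGTACGT",       # label carries the _3p suffix
    "demo_5p": "TTGGCCAATTGG",       # label carries the _5p suffix
    "demo_auto": "GATTACAGATTACA",   # no suffix: end is auto-determined
}

fasta_text = format_adapters(adapters)
print(fasta_text, end="")
```

Writing fasta_text to a file such as adapters.fa gives the first positional argument shown in the SYNOPSIS.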

Skew is when one cycle is poor, 'skewed' toward a particular base. If any nucleotide
accounts for less than the skew percentage of a cycle, then the whole cycle is removed.
Disable for methyl-seq, etc.

Set the skew (-k) or N-pct (-x) to 0 to turn it off (should be done for miRNA, amplicon
and other low-complexity situations!)
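
The skew test can be pictured as a per-cycle base count. The sketch below is an illustration of the idea only, not fastq-mcf's code: skewed_cycles is a hypothetical name, percentages are computed over A/C/G/T calls only (an assumption), and it does not model which cycles the tool actually inspects or how it subsamples.

```python
from collections import Counter


def skewed_cycles(reads, skew_pct=2.0):
    """Return 0-based cycle indexes where some base falls below skew_pct."""
    if not reads or skew_pct <= 0:
        return []  # a threshold of 0 turns the check off
    flagged = []
    for cycle in range(max(len(r) for r in reads)):
        counts = Counter(r[cycle] for r in reads if len(r) > cycle)
        called = sum(counts[b] for b in "ACGT")
        if called == 0:
            continue
        # Flag the cycle if any nucleotide is rarer than the threshold.
        if any(100.0 * counts[b] / called < skew_pct for b in "ACGT"):
            flagged.append(cycle)
    return flagged


reads = ["ACGA", "CAGC", "GTGG", "TGGT"]
print(skewed_cycles(reads))
```

In this toy input the third cycle is all G, so it is the only one flagged. Amplicon or miRNA data would legitimately look like this in many cycles, which is why the check should be set to 0 for such libraries.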

Duplicate read filtering is appropriate for assembly tasks, and never when read length <
expected coverage. -D 50 will use 4.5GB RAM on 100m DNA reads - be careful. Great for RNA
assembly.
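
The memory cost follows from having to remember a key for every distinct read seen so far. A minimal sketch of that idea, assuming the key is the first N bases of Read_1 (an assumption about how the N bases are chosen) and using a hypothetical function name:

```python
def drop_duplicates(reads, n):
    """Keep the first read for each distinct n-base prefix."""
    if n <= 0:
        return list(reads)  # 0 disables duplicate removal
    seen = set()
    kept = []
    for seq in reads:
        key = seq[:n]
        if key in seen:
            continue
        # One stored key per unique prefix: memory grows with read count.
        seen.add(key)
        kept.append(seq)
    return kept


reads = ["ACGTAAAA", "ACGTCCCC", "TTTTAAAA"]
print(drop_duplicates(reads, 4))
```

A longer N makes the match stricter but also makes each stored key larger.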

*Quality filters are evaluated after clipping/trimming

Homopolymer filtering is a subset of low-complexity, but will not be separately tracked
unless both are turned on.
