NAME


leaff - sequence library utilities and applications

SYNOPSIS


leaff [-f fasta-file] [options]

DESCRIPTION


LEAFF (Let's Extract Anything From Fasta) is a utility program for working with
multi-fasta files. In addition to providing random access to the base level, it includes
several analysis functions.

OPTIONS


SOURCE FILES
-f file: use sequence in 'file' (-F is also allowed for historical reasons)
-A file: read actions from 'file'

SOURCE FILE EXAMINATION
-d: print the number of sequences in the fasta
-i name: print an index, labelling the source 'name'

OUTPUT OPTIONS
-6 <#>: insert a newline every 60 letters
(if the next arg is a number, newlines are inserted every
n letters, e.g., -6 80. Disable line breaks with -6 0,
or just don't use -6!)
-e beg end: Print only the bases from position 'beg' to position 'end'
(space based, relative to the FORWARD sequence!) If
beg == end, then the entire sequence is printed. It is an
error to specify beg > end, or beg > len, or end > len.
-ends n: Print n bases from each end of the sequence. One input
sequence generates two output sequences, with '_5' or '_3'
appended to the ID. If 2n >= length of the sequence, the
sequence itself is printed and no ends are extracted (they
would overlap).
-C: complement the sequences
-H: DON'T print the defline
-h: Use the next word as the defline ("-H -H" will reset to the
original defline)
-R: reverse the sequences
-u: uppercase all bases
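
The space-based coordinates used by -e, and the -R, -C and -ends transformations, can be illustrated with a minimal Python sketch. This is not leaff's code: the function names are hypothetical, and returning the 3' end on the forward strand is an assumption, since the text above does not say how that end is oriented.

```python
# Hypothetical helpers for illustration; none of these names come from leaff.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract(seq, beg, end):
    """Return the bases between space-based positions beg and end.

    Space-based positions count the gaps between bases, so 0..10 covers
    the first 10 bases. beg == end selects the whole sequence, as the -e
    description states.
    """
    if beg > end or beg > len(seq) or end > len(seq):
        raise ValueError("need beg <= end <= len(seq)")
    if beg == end:
        return seq
    return seq[beg:end]


def reverse_complement(seq):
    """Apply -C (complement) and then -R (reverse)."""
    return seq.translate(_COMPLEMENT)[::-1]


def ends(seq, n):
    """Return the two n-base ends, or the sequence itself if they overlap."""
    if n <= 0:
        raise ValueError("n must be positive")
    if 2 * n >= len(seq):
        return [seq]
    # Assumption: the 3' end is reported as it appears on the forward strand.
    return [seq[:n], seq[-n:]]
```

For example, extract(seq, 0, 10) mirrors the "-e 0 10" usage in the EXAMPLES section.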

SEQUENCE SELECTION
-G n s l: print n randomly generated sequences, 0 < s <= length <= l
-L s l: print all sequences such that s <= length < l
-N l h: print all sequences such that l <= % N composition < h
(NOTE 0.0 <= l < h < 100.0)
(NOTE that you cannot print sequences with 100% N.
This is a useful bug.)
-q file: print sequences from the seqid list in 'file'
-r num: print 'num' randomly picked sequences
-s seqid: print the single sequence 'seqid'
-S f l: print all the sequences from ID 'f' to 'l' (inclusive)
-W: print all sequences (do the whole file)
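
The half-open ranges used by -L and -N can be sketched in Python as follows. The function names are hypothetical and the sketch only mirrors the inequalities stated above; it also shows why a sequence that is 100% N can never be selected.

```python
def n_percent(seq):
    """Percentage of bases in seq that are N (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return 100.0 * sum(base in "Nn" for base in seq) / len(seq)


def select_by_length(seqs, s, l):
    """Keep sequences with s <= length < l, as described for -L."""
    return [seq for seq in seqs if s <= len(seq) < l]


def select_by_n(seqs, lo, hi):
    """Keep sequences with lo <= %N < hi, as described for -N."""
    if not (0.0 <= lo < hi < 100.0):
        raise ValueError("need 0.0 <= lo < hi < 100.0")
    # hi is strictly below 100, so an all-N sequence never passes.
    return [seq for seq in seqs if lo <= n_percent(seq) < hi]
```

Calling select_by_n(["ACGT", "ANNT", "NNNN"], 0.0, 99.9) keeps the first two sequences and drops the all-N one.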

LONGER HELP
-help analysis
-help examples

ANALYSIS FUNCTIONS
--findduplicates a.fasta
Reports sequences that are present more than once. Output
is a list of pairs of deflines, separated by a newline.

--mapduplicates a.fasta b.fasta
Builds a map of IIDs from a.fasta and b.fasta that have
identical sequences. Format is "IIDa <-> IIDb"

--md5 a.fasta
Don't print the sequence, but print the md5 checksum
(of the entire sequence) followed by the entire defline.
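
As a rough illustration of the checksum-then-defline output, here is a minimal Python sketch using the standard hashlib module. The function name and the single-space separator are assumptions, and whether leaff normalizes case before hashing is not stated here.

```python
import hashlib


def md5_line(defline, sequence):
    """Return 'checksum defline' for one sequence (hypothetical format)."""
    # Assumption: the checksum covers the bases only, with no line breaks.
    digest = hashlib.md5(sequence.encode("ascii")).hexdigest()
    return f"{digest} {defline}"
```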

--partition prefix [ n[gmk]bp | n ] a.fasta
--partitionmap [ n[gmk]bp | n ] a.fasta
Partition the sequences into roughly equal size pieces of
size nbp, nkbp, nmbp or ngbp; or into n roughly equal sized
partitions. Sequences larger than the partition size are
in a partition by themselves. --partitionmap writes a
description of the partition to stdout; --partition creates
a fasta file 'prefix-###.fasta' for each partition.
Example: -F some.fasta --partition parts 130mbp
-F some.fasta --partition parts 16
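
One way to picture size-based partitioning is a greedy packing, sketched below in Python. This is only an illustration under stated assumptions: leaff's actual packing strategy is not described here, the function names are hypothetical, and the k/m/g suffixes are assumed to be decimal multipliers.

```python
import re

# Assumption: k, m and g mean 10^3, 10^6 and 10^9 bases.
_UNITS = {"": 1, "k": 10**3, "m": 10**6, "g": 10**9}


def parse_size(text):
    """Parse a size such as '130mbp' into a number of bases."""
    match = re.fullmatch(r"(\d+)([gmk]?)bp", text)
    if match is None:
        raise ValueError(f"not a size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def partition(lengths, max_bp):
    """Greedily pack sequence indices into partitions of at most max_bp.

    Sequences are placed longest first into the first partition with
    room. A sequence longer than max_bp fits nowhere, so it ends up in
    a partition of its own.
    """
    order = sorted(range(len(lengths)), key=lambda i: -lengths[i])
    parts, sizes = [], []
    for i in order:
        for p in range(len(parts)):
            if sizes[p] + lengths[i] <= max_bp:
                parts[p].append(i)
                sizes[p] += lengths[i]
                break
        else:
            parts.append([i])
            sizes.append(lengths[i])
    return parts
```

With lengths [50, 120, 30, 40] and a limit of 100, the 120-base sequence sits alone and the others share two partitions.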

--segment prefix n a.fasta
Splits the sequences into n files, prefix-###.fasta.
Sequences are not reordered; the first n sequences are in
the first file, the next n in the second file, etc.

--gccontent a.fasta
Reports the GC content over a sliding window of
3, 5, 11, 51, 101, 201, 501, 1001, 2001 bp.
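
A sliding-window GC calculation can be sketched in Python as below. The function name is hypothetical, and leaff's output format and window placement (for instance, whether windows are centered on each base) are not specified here, so this sketch simply reports one value per full window.

```python
def gc_windows(seq, window):
    """Return the GC fraction of every full window of `window` bases."""
    if window <= 0:
        raise ValueError("window must be positive")
    if window > len(seq):
        return []
    is_gc = [base in "GCgc" for base in seq]
    count = sum(is_gc[:window])
    fractions = [count / window]
    # Slide by one base: add the entering base, drop the leaving one.
    for i in range(window, len(seq)):
        count += is_gc[i] - is_gc[i - window]
        fractions.append(count / window)
    return fractions
```

Running it once per window size (3, 5, 11, and so on) would give one profile per size.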

--testindex a.fasta
Test the index of 'file'. If the index is up to date, leaff
exits successfully; otherwise, leaff exits with code 1. If an
index file is supplied, that one is tested; otherwise, the
default index file name is used.

--dumpblocks a.fasta
Generates a list of the blocks of N and non-N. Output
format is 'base seq# beg end len'. 'N 84 483 485 2' means
that a block of 2 N's starts at space-based position 483
in sequence ordinal 84. A '.' is the end of sequence
marker.
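
The 'base seq# beg end len' lines can be reproduced with a short Python sketch built on itertools.groupby. The function name is hypothetical, the label printed for non-N blocks is a placeholder assumption, and the '.' end-of-sequence marker is not reproduced because its exact format is not given here.

```python
from itertools import groupby


def dump_blocks(seq, ordinal, non_n_label="A"):
    """List runs of N and non-N as 'base seq# beg end len' strings.

    Positions are space-based, so end - beg equals the block length.
    non_n_label is a placeholder; the label leaff prints for non-N
    blocks is an assumption.
    """
    lines = []
    pos = 0
    for is_n, run in groupby(seq, key=lambda base: base in "Nn"):
        length = sum(1 for _ in run)
        label = "N" if is_n else non_n_label
        lines.append(f"{label} {ordinal} {pos} {pos + length} {length}")
        pos += length
    return lines
```

A sequence with two N's at position 483, passed in as ordinal 84, yields the line 'N 84 483 485 2' from the description above.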

--errors L N C P a.fasta
For every sequence in the input file, generate new
sequences including simulated sequencing errors.
L -- length of the new sequence. If zero, the length
of the original sequence will be used.
N -- number of subsequences to generate. If L=0, all
subsequences will be the same, and you should use
C instead.
C -- number of copies to generate. Each of the N
subsequences will have C copies, each with different
errors.
P -- probability of an error.

HINT: to simulate ESTs from genes, use L=500, N=10, C=10
-- make C=10 sequencer runs of N=10 EST sequences
of length 500bp each.
to simulate mRNA from genes, use L=0, N=10, C=10
to simulate reads from genomes, use L=800, N=10, C=1
-- of course, N= should be increased to give the
appropriate depth of coverage
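
How L, N, C and P interact can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the error model is assumed to be substitutions only (the description does not say which error types leaff simulates), and using the whole sequence when L exceeds its length is also an assumption.

```python
import random


def add_errors(seq, p, rng):
    """Substitute each base with probability p (assumed error model)."""
    out = []
    for base in seq:
        if rng.random() < p:
            # Pick a different base so every error changes the read.
            out.append(rng.choice([b for b in "ACGT" if b != base.upper()]))
        else:
            out.append(base)
    return "".join(out)


def simulate(seq, L, N, C, P, seed=0):
    """Generate N subsequences of length L, each with C noisy copies."""
    rng = random.Random(seed)
    reads = []
    for _ in range(N):
        if L == 0 or L >= len(seq):
            sub = seq  # L=0 means "use the original length"
        else:
            start = rng.randrange(len(seq) - L + 1)
            sub = seq[start:start + L]
        for _ in range(C):
            reads.append(add_errors(sub, P, rng))
    return reads
```

With L=500, N=10, C=10 this returns 100 reads per input sequence, matching the EST hint above.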

--stats a.fasta
Reports size statistics; number, N50, sum, largest.
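
For readers unfamiliar with N50, the following Python sketch computes the four reported values using the common definition: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of all bases. The function name is hypothetical, and leaff's handling of ties and its output layout may differ.

```python
def size_stats(lengths):
    """Return number, N50, sum and largest for a list of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            n50 = length
            break
    return {
        "number": len(ordered),
        "n50": n50,
        "sum": total,
        "largest": ordered[0] if ordered else 0,
    }
```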

--seqstore out.seqStore
Converts the input file (-f) to a seqStore file (for instance,
for use with the Celera assembler or sim4db).

NOTES


Please note that options are ORDER DEPENDENT. Sequences are printed whenever a SEQUENCE
SELECTION option occurs on the command line. OUTPUT OPTIONS are not reset when a sequence
is printed.

SEQUENCES are numbered starting at ZERO, not one!
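
The order dependence can be pictured as a left-to-right pass over the arguments, sketched below in Python. This is a toy model, not leaff's parser: the function name is hypothetical, only -R, -C and -s are handled, and treating -R and -C as toggles follows the wording of example 3 in the EXAMPLES section.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def run(args, sequences):
    """Process a tiny subset of options in command-line order."""
    reverse = False
    complement = False
    printed = []
    it = iter(args)
    for arg in it:
        if arg == "-R":
            reverse = not reverse        # output option: persists until toggled
        elif arg == "-C":
            complement = not complement  # output option: persists until toggled
        elif arg == "-s":
            # Selection option: prints now, using the current output state.
            seq = sequences[int(next(it))]  # sequence IDs start at zero
            if complement:
                seq = seq.translate(_COMPLEMENT)
            if reverse:
                seq = seq[::-1]
            printed.append(seq)
        else:
            raise ValueError(f"unsupported option: {arg}")
    return printed
```

Given the arguments -R -C -s 3 -s 4 -R -C -s 5, the fourth and fifth sequences come out reverse complemented and the sixth comes out forward.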

EXAMPLES


1. Print the first 10 bases of the fourth sequence in file 'genes':
leaff -f genes -e 0 10 -s 3

2. Print the first 10 bases of the fourth and fifth sequences:
leaff -f genes -e 0 10 -s 3 -s 4

3. Print the fourth and fifth sequences reverse complemented, and the sixth
sequence forward. The second set of -R -C toggle off reverse-complement:
leaff -f genes -R -C -s 3 -s 4 -R -C -s 5

4. Convert file 'genes' to a seqStore 'genes.seqStore'.
leaff -f genes --seqstore genes.seqStore
