emmae - Online in the Cloud


PROGRAM:

NAME


emma - Multiple sequence alignment (ClustalW wrapper)

SYNOPSIS


emma -sequence seqall [-onlydend toggle] -dend toggle -dendfile infile [-slow toggle]
-pwmatrix list -pwdnamatrix list -usermatrix variable -pairwisedatafile infile
-matrix list -usermamatrix variable -dnamatrix list -umamatrix variable
-mamatrixfile infile -pwgapopen float -pwgapextend float -ktup integer -gapw integer
-topdiags integer -window integer -nopercent boolean [-gapopen float]
[-gapextend float] [-endgaps boolean] [-gapdist integer] -norgap boolean
-hgapres string -nohgap boolean [-maxdiv integer] -outseq seqoutset
-dendoutfile outfile

emma -help

DESCRIPTION


emma is a command-line program from EMBOSS (the European Molecular Biology Open Software
Suite). It is part of the "Alignment:Multiple" command group.
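
As a rough illustration of how the options in the SYNOPSIS fit together, the following Python sketch assembles an emma command line without running it. Only option names listed in the SYNOPSIS are used; the helper name build_emma_command and the file names are hypothetical.

```python
import shlex


def build_emma_command(sequence, outseq, dendoutfile,
                       gapopen=None, gapextend=None, maxdiv=None):
    """Assemble an emma argument list (hypothetical helper, nothing is executed)."""
    cmd = ["emma",
           "-sequence", sequence,        # input sequences (seqall)
           "-outseq", outseq,            # aligned output (seqoutset)
           "-dendoutfile", dendoutfile]  # dendrogram output file
    # Optional numeric qualifiers are added only when a value is given.
    if gapopen is not None:
        cmd += ["-gapopen", str(gapopen)]
    if gapextend is not None:
        cmd += ["-gapextend", str(gapextend)]
    if maxdiv is not None:
        cmd += ["-maxdiv", str(maxdiv)]
    return cmd


# File names below are placeholders, not files shipped with EMBOSS.
cmd = build_emma_command("globins.fasta", "globins.aln", "globins.dnd",
                         gapopen=12.0, maxdiv=30)
print(shlex.join(cmd))
```

The resulting list could be handed to subprocess.run on a machine where EMBOSS is installed.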

OPTIONS


Input section
-sequence seqall

-onlydend toggle
Default value: N

-dend toggle
Default value: N

-dendfile infile

-slow toggle
A distance is calculated between every pair of sequences, and these distances are used
to construct the dendrogram which guides the final multiple alignment. The scores are
calculated from separate pairwise alignments. These can be calculated using two methods:
dynamic programming (slow but accurate) or the method of Wilbur and Lipman
(extremely fast but approximate). The slow-accurate method is fine for short sequences
but will be VERY SLOW for many (e.g. >100) long (e.g. >1000 residue) sequences.
Default value: Y
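
To see why the slow method scales badly, a back-of-the-envelope count helps. The sketch below assumes every pair of sequences is aligned once and that each dynamic-programming alignment fills a full length x length matrix for equal-length sequences; both are simplifying assumptions, and the function names are hypothetical.

```python
def pairwise_alignment_count(n_sequences):
    """Number of distinct sequence pairs: n * (n - 1) / 2."""
    return n_sequences * (n_sequences - 1) // 2


def dp_cells(n_sequences, length):
    """Rough total of dynamic-programming cells over all pairs (assumed model)."""
    return pairwise_alignment_count(n_sequences) * length * length


print(pairwise_alignment_count(100))  # 4950 pairs
print(dp_cells(100, 1000))            # 4950000000 cells
```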

Pairwise align options
-pwmatrix list
The scoring table which describes the similarity of each amino acid to each other.
There are three 'in-built' series of weight matrices offered. Each consists of several
matrices which work differently at different evolutionary distances. To see the exact
details, read the documentation. Crudely, we store several matrices in memory,
spanning the full range of amino acid distance (from almost identical sequences to
highly divergent ones). For very similar sequences, it is best to use a strict weight
matrix which only gives a high score to identities and the most favoured conservative
substitutions. For more divergent sequences, it is appropriate to use 'softer'
matrices which give a high score to many other frequent substitutions.
1) BLOSUM (Henikoff). These matrices appear to be the best available for carrying out
database similarity (homology) searches. The matrices used are: Blosum 80, 62, 45 and
30.
2) PAM (Dayhoff). These have been extremely widely used since the late '70s. We use the
PAM 120, 160, 250 and 350 matrices.
3) GONNET. These matrices were derived using almost the same procedure as the Dayhoff
one (above) but are much more up to date and are based on a far larger data set. They
appear to be more sensitive than the Dayhoff series. We use the GONNET 40, 80, 120,
160, 250 and 350 matrices.
We also supply an identity matrix which gives a score of 1.0 to two identical amino
acids and a score of zero otherwise. This matrix is not very useful.
Default value: b
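
The identity matrix is the simplest of these scoring tables to write down. The following Python sketch scores two equal-length, ungapped sequences with it; the function names and the example peptides are made up for illustration and this is not how emma itself is implemented.

```python
def identity_score(a, b):
    """Identity matrix: 1.0 for two identical residues, 0.0 otherwise."""
    return 1.0 if a == b else 0.0


def score_ungapped(seq1, seq2):
    """Sum the per-column scores of two equal-length, ungapped sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(identity_score(a, b) for a, b in zip(seq1, seq2))


# A conservative V -> I substitution scores zero, just like any other mismatch.
print(score_ungapped("MKVLA", "MKILA"))  # 4.0
```

Because every mismatch scores zero, conservative and radical substitutions are indistinguishable, which is one way to read the remark that this matrix is not very useful.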

-pwdnamatrix list
The scoring table which describes the scores assigned to matches and mismatches
(including IUB ambiguity codes). Default value: i

-usermatrix variable

-pairwisedatafile infile

Matrix options
-matrix list
This gives a menu where you are offered a choice of weight matrices. The default for
proteins is the PAM series derived by Gonnet and colleagues. Note, a series is used!
The actual matrix that is used depends on how similar the sequences being aligned at
this alignment step are. Different matrices work differently at each evolutionary
distance. There are three 'in-built' series of weight matrices offered. Each consists
of several matrices which work differently at different evolutionary distances. To see
the exact details, read the documentation. Crudely, we store several matrices in
memory, spanning the full range of amino acid distance (from almost identical
sequences to highly divergent ones). For very similar sequences, it is best to use a
strict weight matrix which only gives a high score to identities and the most favoured
conservative substitutions. For more divergent sequences, it is appropriate to use
'softer' matrices which give a high score to many other frequent substitutions.
1) BLOSUM (Henikoff). These matrices appear to be the best available for carrying out
database similarity (homology) searches. The matrices used are: Blosum 80, 62, 45 and
30.
2) PAM (Dayhoff). These have been extremely widely used since the late '70s. We use the
PAM 120, 160, 250 and 350 matrices.
3) GONNET. These matrices were derived using almost the same procedure as the Dayhoff
one (above) but are much more up to date and are based on a far larger data set. They
appear to be more sensitive than the Dayhoff series. We use the GONNET 40, 80, 120,
160, 250 and 350 matrices.
We also supply an identity matrix which gives a score of 1.0 to two identical amino
acids and a score of zero otherwise. This matrix is not very useful. Alternatively, you
can read in your own (just one matrix, not a series).
Default value: b

-usermamatrix variable

-dnamatrix list
This gives a menu where a single matrix (not a series) can be selected.
Default value: i

-umamatrix variable

-mamatrixfile infile

Additional section
Slow align options
-pwgapopen float
The penalty for opening a gap in the pairwise alignments. Default value: 10.0

-pwgapextend float
The penalty for extending a gap by 1 residue in the pairwise alignments. Default
value: 0.1
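
The two pairwise penalties combine into a cost per gap. The sketch below assumes the common affine form, opening penalty plus extension penalty times gap length; the exact formula used internally (for example, whether the first gap residue is charged an extension) is not stated here, so treat this as an illustration only. The function name is hypothetical.

```python
import math


def affine_gap_penalty(length, gap_open=10.0, gap_extend=0.1):
    """Assumed affine cost of one gap: open + extend * length."""
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


one_long = affine_gap_penalty(6)       # a single gap of 6 residues
two_short = 2 * affine_gap_penalty(3)  # two separate gaps of 3 residues
print(one_long, two_short)
print(math.isclose(affine_gap_penalty(5), 10.5))  # True
```

With a large opening penalty and a small extension penalty, one long gap is much cheaper than several short ones of the same total length.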

Fast align options
-ktup integer
This is the size of exactly matching fragment that is used. INCREASE for speed (max = 2
for proteins; 4 for DNA), DECREASE for sensitivity. For longer sequences (e.g. >1000
residues) you may need to increase the default. Default value: 1 for protein, 2 for
DNA (@($(acdprotein)?1:2))
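
The effect of the fragment size can be seen with a toy k-tuple search. The following Python sketch lists every position pair at which two sequences share an exactly matching fragment of length ktup; it is a simplified illustration, not the Wilbur and Lipman implementation, and the function name and sequences are invented.

```python
from collections import defaultdict


def ktuple_matches(seq1, seq2, ktup):
    """Return (i, j) pairs where seq1[i:i+ktup] == seq2[j:j+ktup]."""
    index = defaultdict(list)
    # Index every k-tuple of the first sequence by its start position.
    for i in range(len(seq1) - ktup + 1):
        index[seq1[i:i + ktup]].append(i)
    matches = []
    # Look up every k-tuple of the second sequence in that index.
    for j in range(len(seq2) - ktup + 1):
        for i in index.get(seq2[j:j + ktup], ()):
            matches.append((i, j))
    return matches


print(len(ktuple_matches("ACGTAC", "GTACGT", 2)))  # 6
print(len(ktuple_matches("ACGTAC", "GTACGT", 4)))  # 2
```

A larger ktup yields fewer matches to process (faster) but can miss short regions of similarity (less sensitive).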

-gapw integer
This is a penalty for each gap in the fast alignments. It has little effect on the
speed or sensitivity except for extreme values. Default value: 3 for protein, 5 for
DNA (@($(acdprotein)?3:5))

-topdiags integer
The number of k-tuple matches on each diagonal (in an imaginary dot-matrix plot) is
calculated. Only the best ones (with the most matches) are used in the alignment. This
parameter specifies how many. Decrease for speed; increase for sensitivity. Default
value: 5 for protein, 4 for DNA (@($(acdprotein)?5:4))

-window integer
This is the number of diagonals around each of the 'best' diagonals that will be used.
Decrease for speed; increase for sensitivity. Default value: 5 for protein, 4 for DNA
(@($(acdprotein)?5:4))
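
The roles of -topdiags and -window can be sketched as follows. Each k-tuple match at positions (i, j) lies on diagonal i - j of the dot-matrix plot; the sketch counts matches per diagonal, keeps the best ones, and then widens each into a band. Treating the window as a number of diagonals on each side is an assumption, and the function names and match list are hypothetical.

```python
from collections import Counter


def best_diagonals(matches, topdiags):
    """Diagonals (i - j) with the most k-tuple matches, best first."""
    counts = Counter(i - j for i, j in matches)
    return [diag for diag, _ in counts.most_common(topdiags)]


def diagonal_band(best, window):
    """All diagonals within `window` of any best diagonal (assumed symmetric)."""
    band = set()
    for diag in best:
        band.update(range(diag - window, diag + window + 1))
    return band


# Example (i, j) match positions: three on diagonal 2, two on -2, one on 4.
matches = [(2, 0), (3, 1), (4, 2), (0, 2), (1, 3), (5, 1)]
best = best_diagonals(matches, topdiags=2)
print(best)                            # [2, -2]
print(sorted(diagonal_band(best, 1)))  # [-3, -2, -1, 1, 2, 3]
```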

-nopercent boolean
Default value: N

Gap options
-gapopen float
The penalty for opening a gap in the alignment. Increasing the gap opening penalty
will make gaps less frequent. Default value: 10.0

-gapextend float
The penalty for extending a gap by 1 residue. Increasing the gap extension penalty
will make gaps shorter. Terminal gaps are not penalised. Default value: 5.0

-endgaps boolean
End gap separation: treats end gaps just like internal gaps for the purposes of
avoiding gaps that are too close (set by 'gap separation distance'). If you turn this
off, end gaps will be ignored for this purpose. This is useful when you wish to align
fragments where the end gaps are not biologically meaningful. Default value: Y

-gapdist integer
Gap separation distance: tries to decrease the chances of gaps being too close to each
other. Gaps that are less than this distance apart are penalised more than other gaps.
This does not prevent close gaps; it makes them less frequent, promoting a block-like
appearance of the alignment. Default value: 8
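
The idea of a separation distance can be illustrated by flagging gaps that fall too close together. The sketch below only identifies neighbouring gaps whose start columns are less than gapdist apart; how much extra penalty such gaps receive is not described here and is not modelled. The function name and positions are made up.

```python
def close_gap_pairs(gap_positions, gapdist=8):
    """Return neighbouring gap start columns that are less than gapdist apart."""
    positions = sorted(gap_positions)
    return [(a, b) for a, b in zip(positions, positions[1:]) if b - a < gapdist]


# Hypothetical gap start columns in an alignment.
print(close_gap_pairs([10, 14, 40, 47, 60]))  # [(10, 14), (40, 47)]
```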

-norgap boolean
Residue specific penalties: amino acid specific gap penalties that reduce or increase
the gap opening penalties at each position in the alignment or sequence. As an
example, positions that are rich in glycine are more likely to have an adjacent gap
than positions that are rich in valine. Default value: N

-hgapres string
This is a set of the residues 'considered' to be hydrophilic. It is used when
introducing Hydrophilic gap penalties. Default value: GPSNDQEKR

-nohgap boolean
Hydrophilic gap penalties: used to increase the chances of a gap within a run (5 or
more residues) of hydrophilic amino acids; these are likely to be loop or random coil
regions where gaps are more common. The residues that are 'considered' to be
hydrophilic are set by '-hgapres'. Default value: N
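
Locating the runs that this option refers to is straightforward. The following Python sketch finds stretches of five or more consecutive residues from the default -hgapres set; it only locates the runs and does not model how the gap penalties are changed there. The function name and the example sequence are hypothetical.

```python
import re


def hydrophilic_runs(sequence, hgapres="GPSNDQEKR", min_run=5):
    """Return (start, end) spans of runs of hydrophilic residues (end exclusive)."""
    pattern = re.compile("[" + re.escape(hgapres) + "]{" + str(min_run) + ",}")
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]


# GSPNDKQ (7 residues) qualifies; the later GPS (3 residues) is too short.
print(hydrophilic_runs("MVLAGSPNDKQLLVAIGPSW"))  # [(4, 11)]
```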

-maxdiv integer
This switch delays the alignment of the most distantly related sequences until after
the most closely related sequences have been aligned. The setting shows the percent
identity level required to delay the addition of a sequence; sequences that are less
identical than this level to any other sequences will be aligned later.
Default value: 30
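
One reading of this rule is that a sequence is delayed when its percent identity to every other sequence is below the threshold. The sketch below implements that reading only; it is an assumption about the wording above, not a description of the actual code, and the function name, sequence names and identity values are invented.

```python
def delayed_sequences(names, identity, maxdiv=30):
    """Names whose best percent identity to any other sequence is below maxdiv."""
    delayed = []
    for name in names:
        others = [identity[frozenset((name, other))]
                  for other in names if other != name]
        if others and max(others) < maxdiv:
            delayed.append(name)
    return delayed


names = ["A", "B", "C", "D"]
# Hypothetical pairwise percent identities, keyed by unordered pair.
identity = {
    frozenset(("A", "B")): 85, frozenset(("A", "C")): 60,
    frozenset(("A", "D")): 22, frozenset(("B", "C")): 58,
    frozenset(("B", "D")): 25, frozenset(("C", "D")): 18,
}
print(delayed_sequences(names, identity))  # ['D']
```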

Output section
-outseq seqoutset

-dendoutfile outfile
