hmmalign - align sequences to a profile HMM


hmmalign [options] <hmmfile> <seqfile>


Perform a multiple sequence alignment of all the sequences in <seqfile> by aligning them
individually to the profile HMM in <hmmfile>. The new alignment is output to stdout in
Stockholm format.

The <hmmfile> should contain only a single profile. If it contains more, only the first
profile in the file will be used.
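
For example, a basic invocation might look like the following minimal sketch. The file names (globins4.hmm, globins45.fa, globins45.sto) are hypothetical placeholders, and the guard around the call is only there so the snippet is harmless on a machine without HMMER or the input files.

```shell
# Hypothetical file names; substitute your own profile and sequence files.
hmmfile=globins4.hmm
seqfile=globins45.fa
outfile=globins45.sto

# Run only when hmmalign is installed and both inputs exist.
if command -v hmmalign >/dev/null 2>&1 && [ -f "$hmmfile" ] && [ -f "$seqfile" ]; then
    # The Stockholm alignment is written to stdout; redirect it to a file.
    hmmalign "$hmmfile" "$seqfile" > "$outfile"
else
    echo "skipped: hmmalign or input files not found" >&2
fi
```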

Either <hmmfile> or <seqfile> (but not both) may be '-' (dash), which means reading this
input from stdin rather than a file.
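
The two stdin forms can be sketched as follows, again with hypothetical file names. Passing --informat explicitly when the sequences arrive on stdin is a conservative assumption here, in case format autodetection is not available on a stream in your version.

```shell
# Hypothetical file names; substitute your own.
hmmfile=globins4.hmm
seqfile=globins45.fa

if command -v hmmalign >/dev/null 2>&1 && [ -f "$hmmfile" ] && [ -f "$seqfile" ]; then
    # Sequences from stdin: '-' takes the place of <seqfile>.
    hmmalign --informat FASTA "$hmmfile" - < "$seqfile" > seqs_from_stdin.sto

    # Profile from stdin: '-' takes the place of <hmmfile>.
    hmmalign - "$seqfile" < "$hmmfile" > hmm_from_stdin.sto
else
    echo "skipped: hmmalign or input files not found" >&2
fi
```

Only one of the two inputs can be '-' in a single invocation, which is why the two cases are separate commands.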

The sequences in <seqfile> are aligned in unihit local alignment mode. Therefore they
should already be known to contain only a single domain (or a fragment of one). The
optimal alignment may assign some residues as nonhomologous (N and C states), in which
case these residues are still included in the resulting alignment, but shoved to the outer
edges. To trim these unaligned nonhomologous residues from the result, see the --trim
option.


-h Help; print a brief reminder of command line usage and all available options.

-o <f> Direct the output alignment to file <f>, rather than to stdout.

--mapali <f>
Merge the existing alignment in file <f> into the result, where <f> is exactly the
same alignment that was used to build the model in <hmmfile>. This is done using a
map of alignment columns to consensus profile positions that is stored in the
<hmmfile>. The multiple alignment in <f> will be exactly reproduced in its
consensus columns (as defined by the profile), but the displayed alignment in
insert columns may be altered, because insertions relative to a profile are
considered by convention to be unaligned data.

--trim Trim nonhomologous residues (assigned to N and C states in the optimal alignments)
from the resulting multiple alignment output.

--amino Specify that all sequences in <seqfile> are proteins. By default, the alphabet type
is autodetected from the residue composition.

--dna Specify that all sequences in <seqfile> are DNAs.

--rna Specify that all sequences in <seqfile> are RNAs.

--informat <s>
Declare that the input <seqfile> is in format <s>. Accepted sequence file formats
include FASTA, EMBL, GenBank, DDBJ, UniProt, Stockholm, and SELEX. Default is to
autodetect the format of the file.

--outformat <s>
Specify that the output multiple alignment is in format <s>. Currently, the only
accepted multiple alignment formats are Stockholm and SELEX.
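
The options above can be combined as in the following minimal sketch. All file names are hypothetical, and globins4.sto is assumed to be exactly the alignment from which globins4.hmm was built, as --mapali requires. The format names are written as they appear in the option descriptions.

```shell
# Hypothetical file names; substitute your own.
hmmfile=globins4.hmm
seqfile=globins45.fa
seedali=globins4.sto   # assumed: the alignment the profile was built from

if command -v hmmalign >/dev/null 2>&1 && [ -f "$hmmfile" ] && [ -f "$seqfile" ]; then
    # Drop residues assigned to the N and C states; write to a file with -o.
    hmmalign --trim -o trimmed.sto "$hmmfile" "$seqfile"

    # Merge the original alignment into the result (only if it is present).
    if [ -f "$seedali" ]; then
        hmmalign --mapali "$seedali" -o merged.sto "$hmmfile" "$seqfile"
    fi

    # Declare the input format and request SELEX output instead of Stockholm.
    hmmalign --informat FASTA --outformat SELEX -o aligned.slx "$hmmfile" "$seqfile"
else
    echo "skipped: hmmalign or input files not found" >&2
fi
```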

