                                  ehmmbuild



Wiki

   The master copies of EMBOSS documentation are available at
   http://emboss.open-bio.org/wiki/Appdocs on the EMBOSS Wiki.

   Please help by correcting and extending the Wiki pages.

Function

   Build a profile HMM from an alignment

Description

   EMBASSY HMMER is a suite of application wrappers to the original hmmer
   v2.3.2 applications written by Sean Eddy. hmmer v2.3.2 must be
   installed on the same system as EMBOSS and the location of the hmmer
   executables must be defined in your path for EMBASSY HMMER to work.

   Usage:
   ehmmbuild [options] alignfile hmmfile

   Important note: the alignfile (input) and hmmfile (output) parameters
   are specified in the reverse order in the original HMMER.

   hmmbuild reads a multiple sequence alignment file (alignfile), builds a
   new profile HMM, and saves the HMM to a file (hmmfile). By default, the
   model is configured to find one or more nonoverlapping alignments to the
   complete model: multiple global alignments with respect to the model, and
   local with respect to the sequence. This is analogous to the behavior of
   the hmmls program of HMMER 1. To configure the model for multiple local
   alignments with respect to the model and local with respect to the
   sequence, a la the old program hmmfs, use the -f (fragment) option.
   More rarely, you may want to configure the model for a single global
   alignment (global with respect to both model and sequence), using the
   -g option; or to configure the model for a single local/local alignment
   (a la standard Smith/Waterman, or the old hmmsw program), using the -s
   option. In ehmmbuild these configurations are selected with the
   -strategy qualifier (D, F, G or S).
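
   The mapping between these alignment modes and the original hmmbuild
   options can be sketched in Python. This is a minimal illustration, not
   the wrapper's actual code: the names STRATEGY_FLAGS and
   build_hmmbuild_command are hypothetical, and only options named in this
   document (-n, -f, -g, -s) are used. The hmmfile-before-alignfile order
   follows the note above about the reversed parameter order.

```python
# Hypothetical helper: maps the ehmmbuild -strategy menu value to the
# hmmbuild option described in this document. Not part of EMBOSS or HMMER.
STRATEGY_FLAGS = {
    "D": None,   # default: multiple domains, global w.r.t. the model (hmmls-like)
    "F": "-f",   # multiple domains, local w.r.t. the model (hmmfs-like)
    "G": "-g",   # single domain, global w.r.t. model and sequence
    "S": "-s",   # single domain, local/local (Smith/Waterman-like)
}


def build_hmmbuild_command(alignfile, hmmfile, name=None, strategy="D"):
    """Return an argv list for the original hmmbuild (illustrative only)."""
    strategy = strategy.upper()
    if strategy not in STRATEGY_FLAGS:
        raise ValueError("strategy must be one of D, F, G, S")
    argv = ["hmmbuild"]
    if name:
        argv += ["-n", name]
    flag = STRATEGY_FLAGS[strategy]
    if flag:
        argv.append(flag)
    # hmmbuild takes the output HMM first, then the alignment: the reverse
    # of the ehmmbuild parameter order.
    argv += [hmmfile, alignfile]
    return argv


if __name__ == "__main__":
    print(" ".join(build_hmmbuild_command("globins50.msf", "globin.hmm",
                                          name="globins50", strategy="F")))
```

   The sketch omits the advanced options that the wrapper also passes
   through, such as those visible in the command echoed by the sample
   session below.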

Algorithm

   Please read the Userguide.pdf distributed with the original HMMER and
   included in the EMBASSY HMMER distribution under the DOCS directory.

Usage

   Here is a sample session with ehmmbuild


% ehmmbuild globins50.msf globin.hmm -nhmm globins50 -strategy D
Build a profile HMM from an alignment.

hmmbuild - build a hidden Markov model from an alignment
HMMER 2.3.2 (Oct 2003)
Copyright (C) 1992-2003 HHMI/Washington University School of Medicine
Freely distributed under the GNU General Public License (GPL)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Alignment file:                    ../../data/hmmnew/globins50.msf
File format:                       MSF
Search algorithm configuration:    Multiple domain (hmmls)
Model construction strategy:       MAP (gapmax hint: 0.50)
Null model used:                   (default)
Prior used:                        (default)
Sequence weighting method:         G/S/C tree weights
New HMM file:                      globin.hmm [appending]
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Alignment:           #1
Number of sequences: 50
Number of columns:   308

Determining effective sequence number    ... done. [2]
Weighting sequences heuristically        ... done.
Constructing model architecture          ... done.
Converting counts to probabilities       ... done.
Setting model name, etc.                 ... done. [globins50]

Constructed a profile HMM (length 143)
Average score:      189.04 bits
Minimum score:      -17.62 bits
Maximum score:      234.09 bits
Std. deviation:      53.18 bits

Finalizing model configuration           ... done.
Saving model to file                     ... done.
//


/shared/software/bin/hmmbuild -n globins50  --pbswitch 1000  --archpri 0.850000
 --idlevel 0.620000  --swentry 0.500000  --swexit 0.500000  --wgsc  -A -F  globi
n.hmm ../../data/hmmnew/globins50.msf



Command line arguments

   Where possible, the same command-line qualifier names and parameter
   order are used as in the original hmmer. There are, however, several
   unavoidable differences, and these are documented in the "Notes"
   section below.

   More or less all options documented as "expert" in the original hmmer
   user guide are given in ACD as "advanced" options (-options must be
   specified on the command-line in order to be prompted for a value for
   them).

Build a profile HMM from an alignment.
Version: EMBOSS:6.6.0.0

   Standard (Mandatory) qualifiers:
  [-alignfile]         seqset     (Aligned) protein sequence set filename and
                                  optional format, or reference (input USA)
   -nhmm               string     Name for this HMM. The name can be any
                                  string of non-whitespace characters (e.g.
                                  one 'word'). There is no length limit (at
                                  least not one imposed by HMMER; your shell
                                  will complain about command line lengths
                                  first). (Any word)
   -strategy           menu       [D] All alignments are local with respect to
                                  the sequence and are configured to be local
                                  (fragmentary) or global with respect to the
                                  HMM. The model is also configured to find a
                                  single or multiple domains (matches) to a
                                  sequence. The options for configuring the
                                  model are as follows: (D): The default
                                  setting. Multiple domains per sequence,
                                  global alignments with respect to the HMM.
                                  (F): Multiple domains per sequence, local
                                  alignments with respect to the HMM.
                                  Analogous to the old hmmfs program of HMMER
                                  1. (G) Single domain per sequence, global
                                  alignment with respect to the HMM. Analogous
                                  to the old hmms program of HMMER 1. (S)
                                  Single domain per sequence, local alignments
                                  with respect to the HMM. Analogous to the
                                  old hmmsw program of HMMER 1. (Values: D
                                  (global-multidomain); F (local-multidomain);
                                  G (global-singledomain); S
                                  (local-singledomain))
  [-hmmfile]           outfile    [*.ehmmbuild] HMMER hidden markov model
                                  output file

   Additional (Optional) qualifiers: (none)
   Advanced (Unprompted) qualifiers:
   -prior              infile     Read a Dirichlet prior from file, replacing
                                  the default mixture Dirichlet. The format of
                                  prior files is documented in the User's
                                  Guide, and an example is given in the Demos
                                  directory of the HMMER distribution.
   -null               infile     Read a null model from file. The default for
                                  protein is to use average amino acid
                                  frequencies from Swissprot 34 and p1 =
                                  350/351; for nucleic acid, the default is to
                                  use 0.25 for each base and p1 = 1000/1001.
                                  For documentation of the format of the null
                                  model file and further explanation of how
                                  the null model is used, see the User's
                                  Guide.
   -pam                infile     Apply a heuristic PAM- (substitution
                                  matrix-) based prior on match emission
                                  probabilities instead of the default mixture
                                  Dirichlet. The substitution matrix is read
                                  from file. See -pamwgt. The default
                                  Dirichlet state transition prior and insert
                                  emission prior are unaffected. Therefore in
                                  principle you could combine -prior with -pam
                                  but this isn't recommended, as it hasn't
                                  been tested. ( -pam itself hasn't been
                                  tested much!)
   -pamwgt             float      [20.0] Controls the weight on a PAM-based
                                  prior. Only has effect if the -pam option is
                                  also in use. The weight is a positive real
                                  number, 20.0 by default, and is the number
                                  of 'pseudocounts' contributed by the
                                  heuristic prior. Very high weights can
                                  force a scoring system that is entirely
                                  driven by the substitution matrix, making
                                  HMMER somewhat approximate Gribskov
                                  profiles. (Any numeric value)
   -pbswitch           integer    [1000] For alignments with a very large
                                  number of sequences, the GSC, BLOSUM, and
                                  Voronoi weighting schemes are slow; they're
                                  O(N^2) for N sequences. Henikoff
                                  position-based weights (PB weights) are more
                                  efficient. At or above this threshold
                                  number of sequences, hmmbuild will switch
                                  from GSC, BLOSUM, or Voronoi weights to PB
                                  weights. To disable this switching behavior
                                  (at the cost of compute time), set the
                                  threshold to something larger than the
                                  number of sequences in your alignment. The
                                  threshold is a positive integer; the
                                  default is 1000. (Any integer value)
   -archpri            float      [0.85] The value of the 'architecture prior'
                                  used by MAP architecture construction. This
                                  value is a probability between 0 and 1.
                                  This parameter governs a geometric prior
                                  distribution over model lengths. As
                                  'archpri' increases, longer models are
                                  favored a priori. As 'archpri' decreases, it
                                  takes more residue conservation in a column
                                  to make a column a 'consensus' match column
                                  in the model architecture. The 0.85 default
                                  has been chosen empirically as a reasonable
                                  setting. (Any numeric value)
   -binary             boolean    [N] Write the HMM to file in HMMER binary
                                  format instead of readable ASCII text.
   -fast               boolean    [N] Quickly and heuristically determine the
                                  architecture of the model by assigning all
                                  columns with more than a certain fraction of
                                  gap characters to insert states. By default
                                  this fraction is 0.5, and it can be changed
                                  using the -gapmax option. The default
                                  construction algorithm is a maximum a
                                  posteriori (MAP) algorithm, which is slower.
   -gapmax             float      [0.5] Controls the -fast model construction
                                  algorithm, but if -fast is not being used,
                                  has no effect. If a column has more than
                                  this fraction of gap symbols in it, it gets
                                  assigned to an insert column. The value is
                                  a frequency from 0 to 1, and by default is
                                  set to 0.5. Higher values mean more columns
                                  get assigned to consensus, and models get
                                  longer; smaller values mean fewer columns
                                  get assigned to consensus, and models get
                                  shorter. (Any numeric value)
   -hand               boolean    [N] Specify the architecture of the model by
                                  hand: the alignment file must be in SELEX
                                  or Stockholm format, and the reference
                                  annotation line (RF in SELEX, GC RF in
                                  Stockholm) is used to specify the
                                  architecture. Any column marked with a
                                  non-gap symbol (such as an 'x', for
                                  instance) is assigned as a consensus (match)
                                  column in the model.
   -sidlevel           float      [0.62] Controls both the determination of
                                  effective sequence number and the behavior
                                  of the -wblosum weighting option. The
                                  sequence alignment is clustered by percent
                                  identity, and the number of clusters at
                                  this cutoff threshold is used to determine
                                  the effective sequence number. Higher
                                  values give more clusters and higher
                                  effective sequence numbers; lower values
                                  give fewer clusters and lower effective
                                  sequence numbers. The value is a fraction
                                  from 0 to 1, and by default is set to 0.62
                                  (corresponding to the clustering level used
                                  in constructing the BLOSUM62 substitution
                                  matrix). (Any numeric value)
   -noeff              boolean    [N] Turn off the effective sequence number
                                  calculation, and use the true number of
                                  sequences instead. This will usually reduce
                                  the sensitivity of the final model (so don't
                                  do it without good reason!)
   -swentry            float      [0.5] Controls the total probability that is
                                  distributed to local entries into the
                                  model, versus starting at the beginning of
                                  the model as in a global alignment. The
                                  value is a probability from 0 to 1, and by
                                  default is set to 0.5. Higher values mean
                                  that hits that are fragments on their left
                                  (N or 5'-terminal) side will be penalized
                                  less, but complete global alignments will
                                  be penalized more. Lower values mean that
                                  fragments on the left will be penalized
                                  more, and global alignments on this side
                                  will be favored. This option only affects
                                  the configurations that allow local
                                  alignments, e.g. -s and -f in HMMER
                                  (-strategy S and F here); unless one of
                                  these is also activated, this option has no
                                  effect. You have independent control over
                                  local/global alignment behavior for the N/C
                                  (5'/3') termini of your target sequences
                                  using -swentry and -swexit. (Any numeric
                                  value)
   -swexit             float      [0.5] Controls the total probability that is
                                  distributed to local exits from the model,
                                  versus ending an alignment at the end of the
                                  model as in a global alignment. The value
                                  is a probability from 0 to 1, and by default
                                  is set to 0.5. Higher values mean that hits
                                  that are fragments on their right (C or
                                  3'-terminal) side will be penalized less,
                                  but complete global alignments will be
                                  penalized more. Lower values mean that
                                  fragments on the right will be penalized
                                  more, and global alignments on this side
                                  will be favored. This option only affects
                                  the configurations that allow local
                                  alignments, e.g. -s and -f in HMMER
                                  (-strategy S and F here); unless one of
                                  these is also activated, this option has no
                                  effect. You have independent control over
                                  local/global alignment behavior for the N/C
                                  (5'/3') termini of your target sequences
                                  using -swentry and -swexit. (Any numeric
                                  value)
   -verbosity          boolean    [N] Print more possibly useful stuff, such
                                  as the individual scores for each sequence
                                  in the alignment.
   -weighting          menu       [G] Values (B)(-wblosum in HMMER) Use the
                                  BLOSUM filtering algorithm to weight the
                                  sequences. Cluster the sequences at a given
                                  percentage identity (see -sidlevel); assign
                                  each cluster a total weight of 1.0,
                                  distributed equally amongst the members of
                                  that cluster. (G)(-wgsc in HMMER) Use the
                                  Gerstein/Sonnhammer/Chothia ad hoc sequence
                                  weighting algorithm. This is the default.
                                  (K)(-wme in HMMER) Use the Krogh/Mitchison
                                  maximum entropy algorithm to 'weight' the
                                  sequences. This supersedes the
                                  Eddy/Mitchison/Durbin maximum discrimination
                                  algorithm, which gives almost identical
                                  weights but is less robust. ME weighting
                                  seems to give a marginal increase in
                                  sensitivity over the default GSC weights,
                                  but takes a fair amount of time. (W) (-wpb
                                  in HMMER) Use the Henikoff position-based
                                  weighting scheme. (V) (-wvoronoi in HMMER)
                                  Use the Sibbald/Argos Voronoi sequence
                                  weighting algorithm in place of the default
                                  GSC weighting. (N) (-wnone in HMMER) Turn
                                  off all sequence weighting. (Values: B
                                  (Blosum); G (Gerstein/Sonnhammer/Chothia); K
                                  (Krogh/Mitchison); W (Henikoff); V
                                  (Sibbald/Argos Voronoi); N (None))
   -o                  outfile    [*.ehmmbuild] Re-save the starting alignment
                                  to file, in Stockholm format. The columns
                                  which were assigned to match states will be
                                  marked with x's in an RF annotation line. If
                                  either the -hand or -fast construction
                                  options were chosen, the alignment may have
                                  been slightly altered to be compatible with
                                  Plan 7 transitions, so saving the final
                                  alignment and comparing to the starting
                                  alignment can let you view these
                                  alterations. See the User's Guide for more
                                  information on this arcane side effect.
   -cfile              outfile    [*.ehmmbuild] Save the observed emission and
                                  transition counts to file after the
                                  architecture has been determined (e.g. after
                                  residues/gaps have been assigned to match,
                                  delete, and insert states). This option is
                                  used in HMMER development for generating
                                  data files useful for training new Dirichlet
                                  priors. The format of count files is
                                  documented in the User's Guide.

   Associated qualifiers:

   "-alignfile" associated qualifiers
   -sbegin1            integer    Start of each sequence to be used
   -send1              integer    End of each sequence to be used
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -scircular1         boolean    Sequence is circular
   -squick1            boolean    Read id and sequence only
   -sformat1           string     Input sequence format
   -iquery1            string     Input query fields or ID list
   -ioffset1           integer    Input start position offset
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-hmmfile" associated qualifiers
   -odirectory2        string     Output directory

   "-o" associated qualifiers
   -odirectory         string     Output directory

   "-cfile" associated qualifiers
   -odirectory         string     Output directory

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write first file to standard output
   -filter             boolean    Read first file from standard input, write
                                  first file to standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options and exit. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages
   -version            boolean    Report version number and exit
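
   The column assignment rule described for -fast and -gapmax can be
   sketched in Python. This is only an illustration of the rule as worded
   above, not HMMER's implementation: the function name assign_fast_columns
   and the gap character set are assumptions, and every sequence is counted
   equally when computing the gap fraction.

```python
def assign_fast_columns(alignment, gapmax=0.5, gap_chars="-.~"):
    """Mark each alignment column as 'M' (consensus/match) or 'I' (insert).

    A column becomes an insert column when its fraction of gap symbols is
    strictly greater than gapmax. Hypothetical helper for illustration.
    """
    if not alignment:
        return ""
    ncols = len(alignment[0])
    if any(len(row) != ncols for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    marks = []
    for col in range(ncols):
        gaps = sum(1 for row in alignment if row[col] in gap_chars)
        marks.append("I" if gaps / len(alignment) > gapmax else "M")
    return "".join(marks)


if __name__ == "__main__":
    toy = ["AC-G", "A--G", "AT-G", "A-TG"]  # made-up toy alignment
    print(assign_fast_columns(toy, gapmax=0.5))
    print(assign_fast_columns(toy, gapmax=0.4))
```

   In the toy alignment, lowering gapmax turns the half-gapped second
   column into an insert column, so the model gets shorter, as described
   for -gapmax.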


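   The clustering described for -sidlevel and for the BLOSUM weighting
   scheme (-weighting B) can be sketched in Python as well. This is a rough
   sketch under stated assumptions, not HMMER's code: the function names
   are hypothetical, clusters are formed by single linkage, and percent
   identity is computed over columns where both sequences have a residue.
   HMMER's own definitions may differ in detail.

```python
def percent_identity(a, b, gap_chars="-.~"):
    """Fraction of identical residues over columns where both have a residue."""
    pairs = [(x, y) for x, y in zip(a, b)
             if x not in gap_chars and y not in gap_chars]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def cluster_by_identity(alignment, idlevel=0.62):
    """Single-linkage clusters: join sequences with identity >= idlevel."""
    n = len(alignment)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if percent_identity(alignment[i], alignment[j]) >= idlevel:
                parent[find(i)] = find(j)
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def blosum_weights(alignment, idlevel=0.62):
    """Give each cluster a total weight of 1.0, shared equally by members.

    Returns (weights, number_of_clusters); the cluster count plays the role
    of the effective sequence number in this sketch.
    """
    clusters = cluster_by_identity(alignment, idlevel)
    weights = [0.0] * len(alignment)
    for members in clusters:
        for i in members:
            weights[i] = 1.0 / len(members)
    return weights, len(clusters)


if __name__ == "__main__":
    toy = ["ACDEFGHIKL", "ACDEFGHIKV", "MNPQRSTVWY"]  # made-up sequences
    print(blosum_weights(toy, idlevel=0.62))
    print(blosum_weights(toy, idlevel=0.95))
```

   With the toy data, the first two sequences are 90% identical, so they
   share one cluster at the default level and split into separate clusters
   at 0.95, matching the statement that higher values give more clusters.
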
Input file format

  Alignment and sequence formats

   Input and output of alignments and sequences is limited to the formats
   that the original hmmer supports. These include Stockholm, SELEX, MSF,
   Clustal, Phylip and A2M/aligned FASTA (alignments) and FASTA, GENBANK,
   EMBL, GCG, PIR (sequences). It would be fairly straightforward to adapt
   the code to support all EMBOSS-supported formats.

  Compressed input files

   Automatic processing of gzipped files is not supported.

   ehmmbuild reads any normal sequence USAs.

  Input files for usage example

  File: globins50.msf

!!AA_MULTIPLE_ALIGNMENT 1.0
PileUp of: *.pep

 Symbol comparison table: GenRunData:blosum62.cmp  CompCheck: 6430

                   GapWeight: 12
             GapLengthWeight: 4

 pileup.msf  MSF: 308  Type: P  August 16, 1999 09:09  Check: 9858 ..

 Name: lgb1_pea         Len:   308  Check: 2200  Weight:  1.00
 Name: lgb1_vicfa       Len:   308  Check:  214  Weight:  1.00
 Name: myg_escgi        Len:   308  Check: 3961  Weight:  1.00
 Name: myg_horse        Len:   308  Check: 5619  Weight:  1.00
 Name: myg_progu        Len:   308  Check: 6401  Weight:  1.00
 Name: myg_saisc        Len:   308  Check: 6606  Weight:  1.00
 Name: myg_lycpi        Len:   308  Check: 6090  Weight:  1.00
 Name: myg_mouse        Len:   308  Check: 6613  Weight:  1.00
 Name: myg_musan        Len:   308  Check: 3942  Weight:  1.00
 Name: hba_ailme        Len:   308  Check: 4558  Weight:  1.00
 Name: hba_prolo        Len:   308  Check: 5054  Weight:  1.00
 Name: hba_pagla        Len:   308  Check: 5383  Weight:  1.00
 Name: hba_macfa        Len:   308  Check: 5135  Weight:  1.00
 Name: hba_macsi        Len:   308  Check: 5198  Weight:  1.00
 Name: hba_ponpy        Len:   308  Check: 5050  Weight:  1.00
 Name: hba2_galcr       Len:   308  Check: 5609  Weight:  1.00
 Name: hba_mesau        Len:   308  Check: 4702  Weight:  1.00
 Name: hba2_bosmu       Len:   308  Check: 4241  Weight:  1.00
 Name: hba_erieu        Len:   308  Check: 4680  Weight:  1.00
 Name: hba_frapo        Len:   308  Check: 3549  Weight:  1.00
 Name: hba_phaco        Len:   308  Check: 4440  Weight:  1.00
 Name: hba_trioc        Len:   308  Check: 5465  Weight:  1.00
 Name: hba_ansse        Len:   308  Check: 3300  Weight:  1.00
 Name: hba_colli        Len:   308  Check: 3816  Weight:  1.00
 Name: hbad_chlme       Len:   308  Check: 4571  Weight:  1.00
 Name: hbad_pasmo       Len:   308  Check: 6777  Weight:  1.00
 Name: hbaz_horse       Len:   308  Check: 7187  Weight:  1.00
 Name: hba4_salir       Len:   308  Check: 7329  Weight:  1.00
 Name: hbb_ornan        Len:   308  Check: 2667  Weight:  1.00
 Name: hbb_tacac        Len:   308  Check: 4356  Weight:  1.00
 Name: hbe_ponpy        Len:   308  Check: 3827  Weight:  1.00
 Name: hbb_speci        Len:   308  Check: 1556  Weight:  1.00
 Name: hbb_speto        Len:   308  Check: 2051  Weight:  1.00
 Name: hbb_equhe        Len:   308  Check: 3414  Weight:  1.00
 Name: hbb_sunmu        Len:   308  Check: 2927  Weight:  1.00
 Name: hbb_calar        Len:   308  Check: 3836  Weight:  1.00
 Name: hbb_mansp        Len:   308  Check: 4322  Weight:  1.00
 Name: hbb_ursma        Len:   308  Check: 4428  Weight:  1.00
 Name: hbb_rabit        Len:   308  Check: 4190  Weight:  1.00
 Name: hbb_tupgl        Len:   308  Check: 4185  Weight:  1.00


  [Part of this file has been deleted for brevity]

  lgb1_pea  ~~~~~~~~
lgb1_vicfa  ~~~~~~~~
 myg_escgi  ~~~~~~~~
 myg_horse  ~~~~~~~~
 myg_progu  ~~~~~~~~
 myg_saisc  ~~~~~~~~
 myg_lycpi  ~~~~~~~~
 myg_mouse  ~~~~~~~~
 myg_musan  ~~~~~~~~
 hba_ailme  ~~~~~~~~
 hba_prolo  ~~~~~~~~
 hba_pagla  ~~~~~~~~
 hba_macfa  ~~~~~~~~
 hba_macsi  ~~~~~~~~
 hba_ponpy  ~~~~~~~~
hba2_galcr  ~~~~~~~~
 hba_mesau  ~~~~~~~~
hba2_bosmu  ~~~~~~~~
 hba_erieu  ~~~~~~~~
 hba_frapo  ~~~~~~~~
 hba_phaco  ~~~~~~~~
 hba_trioc  ~~~~~~~~
 hba_ansse  ~~~~~~~~
 hba_colli  ~~~~~~~~
hbad_chlme  ~~~~~~~~
hbad_pasmo  ~~~~~~~~
hbaz_horse  ~~~~~~~~
hba4_salir  ~~~~~~~~
 hbb_ornan  ~~~~~~~~
 hbb_tacac  ~~~~~~~~
 hbe_ponpy  ~~~~~~~~
 hbb_speci  ~~~~~~~~
 hbb_speto  ~~~~~~~~
 hbb_equhe  ~~~~~~~~
 hbb_sunmu  ~~~~~~~~
 hbb_calar  ~~~~~~~~
 hbb_mansp  ~~~~~~~~
 hbb_ursma  ~~~~~~~~
 hbb_rabit  ~~~~~~~~
 hbb_tupgl  ~~~~~~~~
 hbb_triin  ~~~~~~~~
 hbb_colli  ~~~~~~~~
 hbb_larri  ~~~~~~~~
hbb1_varex  ~~~~~~~~
hbb2_xentr  ~~~~~~~~
hbbl_ranca  ~~~~~~~~
hbb2_tricr  ~~~~~~~~
glb2_mormr  ~~~~~~~~
glbz_chith  FGAVFAKM
hbf1_ureca  VAAMK~~~


Output file format

   ehmmbuild writes the profile HMM to the output file given by -hmmfile.
   By default this is a HMMER file in readable ASCII text; with -binary
   the HMM is written in HMMER binary format instead.

  Output files for usage example

  File: globin.hmm

HMMER2.0  [2.3.2]
NAME  globins50
LENG  143
ALPH  Amino
RF    no
CS    no
MAP   yes
COM   /homes/user/local/bin/hmmbuild -n globins50 --pbswitch 1000 --archpri 0.85
0000 --idlevel 0.620000 --swentry 0.500000 --swexit 0.500000 --wgsc -A -F globin
.hmm ../../data/hmmnew/globins50.msf
NSEQ  50
DATE  Mon Jul 15 12:00:00 2013
CKSUM 9858
XT      -8455     -4  -1000  -1000  -8455     -4  -8455     -4
NULT      -4  -8455
NULE     595  -1558     85    338   -294    453  -1158    197    249    902  -10
85   -142    -21   -313     45    531    201    384  -1998   -644
HMM        A      C      D      E      F      G      H      I      K      L
 M      N      P      Q      R      S      T      V      W      Y
         m->m   m->i   m->d   i->m   i->i   d->m   d->d   b->m   m->e
         -450      *  -1900
     1    591  -1587    159   1351  -1874   -201    151  -1600    998  -1591   -
693    389  -1272    595     42    -31     27   -693  -1797  -1134    14
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378   -450      *
     2   -926  -2616   2221   2269  -2845  -1178   -325  -2678   -300  -2596  -1
810    220  -1592    939   -974   -671   -939  -2204  -2785  -1925    15
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
     3   -638  -1715   -680    497  -2043  -1540     23  -1671   2380  -1641   -
840   -222  -1595    437   1040   -564   -523  -1363   2124  -1313    16
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
     4    829  -1571    -37    660  -1856   -873    152  -1578    894  -1573   -
678    769  -1273   1284     58    224    447  -1175  -1782  -1125    17
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
     5    369   -433   -475    286   -974  -1312    -19   -412    664    398
406   1030  -1394    388   -214   -261     85   -166  -1227   -725    18
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
     6  -1291   -884  -3696  -3261  -1137  -3425  -2802   2322  -3066    111
 19  -3028  -3275  -2855  -3100  -2670  -1269   2738  -2450  -2062    19
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
     7    157   -413   -236    316  -1387  -1231     89   -863   1084   -431   -
348    910  -1319    635    297     15    704   -483  -1497   -922    20
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
     8    770  -1431    -43    459  -1751   -340     78  -1449    440  -1497   -
631    866  -1302    825    -51    953    364  -1076  -1750  -1121    21
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
     9    420   -186  -2172  -1577      8  -1818   -694   1477  -1281    760
614  -1299  -1867  -1001  -1262   -189    -12   1401   -722   -364    22
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
    10   -961   -879  -2277  -1821   1366  -2213   -204   -399  -1500   -130
-39  -1427  -2266  -1186  -1511   -159   -913   -367   4721   1177    23
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
    11    -48  -1782    809    844  -2073   1456      8  -1811    315  -1803   -
932    180  -1365    921   -218    173   -115  -1399  -2018  -1327    24
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -68  -6528  -4832   -894  -1115   -701  -1378      *      *


  [Part of this file has been deleted for brevity]

     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
   128   -415  -1926   1575   1399  -2219  -1163     17  -1983    527  -1929  -1
039    341  -1367   1597   -212    257   -222  -1536  -2109  -1387   144
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
   129   -529  -1434   -629   -143  -1926   -626   -171  -1460   2679  -1597   -
839   -309  -1599    207    317   -530   -510   -130  -1840  -1369   145
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
   130    811   -397  -2389  -1807   1883  -2039   -907    594  -1512   1077
687  -1532  -2065  -1201  -1483  -1125   -465   1067   -843   -472   146
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
   131   -241   -102  -2327  -1710    724  -1767   -616    650  -1363   1074   1
765   -718  -1809  -1026  -1252   -842   -181   1331   -541    695   147
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
   132    723     95    385    823  -1820  -1168    167  -1540    875  -1362   -
644    320  -1261    810    246    693    -67  -1141  -1753  -1098   148
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
   133    551   -430  -1049   -481   -442    469   -241    465   -313    133
947   -411  -1543    197   -587   -146    202    522   -843   -429   149
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
   134  -1086   -777  -3351  -2800    816  -2898  -1861   1501  -2515   1149
586  -2483  -2775  -2108  -2400  -2046  -1030   2380  -1511  -1216   150
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
   135   1393   1409   -876   -345   -997   -525   -315   -590   -198   -847   -
109   -420  -1441    -97    412    766   -130    139  -1306   -858   151
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
   136     98  -1299     36    365  -1495  -1211   1241   -404    523   -952   -
426   1174  -1303    511    -18    347    882   -853  -1566   -970   152
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
   137   1308   -787    564   -132   -966  -1332   -203   -362    -49   -395
-57   -305  -1481     49   -437   -190   -182   1020  -1282   -802   153
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
   138  -1746  -1358  -3897  -3341   -216  -3621  -2478   1774  -3040   2442   1
157  -3189  -3229  -2422  -2853  -2824  -1659    392  -1720  -1647   154
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
   139   1176  -1289   -179    534  -1606   -607     34  -1278    734  -1372   -
534     44  -1325    433    -89    521    826   -941  -1666  -1072   155
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6528  -7571   -894  -1115   -701  -1378      *      *
   140    602  -1500   -135    850  -1753  -1214   1951  -1452    838  -1484
431    118  -1306    555    347    489   -153  -1085  -1723  -1092   156
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -22  -6602  -7644   -894  -1115   -701  -1378      *      *
   141    351  -1646   -165    546  -1976   -498     46  -1667   2193  -1662   -
798     35  -1405    476    311    -73   -306  -1287  -1859  -1254   157
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -23  -6561  -7603   -894  -1115   -701  -1378      *      *
   142  -1995  -1606  -3095  -2870   1739  -3015    -98  -1012  -2520   -730
655  -1990  -2962  -1884  -2326  -2167  -1915  -1128    548   4089   158
     -   -149   -500    233     43   -381    399    106   -626    210   -466   -
720    275    394     45     96    359    117   -369   -294   -249
     -    -25  -6455  -7497   -894  -1115   -701  -1378      *      *
   143   -253  -1373   -267    301   -911   -565   1956   -450   1188  -1330   -
497     33  -1352    502   1358   -205   -184   -941  -1604  -1026   159
     -      *      *      *      *      *      *      *      *      *      *
  *      *      *      *      *      *      *      *      *      *
     -      *      *      *      *      *      *      *      *      0
//

Data files

   None.

Notes

  1. Command-line arguments

   The following original HMMER options are not supported:
-h         : Use -help to get help information instead.
-f         : Use -strategy option instead.
-g         : Use -strategy option instead.
-s         : Use -strategy option instead.
-A         : Set append: "N" or append: "Y" for "hmmfile" in the ACD file instead.
-F         : Always set (an existing hmmfile will be overwritten).
-amino     : Sequence alignment type is specified via the ACD file.
-nucleic   : Sequence alignment type is specified via the ACD file.
-informat  : All common alignment formats are supported automatically.
-wblosum   : Use -weighting option to specify the sequence weighting algorithm.
-wgsc      : Use -weighting option to specify the sequence weighting algorithm.
-wme       : Use -weighting option to specify the sequence weighting algorithm.
-wnone     : Use -weighting option to specify the sequence weighting algorithm.
-wpb       : Use -weighting option to specify the sequence weighting algorithm.
-wvoronoi  : Use -weighting option to specify the sequence weighting algorithm.
-verbose   : Use -verbosity instead.

   The following additional options are provided:
-weighting : Sequence weighting algorithm.
-n         : Use -nhmm instead (-n causes problems for GUI developers)
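
   As a rough illustration of the differences listed above, the sketch
   below prints an original hmmbuild command line next to a roughly
   corresponding ehmmbuild one. Nothing is executed. The function name
   show_build_commands is hypothetical, and the pairing assumes default
   settings for every option not shown; -F appears on the hmmbuild side
   only because the wrapper always overwrites an existing hmmfile.

```shell
# Hypothetical helper: prints two command lines, runs neither of them.
show_build_commands() {
    alignfile=$1   # input alignment
    hmmfile=$2     # output HMM file
    name=$3        # name to give the model

    # Original HMMER 2: output HMM first, then the alignment; -n names the model.
    printf 'hmmbuild -F -n %s %s %s\n' "$name" "$hmmfile" "$alignfile"

    # EMBASSY wrapper: alignment first, then the output HMM; -nhmm replaces -n.
    printf 'ehmmbuild %s %s -nhmm %s\n' "$alignfile" "$hmmfile" "$name"
}

show_build_commands globins50.msf globin.hmm globins50
```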

  2. Installing EMBASSY HMMER

   The EMBASSY HMMER package contains "wrapper" applications providing an
   EMBOSS-style interface to the applications in the original HMMER
   package version 2.3.2 developed by Sean Eddy. Please read the file
   INSTALL in the EMBASSY HMMER package distribution for installation
   instructions.

  3. Installing original HMMER

   To use EMBASSY HMMER, you will first need to download and install the
   original HMMER package. Please read the file 00README in the
   original HMMER package distribution for installation instructions:
WWW home:       http://hmmer.wustl.edu/
Distribution:   ftp://ftp.genetics.wustl.edu/pub/eddy/hmmer/

  4. Setting up HMMER

   For the EMBASSY HMMER package to work, the directory containing the
   original HMMER executables *must* be in your path. For example, if your
   executables were installed to "/usr/local/hmmer/bin", then type (csh/tcsh):
set path=(/usr/local/hmmer/bin/ $path)
rehash
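
   The commands above use csh/tcsh syntax. For sh-compatible shells such
   as bash, a minimal sketch of the same setup might look like the
   following. The variable name HMMER_BIN is an assumption made for this
   example, and the directory is the one used in the example above.

```shell
# Assumed install location, taken from the example above; adjust as needed.
HMMER_BIN=/usr/local/hmmer/bin

# Prepend the directory only if it is not already on PATH.
case ":$PATH:" in
    *":$HMMER_BIN:"*) ;;
    *) PATH="$HMMER_BIN:$PATH"; export PATH ;;
esac

# Report whether the hmmbuild executable can now be found.
if command -v hmmbuild >/dev/null 2>&1; then
    echo "hmmbuild found: $(command -v hmmbuild)"
else
    echo "hmmbuild not found on PATH" >&2
fi
```

   No rehash step is needed in these shells. To make the change permanent,
   the PATH lines would normally go in a shell startup file.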

  5. Getting help

   Please read the Userguide.pdf distributed with the original HMMER and
   included in the EMBASSY HMMER distribution under the DOCS directory.
   The first 3 chapters (Introduction, Installation and Tutorial) are
   particularly useful.

   Please read note 1 (Command-line arguments) above for a description of
   the differences between the original and EMBASSY HMMER, particularly
   which application command line options are supported.

References

   None.

Warnings

  Types of input data

   hmmer v2.3.2, and therefore EMBASSY HMMER, is only recommended for use
   with protein sequences. If you provide a non-protein sequence you will
   be reprompted for a protein sequence. To accept nucleic acid sequences
   you must replace instances of < type: "protein" > in the application
   ACD files with .

  Environment variables

   The original hmmer uses BLAST environment variables (below), if
   defined, to locate files. The EMBASSY HMMER does not.
BLASTDB   location of sequence databases to be searched
BLASTMAT  location of substitution matrices
HMMERDB   location of HMMs

  Alignment input

   The user must provide the full filename of an alignment for the
   "alignfile" ACD option, not an indirect reference to a set of
   sequences, e.g. a USA is NOT acceptable. This is because hmmbuild
   (which ehmmbuild wraps) requires an alignment and does not support
   USAs.
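
   A script that calls ehmmbuild could guard against this with a simple
   check before the call. The sketch below only tests that the argument
   names an existing regular file; it does not validate the alignment
   format. The function name require_alignment_file is hypothetical.

```shell
# Hypothetical helper: succeed only if the argument is an existing regular file.
require_alignment_file() {
    if [ ! -f "$1" ]; then
        echo "alignfile must be the name of an existing file: $1" >&2
        return 1
    fi
}

# Possible usage (not run here):
#   require_alignment_file globins50.msf && ehmmbuild globins50.msf globin.hmm
```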

Diagnostic Error Messages

   None.

Exit status

   It always exits with status 0.
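
   Since the exit status is documented as always 0, a calling script
   cannot use it to detect a failed build. One possible workaround,
   sketched below, is to inspect the output file instead. The check
   assumes that a complete HMM file is non-empty and ends with the "//"
   terminator line shown in the output example; the function name
   hmm_file_complete is hypothetical.

```shell
# Hypothetical helper: true if the file is non-empty and its last line is "//".
hmm_file_complete() {
    [ -s "$1" ] && [ "$(tail -n 1 "$1")" = "//" ]
}

# Possible usage (not run here):
#   ehmmbuild globins50.msf globin.hmm
#   hmm_file_complete globin.hmm || echo "globin.hmm looks incomplete" >&2
```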

Known bugs

   None.

See also

   Program name     Description
   ehmmalign        Align sequences to an HMM profile
   ehmmcalibrate    Calibrate HMM search statistics
   ehmmconvert      Convert between profile HMM file formats
   ehmmemit         Generate sequences from a profile HMM
   ehmmfetch        Retrieve an HMM from an HMM database
   ehmmindex        Create a binary SSI index for an HMM database
   ehmmpfam         Search one or more sequences against an HMM database
   ehmmsearch       Search a sequence database with a profile HMM
   libgen           Generate discriminating elements from alignments
   ohmmalign        Align sequences with an HMM
   ohmmbuild        Build HMM
   ohmmcalibrate    Calibrate a hidden Markov model
   ohmmconvert      Convert between HMM formats
   ohmmemit         Extract HMM sequences
   ohmmfetch        Extract HMM from a database
   ohmmindex        Index an HMM database
   ohmmpfam         Align single sequence with an HMM
   ohmmsearch       Search sequence database with an HMM

Author(s)

   This program is an EMBOSS conversion of a program written by Sean Eddy
   as part of his HMMER package.

   Please report all bugs to the EMBOSS bug team
   (emboss-bug (c) emboss.open-bio.org) not to the original author.

   Jon Ison
   European Bioinformatics Institute, Wellcome Trust Genome Campus,
   Hinxton, Cambridge CB10 1SD, UK

   This program is an EMBASSY wrapper to a program written by Sean Eddy as
   part of his hmmer package.

   Please report any bugs to the EMBOSS bug team in the first instance,
   not to Sean Eddy.

History

Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.

Comments

   None.
