Protein structural alignment

Protein structural alignment is a form of alignment that tries to establish equivalences between two or more protein structures based on their fold. In contrast to simple structural superposition, where at least some equivalent residues of the two structures are known, structural alignment requires no a priori knowledge of equivalent positions. Structural alignment is a valuable tool for comparing proteins in the so-called "twilight zone" and "midnight zone" of homology, where relationships between proteins cannot be detected by sequence alignment methods. The method can therefore be used to establish evolutionary relationships between proteins that share little or no common primary structure. This is especially important in light of structural genomics and proteomics projects. The result of a structural alignment of two proteins is a superposition of their atomic coordinate sets with a minimal root mean square deviation (RMSD) between the two structures.


Visualization of Structural Alignment

How similar are two immunoglobulin structures? Any of the available structural alignment algorithms (see Packages) can be used to superimpose the two protein structures.

From a structural alignment, you can extract the percent of structural identity (PSI), the structurally implied sequence alignment, the root mean square deviation (RMSD), and a score for the alignment.

The PSI is calculated by normalizing the number of aligned residues by the length of the shortest structure (N / norm), where N is the number of corresponding residues that lie within a Cartesian distance of 4 Å of each other and "norm" is the normalization factor. In the immunoglobulin example, the number of aligned residues within 4 Å is 57 and the norm of the set is 83; the PSI is therefore 68.67%.
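The PSI arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name percent_structural_identity and its arguments are hypothetical, and treating "within 4 Å" as "less than or equal to 4 Å" is an assumption.

```python
def percent_structural_identity(distances, norm, cutoff=4.0):
    """Return the PSI in percent.

    distances: distance in Å for each pair of corresponding residues
               after superposition (hypothetical input format).
    norm:      normalization factor, e.g. the length of the shortest structure.
    cutoff:    distance threshold in Å (assumed inclusive).
    """
    if norm <= 0:
        raise ValueError("norm must be positive")
    # N = number of corresponding residue pairs within the cutoff
    n_close = sum(1 for d in distances if d <= cutoff)
    return 100.0 * n_close / norm


# Numbers from the immunoglobulin example in the text: N = 57, norm = 83.
example_distances = [1.0] * 57 + [6.0] * 10  # made-up distances; 57 lie within 4 Å
print(f"{percent_structural_identity(example_distances, 83):.2f}%")
```

Only the counts 57 and 83 come from the example; the individual distances are invented to reproduce those counts.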

Structurally implied sequence alignment is a one dimensional representation of the structural alignment.
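As a rough illustration of what such a one-dimensional representation can look like, the sketch below turns a list of structurally equivalent residue pairs into two gapped sequence rows. The function name, the input format (index pairs in increasing order) and the example sequences are all assumptions made for this sketch.

```python
def implied_sequence_alignment(seq_a, seq_b, pairs):
    """Build two gapped strings from structurally equivalent residue pairs.

    pairs: list of (i, j) index pairs, strictly increasing in both i and j,
           meaning residue i of seq_a is equivalent to residue j of seq_b.
    """
    row_a, row_b = [], []
    i = j = 0
    for pi, pj in pairs:
        # Residues skipped in A are written opposite gaps in B.
        while i < pi:
            row_a.append(seq_a[i])
            row_b.append("-")
            i += 1
        # Residues skipped in B are written opposite gaps in A.
        while j < pj:
            row_a.append("-")
            row_b.append(seq_b[j])
            j += 1
        # The structurally equivalent pair shares one column.
        row_a.append(seq_a[pi])
        row_b.append(seq_b[pj])
        i, j = pi + 1, pj + 1
    # Trailing unaligned residues.
    while i < len(seq_a):
        row_a.append(seq_a[i])
        row_b.append("-")
        i += 1
    while j < len(seq_b):
        row_a.append("-")
        row_b.append(seq_b[j])
        j += 1
    return "".join(row_a), "".join(row_b)


# Made-up sequences and equivalences, for illustration only.
top, bottom = implied_sequence_alignment("ACD", "AGCD", [(0, 0), (1, 2), (2, 3)])
print(top)
print(bottom)
```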


RMSD is then calculated by using the distances between the corresponding residues in the alignment.
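A minimal sketch of this calculation follows, assuming the two structures have already been superimposed and that each is given as a list of (x, y, z) coordinates for the corresponding residues, in matching order. The function name rmsd is hypothetical, and the sketch does not search for the optimal superposition itself.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between corresponding points.

    coords_a, coords_b: equal-length lists of (x, y, z) tuples in Å,
    where coords_a[k] corresponds to coords_b[k].
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Sum of squared distances between corresponding residues.
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Made-up coordinates: the second residue pair is 2 Å apart, the first coincides.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(a, b))  # sqrt((0 + 4) / 2), i.e. the square root of 2
```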


To date there is no definitive algorithmic solution to protein structural alignment. The alignment problem has been shown to be NP-hard, and all current algorithms employ heuristic methods. Different algorithms may therefore not produce exactly the same results for the same alignment problem.

Representation of structures

Protein structures have to be represented in some coordinate-independent space to make them comparable. One possible representation is the so-called distance matrix, a two-dimensional matrix containing all pairwise distances between the Cα atoms of the protein backbone. This can also be represented as a set of overlapping sub-matrices spanning only fragments of the protein. Another possible representation is the reduction of the protein structure to the level of secondary structure elements (SSEs), which can be represented as vectors and can carry additional information about relationships to other SSEs, as well as about certain biophysical properties.
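The following sketch shows why a distance matrix is coordinate independent: the matrix is unchanged when the same points are rotated and translated. The function name distance_matrix and the three Cα positions are made up for illustration.

```python
import math


def distance_matrix(ca_coords):
    """All pairwise distances between Cα positions given as (x, y, z) tuples."""
    n = len(ca_coords)
    return [[math.dist(ca_coords[i], ca_coords[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical Cα coordinates in Å.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
# The same points rotated by 90 degrees about the z axis and then shifted.
moved = [(-y + 10.0, x - 5.0, z + 2.0) for x, y, z in coords]

original = distance_matrix(coords)
transformed = distance_matrix(moved)

# The two matrices agree up to floating-point rounding.
same = all(abs(original[i][j] - transformed[i][j]) < 1e-9
           for i in range(3) for j in range(3))
print(same)
```

Because only internal distances are stored, two structures can be compared without first placing them in a common frame of reference.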

Comparison and Optimization

In the case of the distance matrix representation, the comparison algorithm breaks the distance matrices down into regions of overlap, which are then recombined if there is overlap between adjacent fragments, thereby extending the alignment. If the SSE representation is chosen, there are several possibilities. One can search for the maximum ensemble of equivalent SSE pairs using algorithms that solve the maximum clique problem from graph theory. Other approaches employ dynamic programming or combinatorial simulated annealing.
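To make the dynamic programming option concrete, here is a generic sketch in the style of global sequence alignment. It assumes that a similarity score between every element of structure A and every element of structure B has already been computed by some means, and it uses a simple linear gap penalty. The function name dp_align, the scoring values and the gap penalty are illustrative assumptions, not the method of any particular package listed on this page.

```python
def dp_align(score, gap=-1.0):
    """Best order-preserving pairing of rows (A) and columns (B) of `score`.

    score[i][j]: similarity of element i in A and element j in B (assumed given).
    gap:         penalty for leaving an element unpaired (assumed linear).
    Returns (total_score, list of (i, j) pairs).
    """
    n = len(score)
    m = len(score[0]) if n else 0
    # best[i][j] = best score for the first i elements of A and first j of B.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j - 1] + score[i - 1][j - 1],  # pair i with j
                             best[i - 1][j] + gap,                      # skip element of A
                             best[i][j - 1] + gap)                      # skip element of B
    # Trace back from the bottom-right corner to recover the pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + score[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs


# Made-up scores: A has 3 elements, B has 4; B's second element has no partner.
scores = [[2, -1, -1, -1],
          [-1, -1, 2, -1],
          [-1, -1, -1, 2]]
total, pairs = dp_align(scores)
print(total, pairs)  # 5.0 [(0, 0), (1, 2), (2, 3)]
```

Dynamic programming of this kind only finds pairings that preserve chain order, which is one reason other approaches such as clique search or simulated annealing are also used.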


Packages

Several tools for pairwise and multiple structural alignments are available on the web:

NAME | Description | Class | Type | Link | Author | Year
MAMMOTH | MAtching Molecular Models Obtained from Theory |  | Pair | server | AR. Ortiz | 2002
CE/CE-MC | Combinatorial Extension -- Monte Carlo |  | Multi | server | I. Shindyalov | 2000
DaliLite | Distance Matrix Alignment | C-Map | Pair | server | L. Holm | 1993
VAST | Vector Alignment Search Tool | SSE | Pair | server | S. Bryant | 1996
PrISM | Protein Informatics Systems for Modeling | SSE | Multi | server | B. Honig | 2000
SSAP | Sequential Structure Alignment Program | SSE | Multi | server | C. Orengo | 1989
SARF2 | Spatial Arrangements of Backbone Fragments | SSE | Pair | server | D. Fischer | 1996
KENOBI/K2 | NA | SSE | Pair | server | Z. Weng | 2000
STAMP | STructural Alignment of Multiple Proteins | Sequence | Pair | server | R. Russell and G. Barton | 1992
MASS | Multiple Alignment by Secondary Structure | SSE | Multi | server | R. Nussinov | 2003
MALECON | NA | Geometry | Multi | NA | S. Wodak | 2004
MultiProt | NA | Geometry | Multi | server | R. Nussinov | 2004
SCALI | Structural Core ALIgnment of proteins | Sequence | Pair | server | C. Bystroff | 2004
DEJAVU | NA | SSE | Pair | server | GJ. Kleywegt | 1997
SSM | Secondary Structure Matching | SSE/Cα | Pair & Multi | server | E. Krissinel | 2003
SHEBA | Structural Homology by Environment-Based Alignment | Sequence | Pair | server | B. Lee | 2000
LGA | Local-Global Alignment | Sequence | Pair | server | A. Zemla | 2003
POSA | Partial Order Structure Alignment |  | Multi | server | Y. Ye and A. Godzik | 2005
FATCAT | Flexible Structure AlignmenT by Chaining Aligned Fragment Pairs Allowing Twists |  | Pair | server | Y. Ye and A. Godzik | 2004
Matras | MArkovian TRAnsition of protein Structure | Cα & SSE | Pair | NA | K. Nishikawa | 2000
MAMMOTH-mult | MAMMOTH-based multiple structure alignment |  | Multi | server | D. Lupyan | 2005
Protein3Dfit | NA | C-Map | Pair | server | D. Schomburg | 1994
PRIDE | PRobability of IDEntity |  | Pair | server | S. Pongor | 2002
FAST | FAST Alignment and Search Tool |  | Pair | server | J. Zhu | 2004
C-BOP | Coordinate-Based Organization of Proteins | N/A | Multi | server | E. Sandelin | 2005
ProFit | Protein least-squares Fitting |  | Multi | server | ACR. Martin | 1996
TOPOFIT | Alignment as a superimposition of common volumes at a topomax point |  | Pair | server | V. A. Ilyin | 2004
MUSTANG | MUltiple STructural AligNment AlGorithm | Cα & C-Map | Multi | download | A.S. Konagurthu et al. | 2005
URMS | Unit-vector RMSD |  | Pair | server | K. Kedem | 2003

Key map:

  • Cα -- Backbone Atom (Cα) Alignment;
  • SSE -- Secondary Structure Elements Alignment;
  • Pair -- Pairwise Alignment (2 structures *only*);
  • Multi -- Multiple Structure Alignment (MStA);
  • C-Map -- Contact Map

See also

  • Sequence alignment
  • Structural Classification of Proteins


References

  • Bourne, P.E. & Shindyalov, I.N. (2003): Structure Comparison and Alignment. In: Bourne, P.E., Weissig, H. (Eds): Structural Bioinformatics. Hoboken NJ: Wiley-Liss. ISBN 0-471-20200-2
  • Olmea O, Straus CE, Ortiz AR. (2002) MAMMOTH (matching molecular models obtained from theory): an automated method for model comparison. Protein Sci 11,2606-21
  • Yuan X, Bystroff C. (2004) "Non-sequential Structure-based Alignments Reveal Topology-independent Core Packing Arrangements in Proteins", Bioinformatics. Nov 5, 2004
  • E. Krissinel and K. Henrick, Protein structure comparison in 3D based on secondary structure matching (SSM) followed by C-alpha alignment, scored by a new structural similarity function. In: A.J. Kungl and P.J. Kungl, Editors, Proceedings of the Fifth international Conference on Molecular Structural Biology, Vienna, September 3-7 (2003), p. 88.
  • E. Krissinel and K. Henrick (2004). Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Cryst. D60, 2256-2268
  • E. Krissinel and K. Henrick (2005). Multiple Alignment of Protein Structures in Three Dimensions. In: M.R. Berthold et.al. (Eds.): CompLife 2005, LNBI 3695, pp. 67-78. Springer-Verlag Berlin Heidelberg.
  • Jung, J. and Lee, B.: Protein structure alignment using environmental profiles. Protein Engineering. 13:535-543, 2000.
  • Zemla A., "LGA - a Method for Finding 3D Similarities in Protein Structures", Nucleic Acids Research, 2003, Vol. 31, No. 13, pp. 3370-3374.
  • Y. Ye, A. Godzik "Multiple flexible structure alignment using partial order graphs" Bioinformatics, 2005, Vol. 21, No. 10, pp. 2362-2369
  • T. Kawabata, K. Nishikawa "Protein structure comparison using the Markov transition model of evolution" Proteins; 41, 1, pp. 108-122
  • D. Lupyan, A. Leo-Macias, AR. Ortiz. "A new progressive-iterative algorithm for multiple structure alignment." Bioinformatics, 2005, Vol.xxx, Epub Jun 7th
  • U. Lessel, D. Schomburg. "Similarities between protein 3-D structures". Protein Engineering (1994), 7, 1175-1187
  • Valentin A. Ilyin, Alexej Abyzov, and Chesley M.Leslin. "Structural alignment of proteins by a novel TOPOFIT method, as a superimposition of common volumes at a topomax point." Protein Science (2004), 13:1865-1874.
  • Konagurthu, A.S., Whisstock, J.C., Stuckey, P.J., and Lesk, A.M. "MUSTANG: A MUltiple STructural AligNment AlGorithm", (2005) Proteins: Structure, Function, and Bioinformatics. (to appear)
  • Yona G, Kedem K. "The URMS-RMS hybrid algorithm for fast and sensitive local protein structure alignment." J Comput Biol. 2005;12(1):12-32.

The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.