Protein Tertiary Structures: Prediction from Amino Acid Sequences

Protein tertiary structures contain key information for understanding the relationship between protein amino acid sequences and their biological functions. A large collection of computational algorithms has been developed to predict protein tertiary structures from their amino acid sequences.

Keywords: protein structure prediction; protein folding; homology modelling; threading; ab initio prediction; hidden Markov model; structural genomics

Figure 1. Procedure for predicting a protein structure from its amino acid sequence.
Figure 2. Schematic view of ab initio prediction methods (revised from Lin, 1996).
 References
    Altschul SF, Gish W, Miller W, Myers EW and Lipman DJ (1990) Basic local alignment search tool. Journal of Molecular Biology 215: 403–410.
    Altschul SF, Madden TL, Schaffer AA et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 25: 3389–3402.
    Bowie JU, Luthy R and Eisenberg D (1991) A method to identify protein sequences that fold into a known three-dimensional structure. Science 253: 164–170.
    Bryant SH and Lawrence CE (1993) An empirical energy function for threading protein sequence through the folding motif. Proteins 16: 92–112.
    Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S and Karplus M (1983) CHARMM: a program for macromolecular energy, minimization, and dynamics calculations. Journal of Computational Chemistry 4: 187–217.
    Dayhoff MO, Schwartz RM and Orcutt BC (1978) A model of evolutionary change in proteins. In: Dayhoff MO (ed.) Atlas of Protein Sequence and Structure, vol. 5, supplement 3, pp. 345–352. Washington, DC: National Biomedical Research Foundation.
    Di Francesco V, Garnier J and Munson PJ (1997a) Protein topology recognition from secondary structure sequences: application of the hidden Markov models to the alpha class proteins. Journal of Molecular Biology 267: 446–463.
    Di Francesco V, Geetha V, Garnier J and Munson PJ (1997b) Fold recognition using predicted secondary structure sequences and hidden Markov models of protein folds. Proteins (supplement 1): 123–128.
    Duan Y and Kollman PA (1998) Pathways to a protein folding intermediate observed in a 1-microsecond simulation in aqueous solution. Science 282: 740–744.
    Dunbrack RL Jr (1999) Comparative modeling of CASP3 targets using PSI-BLAST and SCWRL. Proteins (supplement 3): 81–87.
    Durbin R, Eddy S, Krogh A and Mitchison G (1998) Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge: Cambridge University Press.
    Godzik A and Skolnick J (1992) Sequence–structure matching in globular proteins: application to supersecondary and tertiary structure determination. Proceedings of the National Academy of Sciences of the USA 89: 12098–12102.
    Henikoff S and Henikoff JG (1992) Amino acid substitution matrices from protein blocks. Proceedings of the National Academy of Sciences of the USA 89: 10915–10919.
    Higgins DG and Sharp PM (1989) CLUSTAL: a package for performing multiple sequence alignments on a microcomputer. Gene 73: 237–244.
    Jones DT, Taylor WR and Thornton JM (1992) A new approach to protein fold recognition. Nature 358: 86–89.
    Leach AR (1996) Molecular Modelling: Principles and Applications. Essex: Addison Wesley Longman.
    Lin D (1996) Knowledge-based Protein Fold and Folding Study. PhD thesis, Peking University, p. 76.
    Luthy R, Bowie JU and Eisenberg D (1992) Assessment of protein models with three-dimensional profiles. Nature 356: 83–85.
    Madej T, Gibrat JF and Bryant SH (1995) Threading a database of protein cores. Proteins 23: 356–369.
    Moult J, Hubbard T, Fidelis K and Pedersen JT (1999) Critical assessment of methods of protein structure prediction (CASP): round III. Proteins (supplement 3): 2–6.
    Needleman SB and Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequences of two proteins. Journal of Molecular Biology 48: 443–453.
    Ngo JT and Marks J (1992) Computational complexity of a problem in molecular structure prediction. Protein Engineering 5: 313–321.
    Orengo CA, Bray JE, Hubbard T, LoConte L and Sillitoe I (1999) Analysis and assessment of ab initio three-dimensional prediction, secondary structure, and contacts prediction. Proteins (supplement 3): 149–170.
    Ortiz AR, Kolinski A, Rotkiewicz P, Ilkowski B and Skolnick J (1999) Ab initio folding of proteins using restraints derived from evolutionary information. Proteins (supplement 3): 177–185.
    Pearlman DA, Case DA, Caldwell JW et al. (1995) AMBER, a package of computer programs for applying molecular mechanics, normal mode analysis, molecular dynamics and free energy calculations to simulate the structural and energetic properties of molecules. Computer Physics Communications 91: 1–41.
    Pearson WR and Lipman DJ (1988) Improved tools for biological sequence comparison. Proceedings of the National Academy of Sciences of the USA 85: 2444–2448.
    Pearson WR (1990) Rapid and sensitive sequence comparison with FASTP and FASTA. Methods in Enzymology 183: 63–98.
    Pedersen JT and Moult J (1997) Ab initio protein folding simulations with genetic algorithms: simulations on the complete sequence of small proteins. Proteins (supplement 1): 179–184.
    Rost B and Sander C (1994) Combining evolutionary information and neural networks to predict protein secondary structure. Proteins 19: 55–72.
    Sanchez R, Pieper U, Melo F et al. (2000) Protein structure modeling for structural genomics. Nature Structural Biology 7 (supplement): 986–990.
    Simons KT, Bonneau R, Ruczinski I and Baker D (1999) Ab initio protein structure prediction of CASP III targets using Rosetta. Proteins (supplement 3): 171–176.
    Sippl MJ (1990) Calculation of conformational ensembles from potentials of mean force. An approach to the knowledge-based prediction of local structures in globular proteins. Journal of Molecular Biology 213: 859–883.
    Smith TF and Waterman MS (1981) Identification of common molecular subsequences. Journal of Molecular Biology 147: 195–197.
    van Gunsteren WF and Berendsen HJC (1990) Computer simulation of molecular dynamics: methodology, applications and perspectives in chemistry. Angewandte Chemie. International Edition in English 29: 992–1023.
    Wang ZX (1998) A re-estimation for the total numbers of protein folds and superfamilies. Protein Engineering 11: 621–626.
    Zhang H (1999) A new hybrid Monte Carlo algorithm for protein potential function test and structure refinement. Proteins 34: 464–471.
    Zhang H, Lai L, Wang L, Han Y and Tang Y (1997) A fast and efficient program for modeling protein loops. Biopolymers 41: 61–72.
 Further Reading
    Leach AR (1996) Molecular Modelling: Principles and Applications. Essex: Addison Wesley Longman.
    Eisenhaber F, Persson B and Argos P (1995) Protein structure prediction: recognition of primary, secondary, and tertiary structural features from amino acid sequence. Critical Reviews in Biochemistry and Molecular Biology 30: 1–94.
How to Cite
Zhang, Hongyu (Apr 2002) Protein Tertiary Structures: Prediction from Amino Acid Sequences. In: eLS. John Wiley & Sons Ltd, Chichester. http://www.els.net [doi: 10.1038/npg.els.0003040]