Functional Constraint and Molecular Evolution


In addition to one or more specific functions, macromolecules have collective functions (e.g. Donnan equilibrium and aggregation pressure) and general functions (e.g. contribution to organism weight). Successful molecular evolution requires an appropriate balance between the constraints on these functions. The constraints arise from selective pressures acting at the level of the conventional phenotype (natural selection) and at the level of the genome phenotype (reprotypic selection). Genome‐wide constraints include fold pressure (the pressure for nucleic acid stem‐loop extrusion) and GC‐pressure (the pressure for a certain base composition). When these bring about within‐genome reprotypic selection (hybrid sterility), there is the potential for new species to emerge (speciation). Local constraints include protein pressure (the pressure to encode a protein) and purine‐loading pressure (the pressure for purine‐rich, messenger ribonucleic acid (mRNA)‐synonymous strands). As more pressures are identified, arguments for neutral evolution weaken.

Key Concepts:

  • Molecules have specific, collective and general functions.

  • Most macromolecules function by virtue of their higher ordered structure.

  • Nucleic acids have both structural and templating functions.

  • Each species achieves its own balance between the competing demands (constraints) of external and internal environments.

  • The organismal phenotype comprises the classical phenotype and the genome phenotype.

  • Natural selection operates on the classical phenotype.

  • Reprotypic selection operates on the genome phenotype.

  • By balancing natural and reprotypic selection mechanisms, the ‘hand of nature’ resolves conflicts between functions.

Keywords: conflict resolution; degenerate code; GC rule; neutralism; purine‐loading; speciation

Figure 1.

Distribution of purine‐loading among biological species. Purine‐loading of coding regions was calculated from codon usage tables for all species represented in the August 1999 release of the GenBank database by more than three genes or more than 2500 bases. The purine‐loading index (bases kb⁻¹) for a particular species was calculated as the sum of 1000(G − C)/N and 1000(A − T)/N, where G, C, A and T are the numbers of the individual bases, and N is the total number of bases, in the codon usage table. This measure of the purine‐loading of RNAs disregards 5′ and 3′ noncoding sequences, including poly(A) tails. The value for all human genes (excluding mitochondrial genes) is 42 bases kb⁻¹, meaning that, on average, there are 42 more purines than pyrimidines for every kilobase of coding sequence. The shoulder with negative purine‐loading values (i.e. pyrimidine‐loading) corresponds mainly to mitochondrial genes.
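
The index defined in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `purine_loading_index`, the dictionary layout of the codon usage table (codon mapped to its count) and the toy counts are hypothetical, not taken from the article or from GenBank.

```python
def purine_loading_index(codon_counts):
    """Purine-loading in bases per kilobase, computed from a codon usage table.

    codon_counts maps each codon (e.g. "GAA") to how often it occurs.
    This input layout is an assumption made for the sketch.
    """
    totals = {"A": 0, "C": 0, "G": 0, "T": 0}
    for codon, count in codon_counts.items():
        # Treat RNA codons (with U) the same as DNA codons (with T).
        for base in codon.upper().replace("U", "T"):
            if base in totals:
                totals[base] += count
    n = sum(totals.values())  # N: total number of bases in the table
    if n == 0:
        raise ValueError("codon usage table contains no A, C, G or T bases")
    # Sum of 1000(G - C)/N and 1000(A - T)/N, as in the caption.
    return (1000 * (totals["G"] - totals["C"]) / n
            + 1000 * (totals["A"] - totals["T"]) / n)


# Hypothetical toy table: purine-rich codons outnumber pyrimidine-rich ones.
toy_table = {"GAA": 10, "AAG": 5, "CTT": 5}
print(round(purine_loading_index(toy_table), 1))
```

A positive result indicates purine‐loading and a negative result indicates pyrimidine‐loading, matching the sign convention of the caption.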

Figure 2.

Szybalski's transcription direction rule evaluated as ‘Chargaff differences’ (deviations from Chargaff's second parity rule). Heavy horizontal arrows refer to the ‘top’ and ‘bottom’ strands of duplex DNA. Grey boxes refer to intergenic DNA. Green circles represent RNA polymerases with red arrows indicating the direction of transcription. In the case of leftward transcription the Chargaff difference for the top strand is in favour of pyrimidines (Y). In the case of rightward transcription the Chargaff difference for the top strand is in favour of purines (R). RNAs tend to locate purines as clusters in the loop regions of their secondary structure.
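
The sign change described in this caption can be illustrated with a minimal sketch. It assumes that a Chargaff difference is expressed as purines minus pyrimidines, as a percentage of the bases in one strand. The caption does not fix a unit, so that choice, the function names and the short sequence are all hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def chargaff_difference(strand):
    """Purines minus pyrimidines (R - Y) as a percentage of the strand's bases."""
    s = strand.upper()
    purines = s.count("A") + s.count("G")
    pyrimidines = s.count("C") + s.count("T")
    total = purines + pyrimidines
    if total == 0:
        raise ValueError("strand contains no A, C, G or T bases")
    return 100 * (purines - pyrimidines) / total


# Hypothetical purine-rich, mRNA-synonymous sequence.
mrna_synonymous = "GAAGAGGAAACT"

# Rightward transcription: the top strand is the mRNA-synonymous strand.
top_rightward = mrna_synonymous
# Leftward transcription: the top strand is the template, so it reads as
# the reverse complement of the mRNA-synonymous strand.
top_leftward = reverse_complement(mrna_synonymous)

print(chargaff_difference(top_rightward) > 0)  # excess of purines (R)
print(chargaff_difference(top_leftward) < 0)   # excess of pyrimidines (Y)
```

Because complementing a strand swaps every purine for a pyrimidine, the two differences have equal magnitude and opposite sign.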

Figure 3.

Summary of potentially conflicting evolutionary pressures as manifest at the level of mRNA (purple line with arrowhead). Boxes indicate the domains over which different pressures operate. (1) (G+C)% pressure (‘GC pressure’) acting primarily at the genomic level, and secondarily affecting mRNA base composition. (2) Fold (stem–loop) pressure acting primarily at the genomic level and secondarily affecting mRNA base order and composition. (3) Purine‐loading pressure acting primarily at the cytoplasmic level to enrich loops with purines. (4) Protein‐encoding pressure derived from environmental interactions (natural selection) relating to specific, collective and general protein functions, which result in base changes in the protein‐encoding part of the mRNA. (5) Regulatory pressures (small lilac boxes) acting primarily at the cytoplasmic level, which result in base changes mainly in the 5′ and 3′ noncoding regions.




Further Reading

Bernstein C and Bernstein H (1991) Aging, Sex and DNA Repair. San Diego, CA: Academic Press.

Cock AG and Forsdyke DR (2008) Treasure Your Exceptions. The Science and Life of William Bateson. New York: Springer.

Fersht A (1998) Structure and Mechanism in Protein Science. San Francisco, CA: WH Freeman.

Forsdyke DR (2001) The Origin of Species, Revisited. A Victorian who Anticipated Modern Developments in Darwin's Theory. Montreal: McGill‐Queen's University Press.

Forsdyke DR (2011) Evolutionary Bioinformatics, 2nd edn. New York: Springer.

Forsdyke DR (2012) Evolution Academy [∼forsdyke/videolectures.htm].

Forsdyke DR and Mortimer JR (2000) Chargaff's legacy. Gene 261: 127–137.

Fulton AB (1982) How crowded is the cytoplasm? Cell 30: 345–347.

Gould SJ (2002) The Structure of Evolutionary Theory. Cambridge: Harvard University Press.

Hawley RS and Arbel T (1993) Yeast genetics and the fall of the classical view of meiosis. Cell 72: 301–303.

Hitchcock DI (1924) Proteins and the Donnan equilibrium. Physiological Reviews 4: 505–531.

Kimura M (1983) The Neutral Theory of Molecular Evolution. Cambridge: Cambridge University Press.

Lauffer MA (1975) Entropy‐Driven Processes in Biology. New York: Springer.

Wada A, Suyama A and Hanai R (1991) Phenomenological theory of GC/AT pressure on DNA base composition. Journal of Molecular Evolution 32: 374–378.

Williams GC (1966) Adaptation and Natural Selection. Princeton, NJ: Princeton University Press.


How to Cite
Forsdyke, Donald R (Jul 2012) Functional Constraint and Molecular Evolution. In: eLS. John Wiley & Sons Ltd, Chichester. [doi: 10.1002/9780470015902.a0001804.pub3]