Mutation Rate of Non‐CpG DNA


Base substitutions (mutations) change the informational content of deoxyribonucleic acid (DNA). Therefore, understanding the determinants of the base mutation rate is a fundamental problem in biology. Biochemically, mutations are a function of the fidelity of DNA replication, of DNA damage and of DNA repair, each of which can be affected by nucleotide composition and context. However, various downstream factors, including natural selection and chance (genetic drift), determine whether these mutations ultimately survive in a population. The result in mammals is two dramatically different mutation rates. The C of most CpG dinucleotides mutates approximately 10‐ to 50‐fold faster than a C in any other context or any other nucleotide (i.e. at non‐CpG sites). However, the non‐CpG mutation rate also varies among genomic regions. Here we address the difference between CpG and non‐CpG mutation rates and the factors that are correlated with the variance in non‐CpG rates. Predominant among these factors is CpG content.

Key concepts:

  • Evolutionary processes determine the extent to which the base substitutions (mutations), which result from biochemical errors inherent in the processes that replicate and maintain the integrity of the genome, contribute to the mutation rate experienced by a population.

  • These biochemical mutations are a function of the fidelity of DNA replication, the efficiency with which replication errors and DNA damage are repaired and whether DNA repair processes themselves produce errors in undamaged DNA.

Keywords: mutation; CpG; DNA‐repair; evolution; methylation

Figure 1.

Using copies of extinct L1 retrotransposon families to measure the divergence between chimpanzees and humans. (a) The left side shows the phylogenetic relationship between rhesus monkey (Macaca mulatta, M), chimpanzee (Pan troglodytes, P) and human (Homo sapiens, H). The right side lists the five primate‐specific L1 families (L1 Pa3–L1 Pa5), and the grey bars indicate the estimated time during which they were active in the common ancestor of these species. The double‐headed arrows (T3–T7) indicate the time between the peak activity of each family and the approximate time of the chimpanzee–human divergence. The double‐headed arrow (Tp−h) indicates the time between this divergence and the present. (b) The y‐axis indicates the percentage of CpG and TpG (CpA) at corresponding positions relative to the amounts in the full‐length reconstructed ancestral sequences of each family. Reproduced and modified with permission from Cold Spring Harbor Laboratory Press © 2009 (Walser et al., 2008).
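The quantity plotted in panel (b) can be illustrated with a minimal Python sketch. It assumes that each L1 copy has already been aligned, column by column, to the reconstructed ancestral sequence of its family, and that a CpG site is defined by the ancestral sequence. The function name `cpg_retention` and the toy sequences are hypothetical, and this is not necessarily the procedure used by Walser et al.

```python
def cpg_retention(ancestral, copy):
    """Percentage of ancestral CpG sites that are CpG, TpG, CpA or
    something else in an aligned descendant copy (illustrative only)."""
    if len(ancestral) != len(copy):
        raise ValueError("sequences must be aligned to the same length")
    anc, seq = ancestral.upper(), copy.upper()
    counts = {"CG": 0, "TG": 0, "CA": 0, "other": 0}
    total = 0
    for i in range(len(anc) - 1):
        if anc[i:i + 2] != "CG":
            continue  # only positions that were CpG in the ancestor
        pair = seq[i:i + 2]
        if any(base not in "ACGT" for base in pair):
            continue  # skip gaps and ambiguous bases
        total += 1
        counts[pair if pair in counts else "other"] += 1
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: 100.0 * n / total for key, n in counts.items()}


# Toy example: three ancestral CpGs; one retained, one now TpG, one now CpA.
result = cpg_retention("ACGTCGACGT", "ACGTTGACAT")
print(result)
```

TpG and CpA are tallied separately because a C to T transition at a CpG yields TpG on the strand being read and CpA when the transition occurred on the complementary strand.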

Figure 2.

The divergence of orthologous L1 inserts in chimpanzees and humans. (a) Left y‐axis – divergence of autosomal orthologues of different L1 families in chimpanzees and humans. Right y‐axes – the % recombination of the orthologue‐containing regions and the % C+G content of either the L1 orthologues or their flanking DNA. (b) The % mutations at non‐CpG sites in orthologues of various families as a function of the % mutations at CpG sites. Reproduced and modified with permission from Cold Spring Harbor Laboratory Press © 2009 (Walser et al., 2008).
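To make the comparison in panel (b) concrete, the following Python sketch computes the uncorrected divergence between two aligned orthologues separately for CpG and non‐CpG sites. It assumes that sites are classified by the ancestral family sequence and that the three sequences share one alignment. The function name `split_divergence` and the toy data are hypothetical, no correction for multiple substitutions is applied, and the published analysis may have defined sites differently.

```python
def split_divergence(ancestral, human, chimp):
    """Return (% divergence at CpG sites, % divergence at non-CpG sites)
    between two aligned orthologues (illustrative only)."""
    if not (len(ancestral) == len(human) == len(chimp)):
        raise ValueError("sequences must be aligned to the same length")
    anc = ancestral.upper()
    # Both bases of every ancestral CG dinucleotide count as CpG sites.
    cpg_sites = set()
    for i in range(len(anc) - 1):
        if anc[i:i + 2] == "CG":
            cpg_sites.update((i, i + 1))
    diffs = {"cpg": 0, "non_cpg": 0}
    compared = {"cpg": 0, "non_cpg": 0}
    for i, (h, c) in enumerate(zip(human.upper(), chimp.upper())):
        if h not in "ACGT" or c not in "ACGT":
            continue  # skip gaps and ambiguous bases
        key = "cpg" if i in cpg_sites else "non_cpg"
        compared[key] += 1
        if h != c:
            diffs[key] += 1

    def percent(key):
        return 100.0 * diffs[key] / compared[key] if compared[key] else 0.0

    return percent("cpg"), percent("non_cpg")


# Toy example: one difference at a CpG site, one at a non-CpG site.
cpg_div, non_cpg_div = split_divergence("ACGTACGTAA", "ATGTACGTAA", "ACGTACGTAT")
print(cpg_div, non_cpg_div)
```

Plotting the second value against the first for many orthologue pairs or families would give a scatter of the kind described in panel (b).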



References

Arndt PF and Hwa T (2005) Identification and measurement of neighbor‐dependent nucleotide substitution processes. Bioinformatics 21(10): 2322–2328.

Asthana S, Schmidt S and Sunyaev S (2005) A limited role for balancing selection. Trends in Genetics 21(1): 30–32.

Bejerano G, Pheasant M, Makunin I et al. (2004) Ultraconserved elements in the human genome. Science 304(5675): 1321–1325.

Bohossian HB, Skaletsky H and Page DC (2000) Unexpectedly similar rates of nucleotide substitution found in male and female hominids. Nature 406: 622–625.

Boissinot S, Chevret P and Furano AV (2000) L1 (LINE‐1) retrotransposon evolution and amplification in recent human history. Molecular Biology and Evolution 17(6): 915–928.

Brown TC and Jiricny J (1988) Different base/base mispairs are corrected with different efficiencies and specificities in monkey kidney cells. Cell 54(5): 705–711.

Bulmer M (1986) Neighboring base effects on substitution rates in pseudogenes. Molecular Biology and Evolution 3(4): 322–329.

Cantrell MA, Scott L, Brown CJ et al. (2008) Loss of LINE‐1 activity in the megabats. Genetics 178(1): 393–404.

Chimpanzee‐Sequencing‐Analysis‐Consortium (2005) Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437(7055): 69–87.

Cooper DM, Schimenti KJ and Schimenti JC (1998) Factors affecting ectopic gene conversion in mice. Mammalian Genome 9(5): 355–360.

Cooper DN and Youssoufian H (1988) The CpG dinucleotide and human genetic disease. Human Genetics 78(2): 151–155.

Coulondre C, Miller JH, Farabaugh PJ et al. (1978) Molecular basis of base substitution hotspots in Escherichia coli. Nature 274(5673): 775–780.

Duncan BK and Miller JH (1980) Mutagenic deamination of cytosine residues in DNA. Nature 287(5782): 560–561.

Duret L (2009) Mutation patterns in the human genome: more variable than expected. PLoS Biology 7(2): e1000028.

Ehrlich M and Wang RY (1981) 5‐Methylcytosine in eukaryotic DNA. Science 212(4501): 1350–1357.

Ellegren H (2007) Characteristics, causes and evolutionary consequences of male‐biased mutation. Proceedings Biological Sciences 274(1606): 1–10.

ENCODE‐Project‐Consortium (2007) Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447(7146): 799–816.

Gaffney DJ and Keightley PD (2005) The scale of mutational variation in the murid genome. Genome Research 15(8): 1086–1094.

Gaffney DJ and Keightley PD (2008) Effect of the assignment of ancestral CpG state on the estimation of nucleotide substitution rates in mammals. BMC Evolutionary Biology 8: 265.

Green P, Ewing B, Miller W et al. (2003) Transcription‐associated mutational asymmetry in mammalian evolution. Nature Genetics 33(4): 514–517.

Hanawalt PC and Spivak G (2008) Transcription‐coupled DNA repair: two decades of progress and surprises. Nature Reviews Molecular Cell Biology 9(12): 958–970.

Hardison RC, Roskin KM, Yang S et al. (2003) Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Research 13(1): 13–26.

Hellmann I, Prüfer K, Ji H et al. (2005) Why do human diversity levels vary at a megabase scale? Genome Research 15(9): 1222–1231.

Hendrich B, Hardeland U, Ng HH et al. (1999) The thymine glycosylase MBD4 can bind to the product of deamination at methylated CpG sites. Nature 401(6750): 301–304.

Hodgkinson A, Ladoukakis E and Eyre‐Walker A (2009) Cryptic variation in the human mutation rate. PLoS Biology 7(2): e1000027.

Huttley GA, Jakobsen IB, Wilson SR et al. (2000) How important is DNA replication for mutagenesis? Molecular Biology and Evolution 17(6): 929–937.

Huvet M, Nicolay S, Touchon M et al. (2007) Human gene organization driven by the coordination of replication and transcription. Genome Research 17(9): 1278–1285.

Hwang DG and Green P (2004) Bayesian Markov chain Monte Carlo sequence analysis reveals varying neutral substitution patterns in mammalian evolution. Proceedings of the National Academy of Sciences of the USA 101(39): 13994–14001.

Karro JE, Peifer M, Hardison RC et al. (2008) Exponential decay of GC content detected by strand‐symmetric substitution rates influences the evolution of isochore structure. Molecular Biology and Evolution 25(2): 362–374.

Ke S, Zhang XH and Chasin LA (2008) Positive selection acting on splicing motifs reflects compensatory evolution. Genome Research 18(4): 533–543.

Kehrer‐Sawatzki H and Cooper DN (2007) Understanding the recent evolution of the human genome: insights from human‐chimpanzee genome comparisons. Human Mutation 28(2): 99–130.

Keightley PD and Gaffney DJ (2003) Functional constraints and frequency of deleterious mutations in noncoding DNA of rodents. Proceedings of the National Academy of Sciences of the USA 100(23): 13402–13406.

Khaitovich P, Kelso J, Franz H et al. (2006) Functionality of Intergenic Transcription: An Evolutionary Comparison. PLoS Genetics 2(10): e171.

Krawczak M, Ball EV and Cooper DN (1998) Neighboring‐nucleotide effects on the rates of germ‐line single‐base‐pair substitution in human genes. American Journal of Human Genetics 63(2): 474–488.

Li WH, Yi S and Makova K (2002) Male‐driven evolution. Current Opinion in Genetics & Development 12(6): 650–656.

Makova KD and Li W‐H (2002) Strong male‐driven evolution of DNA sequences in humans and apes. Nature 416: 624–626.

Meunier J and Duret L (2004) Recombination drives the evolution of GC‐content in the human genome. Molecular Biology and Evolution 21(6): 984–990.

Millar CB, Guy J, Sansom OJ et al. (2002) Enhanced CpG mutability and tumorigenesis in MBD4‐deficient mice. Science 297(5580): 403–405.

Miyata T, Hayashida H, Kuma K et al. (1987) Male‐driven molecular evolution: a model and nucleotide sequence analysis. Cold Spring Harbor Symposia on Quantitative Biology 52: 863–867.

Mugal CF, von Grünberg HH and Peifer M (2009) Transcription‐induced mutational strand bias and its effect on substitution rates in human genes. Molecular Biology and Evolution 26(1): 131–142.

Myers S, Bottolo L, Freeman C et al. (2005) A fine‐scale map of recombination rates and hotspots across the human genome. Science 310(5746): 321–324.

Nachman MW and Crowell SL (2000) Estimate of the mutation rate per nucleotide in humans. Genetics 156(1): 297–304.

Patterson N, Richter DJ, Gnerre S et al. (2006) Genetic evidence for complex speciation of humans and chimpanzees. Nature 441(7097): 1103–1108.

Pheasant M and Mattick JS (2007) Raising the estimate of functional human sequences. Genome Research 17(9): 1245–1253.

Polak P and Arndt PF (2008) Transcription induces strand‐specific mutations at the 5′ end of human genes. Genome Research 18(8): 1216–1223.

Ponting CP and Lunter G (2006) Signatures of adaptive evolution within human non‐coding sequence. Human Molecular Genetics 15(Spec No 2): R170–R175.

Ray DA, Feschotte C, Pagan HJ et al. (2008) Multiple waves of recent DNA transposon activity in the bat, Myotis lucifugus. Genome Research 18(5): 717–728.

Robertson KD and Jones PA (1997) Dynamic interrelationships between DNA replication, methylation, and repair. American Journal of Human Genetics 61(6): 1220–1224.

Shen JC, Rideout WM 3rd and Jones PA (1994) The rate of hydrolytic deamination of 5‐methylcytosine in double‐stranded DNA. Nucleic Acids Research 22(6): 972–976.

Siepel A, Bejerano G, Pedersen JS et al. (2005) Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Research 15(8): 1034–1050.

Spencer CCA, Deloukas P, Hunt S et al. (2006) The Influence of Recombination on Human Genetic Diversity. PLoS Genetics 2(9): e148.

Sved J and Bird A (1990) The expected equilibrium of the CpG dinucleotide in vertebrate genomes under a mutation model. Proceedings of the National Academy of Sciences of the USA 87(12): 4692–4696.

Taylor J, Tyekucheva S, Zody M et al. (2006) Strong and weak male mutation bias at different sites in the primate genomes: insights from the human‐chimpanzee comparison. Molecular Biology and Evolution 23(3): 565–573.

Tian D, Wang Q, Zhang P et al. (2008) Single‐nucleotide mutation rate increases close to insertions/deletions in eukaryotes. Nature 455(7209): 105–108.

Tyekucheva S, Makova KD, Karro JE et al. (2008) Human‐macaque comparisons illuminate variation in neutral substitution rates. Genome Biology 9(4): R76.

Vogel F and Rathenberg R (1975) Spontaneous mutation in man. Advances in Human Genetics 5: 223–318.

Walser JC, Ponger L and Furano AV (2008) CpG dinucleotides and the mutation rate of non‐CpG DNA. Genome Research 18(9): 1403–1414.

Wiebauer K and Jiricny J (1989) In vitro correction of G.T mispairs to G.C pairs in nuclear extracts from human cells. Nature 339(6221): 234–236.

Further Reading

Furano AV (2000) The biological properties and evolutionary dynamics of mammalian LINE‐1 retrotransposons. Progress in Nucleic Acid Research and Molecular Biology 64: 255–294.

Goodman MF (2002) Error‐prone repair DNA polymerases in prokaryotes and eukaryotes. Annual Review of Biochemistry 71: 17–50.

IHGS‐Consortium (2001) Initial sequencing and analysis of the human genome. Nature 409(6822): 860–921.

Khan H, Smit A and Boissinot S (2006) Molecular evolution and tempo of amplification of human LINE‐1 retrotransposons since the origin of primates. Genome Research 16(1): 78–87.

Kimura M (1968) Evolutionary rate at the molecular level. Nature 217(5129): 624–626.

Kondrashov FA, Ogurtsov AY and Kondrashov AS (2006) Selection in favor of nucleotides G and C diversifies evolution rates and levels of polymorphism at mammalian synonymous sites. Journal of Theoretical Biology 240(4): 616–626.

Rattray AJ and Strathern JN (2003) Error‐prone DNA polymerases: when making a mistake is the only way to get ahead. Annual Review of Genetics 37: 31–66.

Walsh CP and Xu GL (2006) Cytosine methylation and DNA repair. Current Topics in Microbiology and Immunology 301: 283–315.

How to Cite

Furano, Anthony V, and Walser, Jean‐Claude (Dec 2009) Mutation Rate of Non‐CpG DNA. In: eLS. John Wiley & Sons Ltd, Chichester. [doi: 10.1002/9780470015902.a0021740]