The Evolution of Centromeric DNA Sequences


For most eukaryotic species, the centromere is composed of millions of base pairs of tandemly repeated deoxyribonucleic acid (DNA) sequences. Centromere function is broadly conserved across eukaryotic phyla, yet centromere DNA presents several unique conundrums for biologists, further complicated by the challenges of studying highly repeated regions of complex genomes. Contrary to the expectation that centromeric sequences would be constrained to maintain centromere function across species, these sequences are among the most rapidly evolving in any given genome. This discordance between functional constraint and sequence divergence, termed the ‘centromere paradox’, appears to defy basic laws of Mendelian inheritance. Multiple genetic mechanisms have been proposed to explain centromeric DNA complexity and rapid evolutionary divergence, taking into consideration the unique chromosome architecture and dynamics of the centromere during both mitosis and meiosis. Stochastic processes affecting sequence evolution and the selective constraint necessary for centromere protein recognition are balanced in an ongoing conflict that ultimately manifests as rapid centromere DNA evolution.

Key Concepts:

  • Loss of centromere function does not equate to loss of centromere sequence. Conversely, centromere sequences do not strictly demarcate a functional centromere.

  • The constitution of centromeric satellite DNAs across species differs not only in sequence, but also in the repeat unit length, abundance of the repeat unit and the complexity of the genomic structure of multiple repeat units.

  • The more commonly found regional centromere is typified by a distinct, linear organisation of a given satellite repeat unit into high‐copy tandem repeats of that unit that can span megabases of DNA.

  • The high homology of the higher‐order satellite repeats within a centromere is consistent with their function to effectively bind centromere proteins, wherein selection may favour homogeneity to retain centromere function.

  • Initial forms of an active centromere do not necessarily require higher‐order repeat (HOR) satellite arrays; rather, such arrays arise over evolutionary timescales following stable establishment and inheritance of a new centromere.

  • The process of molecular drive, involving concerted evolution and gene conversion, results in the homogenisation and fixation of a given repeat variant and may lead to the convergent and concerted evolution of satellites within one species.

  • Genetic conflict and/or meiotic drive may be responsible for the different centromere satellite sequence suites found between species.

  • Centromere drive may be responsible for the rapid divergence of functional centromeric sequences between species.

  • The coding sequences for two centromere proteins, CENP‐A (centromere‐specific histone H3) and CENP‐C (a centromere‐specific DNA binding protein), evolve at rates faster than expected for either neutral (unconstrained) or purifying (selectively constrained) evolution in many species.

  • Mobile elements can impact the rate at which satellites are derived, expand, contract and homogenise within a species.

Keywords: centromere inheritance; mutation; molecular drive; genetic conflict; epigenetics; centromere; retroelement; satellite

Figure 1.

Overview of the gross organisation of centromeric and pericentromeric sequences, focusing on human as a reference model. The centromere is identified as the primary constriction on the chromosome flanked by pericentric heterochromatin (middle). A closer look at the centromere shows that there are two types of satellite repeats found at and near human centromeres: monomeric α‐satellite repeats (top) and higher‐order α‐satellite arrays consisting of several monomers repeated as a multimeric unit (bottom). Monomeric repeats are found in ‘domains’ in the pericentric regions and display less homogeneity in sequence identity than higher‐order α‐satellite arrays at the centromere. Frequent insertions of mobile elements disrupt monomeric α‐satellite domains whereas higher‐order α‐satellite arrays may span megabases of DNA.

Figure 2.

Molecular drive model for centromere sequence homogenisation within a lineage. Arrays that exist on different chromosomes (purple and blue) may experience random mutation (yellow). Within‐array conversion proceeds through concerted evolution whereas inter‐array conversion between chromosomes proceeds through conversion and nonhomologous exchange processes. These processes continue to produce both chromosome‐specific higher‐order repeat (HOR) arrays and arrays shared amongst many (or all) chromosomes.
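
The homogenising effect of conversion described in this caption can be illustrated with a small simulation. The sketch below is a toy model, not the published molecular drive model: it assumes a single array in which each repeat copy carries an integer variant label, unbiased conversion in which a randomly chosen donor copy overwrites a randomly chosen recipient, and an optional per-step mutation probability. The function name and all parameter values are hypothetical and chosen only for illustration; exchange between chromosomes is not modelled.

```python
import random


def simulate_conversion(n_copies=20, n_steps=20000, mutation_rate=0.0, seed=1):
    """Toy model of homogenisation within one tandem array.

    All parameters are hypothetical illustration values, not estimates
    taken from any centromere.
    """
    rng = random.Random(seed)
    # Start with every repeat copy carrying a distinct variant label.
    array = list(range(n_copies))
    next_label = n_copies
    for _ in range(n_steps):
        if rng.random() < mutation_rate:
            # Random mutation: one copy becomes a brand-new variant.
            array[rng.randrange(n_copies)] = next_label
            next_label += 1
        else:
            # Unbiased conversion: a donor copy overwrites a different
            # recipient copy chosen at random.
            donor, recipient = rng.sample(range(n_copies), 2)
            array[recipient] = array[donor]
    return array


if __name__ == "__main__":
    homogenised = simulate_conversion()
    print("distinct variants without mutation:", len(set(homogenised)))
    noisy = simulate_conversion(mutation_rate=0.05)
    print("distinct variants with mutation:", len(set(noisy)))
```

Without mutation, repeated conversion alone drives the array towards a single variant, which is the sense in which a repeat variant becomes fixed. With mutation switched on, new variants keep appearing and the array settles at a balance between diversification and homogenisation.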

Figure 3.

The library model for centromere sequence evolution presumes that an ancestor (left) carries many satellites that seed different arrays in diverse lineages at different rates. Subsequent expansion via molecular drive would lead to larger arrays that appear species‐specific (right).

Figure 4.

‘Centromere drive’ and conflict model of centromere evolution. The sex chromosomes provide an exaggerated example of centromere drive since the X and Y chromosomes do not recombine and the satellites populating each respective centromere are different (shown as different colours). In the initial phase, a centromere variant has a selective advantage by virtue of its ability to attract more microtubules and gain a favourable position in the asymmetric female meiosis (illustrated by more microtubules, green). This favoured segregation to the egg in females is otherwise known as meiotic drive. However, the same centromere variant results in unequal tension across the centromeres of the paired sex chromosomes in male meiosis and an increase in the rate of nondisjunction in males, resulting in sterility. Sterility effects in the male will provide strong selection for any allele that restores meiotic parity and male fertility. Repeated bouts of drive and suppression will be observed as positive selection among centromere or heterochromatin proteins associated with changes in satellite composition.
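
The first phase of this model, a transmission advantage in female meiosis opposed by a fertility cost in males, can be sketched as a simple frequency recursion. The code below is a deliberately simplified illustration and rests on several assumptions that are not part of the caption: a single autosomal locus rather than sex chromosomes, random mating, an infinite population, and a hypothetical fertility cost applied only to heterozygous males. The suppressor allele that restores meiotic parity is not modelled. The function and parameter names are illustrative.

```python
def drive_trajectory(k=0.7, male_cost=0.0, p0=0.01, generations=200):
    """Mean frequency of a driving centromere variant D in each generation.

    k         : share of eggs from Dd females that carry D (0.5 is Mendelian)
    male_cost : fractional fertility loss of Dd males (hypothetical, below 1)
    p0        : starting frequency of D in both eggs and sperm
    """
    pf = pm = p0  # frequency of D among eggs and among sperm
    history = [p0]
    for _ in range(generations):
        # Random union of eggs and sperm gives zygote genotype frequencies.
        freq_DD = pf * pm
        freq_Dd = pf * (1 - pm) + pm * (1 - pf)
        freq_dd = (1 - pf) * (1 - pm)
        # Females: heterozygotes transmit D to a fraction k of their eggs.
        pf = freq_DD + k * freq_Dd
        # Males: Mendelian transmission, but heterozygotes are less fertile,
        # so their gametes are down-weighted and the pool is renormalised.
        male_total = freq_DD + freq_Dd * (1 - male_cost) + freq_dd
        pm = (freq_DD + 0.5 * freq_Dd * (1 - male_cost)) / male_total
        history.append((pf + pm) / 2)
    return history


if __name__ == "__main__":
    no_cost = drive_trajectory(k=0.7, male_cost=0.0)
    with_cost = drive_trajectory(k=0.7, male_cost=0.5)
    print("final frequency, no male cost:", round(no_cost[-1], 3))
    print("final frequency, with male cost:", round(with_cost[-1], 3))
```

With Mendelian transmission (k = 0.5) and no male cost the frequency stays at its starting value, whereas any k above 0.5 lets the variant spread when males pay no cost. Raising male_cost opposes that spread, which is the tension that would favour a suppressor in the full model.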



Further Reading

Arkhipova IR and Rodriguez F (2013) Genetic and epigenetic changes involving (retro)transposons in animal hybrids and polyploids. Cytogenetic and Genome Research 140: 295–311.

Brown JD and O'Neill RJ (2010) Chromosomes, conflict, and epigenetics: chromosomal speciation revisited. Annual Review of Genomics and Human Genetics 11: 291–316.

Ekwall K (2007) Epigenetic control of centromere behavior. Annual Review of Genetics 41: 63–81.

Feschotte C and Gilbert C (2012) Endogenous viruses: insights into viral evolution and impact on host biology. Nature Reviews Genetics 13: 283–296.

Hayden KE, Strome ED, Merrett SL et al. (2013) Sequences associated with centromere competency in the human genome. Molecular and Cellular Biology 33(4): 763–772.

Levin HL and Moran JV (2011) Dynamic interactions between transposable elements and their hosts. Nature Reviews Genetics 12: 615–627.

Malik HS and Henikoff S (2009) Major evolutionary transitions in centromere complexity. Cell 138(6): 1067–1082.

Roy B and Sanyal K (2011) Diversity in requirement of genetic and epigenetic factors for centromere function in fungi. Eukaryotic Cell 10: 1384–1395.

Scott KC and Sullivan BA (2014) Neocentromeres: a place for everything and everything in its place. Trends in Genetics 30(2): 66–74.

Talbert PB and Henikoff S (2010) Centromeres convert but don't cross. PLoS Biology 8: e1000326.

How to Cite
Brown, Judith D, and O'Neill, Rachel J (Sep 2014) The Evolution of Centromeric DNA Sequences. In: eLS. John Wiley & Sons Ltd, Chichester. [doi: 10.1002/9780470015902.a0020827.pub2]