The Biological Significance of Conserved Nongenic DNA

Abstract

Although the completion of the human genome sequence was a landmark in genomics, a single genome on its own could not reveal most of the functional part of the genome. With the completion of the mouse genome and the subsequent release of additional mammalian and other vertebrate genomes, a large number of conserved non‐protein‐coding sequences were discovered. The most surprising finding was that the nongenic fraction of the genome under purifying selection is larger than the coding fraction. In this article we review some of the key discoveries in the field and discuss methodologies for the computational and functional interpretation of nongenic deoxyribonucleic acid (DNA).

Keywords: comparative genomics; evolution; non‐coding DNA

Figure 1.

(a) A conserved nongenic sequence (CNG) that is highly conserved across multiple species. (b) A CNG that contains a mix of highly and weakly conserved nucleotides.
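
The contrast between the two panels can be made concrete with a per-column conservation score. The following is a minimal sketch, not the method used in the article: it assumes a simple "fraction of species sharing the most common base" measure, and the function name `column_conservation` and the toy alignment are hypothetical illustrations rather than real sequence data.

```python
from collections import Counter


def column_conservation(alignment):
    """Return, for each alignment column, the fraction of sequences
    that carry the most common base (gaps never count as a match)."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for i in range(length):
        column = [seq[i].upper() for seq in alignment]
        bases = [b for b in column if b != "-"]
        if not bases:
            # Column made only of gaps: treat as unconserved.
            scores.append(0.0)
            continue
        most_common_count = Counter(bases).most_common(1)[0][1]
        # Divide by the number of sequences so gaps lower the score.
        scores.append(most_common_count / len(column))
    return scores


# Hypothetical toy alignment of four species (not real data).
toy_alignment = ["ACGTAC", "ACGTTC", "ACGAGC", "ACG-CC"]
print(column_conservation(toy_alignment))
```

A sequence like panel (a) would score close to 1.0 in every column, whereas one like panel (b) would show a mixture of high and low values. Real analyses generally use phylogeny-aware models rather than raw identity.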

Figure 2.

Position weight matrix (PWM) of the MEF2 transcription factor in mammals.
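
For readers unfamiliar with PWMs, the sketch below shows one common way such a matrix can be built and used to scan a sequence. It assumes log-odds scores against a uniform background with a pseudocount of 1. The binding sites are hypothetical toy strings, not the MEF2 sites behind the figure, and the names `build_pwm`, `score_window` and `best_hit` are illustrative only.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (list of per-position dicts) from aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for i in range(width):
        # Start from the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return pwm


def score_window(pwm, window):
    """Sum the log-odds scores of a window as wide as the PWM."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the PWM")
    return max(
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Hypothetical toy binding sites (assumed for illustration only).
toy_sites = ["CTAAAAATAG", "CTATTTATAG", "TTAAATATAA", "CTATAAATAG"]
pwm = build_pwm(toy_sites)
score, start = best_hit(pwm, "GGCGCTAAAAATAGCCGG")
print(f"best window starts at {start} with score {score:.2f}")
```

Positions where all sites agree contribute large positive scores, while variable positions contribute little, which is the information a PWM logo conveys graphically.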

Figure 3.

A UCSC Genome Browser view of a region of human chromosome 13. Two of the tracks show high levels of vertebrate and mammalian conservation in a region devoid of genes.

Figure 4.

Model of a regulatory module, showing its transcription factor binding sites (TFBSs) and the gene they regulate.

Further Reading

Asthana S, Roytberg M, Stamatoyannopoulos J and Sunyaev S (2007) Analysis of sequence conservation at nucleotide resolution. PLoS Computational Biology 3(12): e254.

Ureta‐Vidal A, Ettwiller L and Birney E (2003) Comparative genomics: genome‐wide analysis in metazoan eukaryotes. Nature Reviews Genetics 4(4): 251–262.

Visel A, Prabhakar S, Akiyama JA et al. (2008) Ultraconservation identifies a small subset of extremely constrained developmental enhancers. Nature Genetics 40(2): 158–160.

Wray GA (2007) The evolutionary significance of cis‐regulatory mutations. Nature Reviews Genetics 8(3): 206–216.

How to Cite

Dermitzakis, Emmanouil T (Jul 2008) The Biological Significance of Conserved Nongenic DNA. In: eLS. John Wiley & Sons Ltd, Chichester. http://www.els.net [doi: 10.1002/9780470015902.a0020828]