Genome Databases

A genome comprises all the genetic material of an organism and contains all the instructions needed to direct its activities. Because a genome also provides a natural reference frame for mapping information about genes and proteins, genome databases, the official repositories of the data produced by genomic initiatives, are rapidly becoming the primary entry point for accessing biological knowledge. They are continuously enriched with new information and with functional annotations of the genome products, and are meant to serve many diverse communities, from biologists and geneticists to clinical researchers and pharmacologists. A major effort is therefore required to visualise genomic and functional data in an integrated fashion while at the same time providing different views of the data, tailored to the needs of these diverse users.

Key Concepts:

  • Genome databases are the official repositories of the ever-growing number of genomic sequences.
  • The genome represents a natural framework for mapping the biological data of an organism.
  • Genome browsers provide integrated and customizable views of the information.
  • Genome databases and their associated tools are becoming the primary entry points for accessing biological information.
  • As more data on individual genomes become available, genome databases are also becoming the repositories of variation data and of the associated phenotypes.
  • Future challenges relate not only to the sheer size of the data, but also to the need to protect sensitive information without hampering the exploitation of the data for new discoveries.
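
The idea that the genome is a natural framework for mapping data, and that browsers offer customizable views of it, can be illustrated with a minimal Python sketch. It assumes annotations are stored as intervals on a chromosome and grouped into named tracks. The `Feature` class, the `features_in_region` function, the track names and all coordinates are hypothetical and invented for illustration; they do not reflect the schema of any particular genome database.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Feature:
    """One annotation anchored to genomic coordinates (1-based, inclusive)."""
    chrom: str
    start: int
    end: int
    track: str   # hypothetical track name, e.g. "genes" or "variation"
    label: str


def features_in_region(
    features: Iterable[Feature],
    chrom: str,
    start: int,
    end: int,
    tracks: Optional[Set[str]] = None,
) -> List[Feature]:
    """Return features overlapping chrom:start-end, optionally limited to some tracks."""
    hits = [
        f for f in features
        if f.chrom == chrom
        and f.start <= end and f.end >= start        # closed-interval overlap test
        and (tracks is None or f.track in tracks)    # the user's chosen "view"
    ]
    return sorted(hits, key=lambda f: (f.start, f.end, f.track))


# Invented toy annotations; names and coordinates are illustrative only.
ANNOTATIONS = [
    Feature("chrN", 1000, 5000, "genes", "geneA"),
    Feature("chrN", 1200, 1200, "variation", "variantA1"),
    Feature("chrN", 4800, 5200, "expression", "probeA"),
    Feature("chrN", 9000, 9500, "genes", "geneB"),
    Feature("chrM", 1100, 1300, "genes", "geneC"),
]

if __name__ == "__main__":
    # Integrated view: every track that overlaps the region of interest.
    for f in features_in_region(ANNOTATIONS, "chrN", 1000, 5000):
        print(f.track, f.label, f.start, f.end)
```

Passing `tracks={"variation"}` to the same call would restrict the result to variation data only, which is roughly what selecting a subset of tracks in a browser amounts to. Production systems typically rely on indexed interval structures rather than the linear scan used here.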

Keywords: genomics; bioinformatics; genome databases; annotation; distributed annotation system (DAS)

Figure 1. Gaining information on a gene of interest using a genome browser (the UCSC Genome Browser in this example). The user selects a gene of interest, in this case myoglobin, and is directed to the genomic region that encodes it (a). Clicking on the gene name displays a description of what is known about the gene (b). It is also possible to retrieve mRNA expression data (c), the three-dimensional structure (d) and functional information (e) for the encoded protein. Similar tools are provided by other genome browsers.
Figure 2. Example of exploring genomes to gain information on variations.
 Further Reading
    Altman RB (2004) Building successful biological databases. Briefings in Bioinformatics 5(1): 4–5.
    Each January issue of Nucleic Acids Research is a special database issue.
    Lesk AM (2007) Introduction to Genomics. Oxford: Oxford University Press.
    Schattner P (2008) Genomes, Browsers and Databases. Cambridge: Cambridge University Press.
How to Cite
Tramontano, Anna (May 2011) Genome Databases. In: eLS. John Wiley & Sons Ltd, Chichester. http://www.els.net [doi: 10.1002/9780470015902.a0005314.pub2]