Genetic Databases: Mining

Abstract

The rapid pace at which new genomic sequences are deposited in the databanks makes it essential to put efficient retrieval systems at the scientific community's disposal. Entrez and SRS are two such systems, freely accessible on the Web, that support both simple and Boolean queries.

Keywords: sequence; databases; retrieval; Entrez; SRS

Figure 1.

Screenshot of an Entrez query showing how to use the ‘Preview’ item to add terms to a query and perform Boolean searches.
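
The kind of Boolean search described in this caption can also be written as a plain query string. The following is a minimal sketch, not the article's own method: the helper names `field_term` and `combine` are hypothetical, the field tags `[Title]` and `[Organism]` and the search terms are illustrative choices, and the E-utilities `esearch.fcgi` address with its `db` and `term` parameters is an assumption about the present-day NCBI service rather than something taken from the article. The code only builds the query and the URL; it does not contact any server.

```python
from urllib.parse import urlencode


def field_term(text, field):
    """Return one Entrez-style term such as '"Homo sapiens"[Organism]'.

    Multi-word phrases are quoted so they are kept together as a phrase.
    """
    phrase = f'"{text}"' if " " in text else text
    return f"{phrase}[{field}]"


def combine(operator, terms):
    """Join terms with a Boolean operator and wrap the result in parentheses."""
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {operator}")
    return "(" + f" {operator} ".join(terms) + ")"


# Illustrative search: ABC transporter entries from human or mouse.
species = combine("OR", [
    field_term("Homo sapiens", "Organism"),
    field_term("Mus musculus", "Organism"),
])
query = combine("AND", [field_term("ABC transporter", "Title"), species])

# Assumed endpoint and parameters (not taken from the article).
url = ("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?"
       + urlencode({"db": "nucleotide", "term": query}))

print(query)
print(url)
```

Nesting the OR group inside the AND group makes the precedence explicit, which is the same effect obtained in the Web interface by adding terms one at a time and combining them.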

Figure 2.

Screenshot of a Sequence Retrieval System (SRS) ‘simple query’: here all the ABC transporter sequences from human and mouse in the European Molecular Biology Laboratory (EMBL) databank were requested.

Figure 3.

Screenshot of an SRS ‘extended query’, showing that various items can be selected from the Feature table.

Further Reading

Database Issue. Nucleic Acids Research. [The first issue each January is a special volume on databanks.]

Putnam NC (1998) Searching MEDLINE free on the Internet using the National Library of Medicine's PubMed. Clinical Excellence for Nurse Practitioners 2: 314–316.

Samson C (2000) Database searching with DNA and protein sequences: an introduction. Briefings in Bioinformatics 1: 22–32.

The Gene Ontology Consortium (2000) Gene ontology: tool for the unification of biology. Nature Genetics 25: 25–29.

The Gene Ontology Consortium (2001) Creating the gene ontology resource: design and implementation. Genome Research 11: 1425–1433.

Web Links

European Bioinformatics Institute (EBI) http://www.ebi.ac.uk

Entrez, the query system at NCBI http://www.ncbi.nlm.nih.gov/Entrez

International Nucleotide Sequence Database (INSD) http://www.ncbi.nlm.nih.gov/collab

SRS, the query system at EBI http://srs.ebi.ac.uk

How to Cite

Risler, Jean‐Loup, and Louis, Alexandra (Sep 2005) Genetic Databases: Mining. In: eLS. John Wiley & Sons Ltd, Chichester. http://www.els.net [doi: 10.1038/npg.els.0005313]