Gene Feature Identification


Identifying all genes is one of the goals of any genome‐sequencing project. Besides laboratory techniques, genes can be identified computationally, using either homology‐based or ab initio (model‐based) methods, whose performance varies with the sequence being analysed. With the growing number of sequenced genomes, comparative genomics is becoming the most powerful method for deciphering genomes.

Keywords: gene prediction; comparative genomics; gene structure; sequence similarity

Figure 1. Fragment of the Xenopus genome annotated using the Ensembl pipeline. Ensembl genes are shown together with Genscan and FGENESH predictions, EST genes, and UniProt and UniGene alignments.



Further Reading

Carter D and Durbin R (2006) Vertebrate gene finding from multiple‐species alignments using a two‐level strategy. Genome Biology 7 (Suppl 1): S6.1–S6.12.

Do JH and Choi DK (2006) Computational approaches to gene prediction. Journal of Microbiology 44: 137–144.

Guigo R, Flicek P, Abril JF et al. (2006) EGASP: the human ENCODE Genome Annotation Assessment Project. Genome Biology 7 (Suppl 1): S2.1–S2.31.

Jones SJ (2006) Prediction of genomic functional elements. Annual Review of Genomics and Human Genetics 7: 315–338.

Mount DW (2001) Gene prediction. In: Bioinformatics: Sequence and Genome Analysis, pp. 337–380. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.

Reese MG, Hartzell G, Harris NL et al. (2000) Genome annotation assessment in Drosophila melanogaster. Genome Research 10: 483–501.

Stormo GD (2000) Gene‐finding approaches for eukaryotes. Genome Research 10: 394–397.

Windsor AJ and Mitchell‐Olds T (2006) Comparative genomics as a tool for gene discovery. Current Opinion in Biotechnology 17: 161–167.

Zhang MQ (2002) Computational prediction of eukaryotic protein‐coding genes. Nature Reviews Genetics 3: 698–709.

How to Cite
Makałowska, Izabela (Jul 2007) Gene Feature Identification. In: eLS. John Wiley & Sons Ltd, Chichester. [doi: 10.1002/9780470015902.a0005319.pub2]