Gene Feature Identification
Izabela Makałowska, The Pennsylvania State University, University Park, Pennsylvania, USA
Published online: July 2007
DOI: 10.1002/9780470015902.a0005319.pub2
Abstract
The identification of all genes is one of the goals of any genome‐sequencing project. Apart from laboratory techniques, genes can also be identified computationally, using either homology‐based or ab initio (model‐based) methods, whose performance differs according to the sequence being analysed. With the growing number of sequenced genomes, comparative genomics is becoming the most powerful method for deciphering genomes.
Keywords: gene prediction; comparative genomics; gene structure; sequence similarity