Sequence Accuracy and Verification

Abstract

Analyses of deoxyribonucleic acid (DNA) sequences must consider three main accuracy parameters: sequence quality, sequence contiguity and sequence fidelity. Here, sequence quality is the probability of error for any base-call, contiguity is the completeness and correctness of the assembly of subsequences, and fidelity is the correctness with which the assembly represents the genome.

Keywords: DNA sequence; assembly; contiguity; fidelity; quality

Figure 1.

The PHRAP quality scores of a typical human genome ‘draft’ sequence as available from the EMBL database.
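To make the link between a quality score and an error probability concrete, the following is a minimal sketch that assumes the scores follow the phred convention, Q = -10 log10(P), where P is the estimated probability that a base-call is wrong. The function names are illustrative and do not belong to the phred or PHRAP software.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a phred-scaled quality score Q into an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Convert an error probability P (0 < P <= 1) into a phred-scaled quality score."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10 * math.log10(p)


def expected_errors(qualities) -> float:
    """Sum the per-base error probabilities to estimate the number of miscalled bases."""
    return sum(phred_to_error_probability(q) for q in qualities)


if __name__ == "__main__":
    # Hypothetical per-base scores, chosen only for illustration.
    scores = [10, 20, 30, 40]
    for q in scores:
        print(f"Q{q}: P(error) = {phred_to_error_probability(q):.4g}")
    print(f"expected errors over {len(scores)} bases: {expected_errors(scores):.4g}")
```

Under this convention a score of 20 corresponds to roughly one expected error in 100 bases, and a score of 40 to roughly one in 10 000.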

Figure 2.

Levels of sequence contiguity. (N)100 indicates a sequence gap in the clone assembly, (N)50000 indicates a bridged sequence gap in the chromosome assembly and (N)100000 indicates an unbridged sequence gap in the chromosome assembly. S indicates switch points between clone sequences in the chromosome assembly. Switch points are chosen arbitrarily within the middle sections of overlapping clone sequences.
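The gap notation in this caption can be illustrated with a short sketch that scans a sequence for runs of N and labels each run by its length. It assumes, as the caption suggests, that runs of exactly 100, 50000 and 100000 N characters encode the three gap types; the function name and label strings are hypothetical, and runs of any other length are reported as unclassified rather than guessed.

```python
import re

# Run lengths taken from the figure caption; the label wording is illustrative.
GAP_LABELS = {
    100: "sequence gap in clone assembly",
    50000: "bridged sequence gap in chromosome assembly",
    100000: "unbridged sequence gap in chromosome assembly",
}


def find_gaps(sequence: str):
    """Return (start, length, label) for every run of N in the sequence.

    start is a 0-based offset into the sequence string.
    """
    gaps = []
    for match in re.finditer(r"N+", sequence.upper()):
        length = match.end() - match.start()
        label = GAP_LABELS.get(length, "unclassified run of N")
        gaps.append((match.start(), length, label))
    return gaps


if __name__ == "__main__":
    # Toy sequence built only for illustration.
    toy = "ACGT" + "N" * 100 + "ACGT" + "N" * 50000 + "AC" + "N" * 3
    for start, length, label in find_gaps(toy):
        print(f"offset {start}: {length} N ({label})")
```

Matching on exact run lengths keeps the sketch faithful to the caption; real database entries may also contain short runs of N that represent ambiguous bases rather than gaps, which is why other lengths are left unclassified.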

Figure 3.

EMBL/GENBANK/DDBJ database entry for the draft sequence shown in Figure .

Further Reading

Altschul SF, Gish W, Miller W, Myers EW and Lipman DJ (1990) Basic local alignment search tool. Journal of Molecular Biology 215: 403–410.

Dunham I, Shimizu N, Roe BA, et al. (1999) The DNA sequence of human chromosome 22. Nature 402: 489–495.

Gregory SG, Howell GR and Bentley DR (1997) Genome mapping by fluorescent fingerprinting. Genome Research 7: 1162–1168.

Hattori M, Fujiyama A, Taylor TD, et al. (2000) The DNA sequence of human chromosome 21. Nature 405: 311–319.

Marra MA, Kucaba TA, Dietrich NL, et al. (1997) High throughput fingerprint analysis of large‐insert clones. Genome Research 7: 1072–1084.

Mullikin JC, Hunt SE, Cole CG, et al. (2000) An SNP map of human chromosome 22. Nature 407: 516–520.

Osoegawa K, Mammoser AG, Wu C, et al. (2001) A bacterial artificial chromosome library for sequencing the complete human genome. Genome Research 11: 483–496.

International Human Genome Sequencing Consortium (2001) Initial sequencing and analysis of the human genome. Nature 409: 860–921.

Web Links

Ensembl Trace Server http://trace.ensembl.org/

Genome Sequencing Center. International Finishing Standards for the Human Genome Project (Version September 7, 2001) http://genome.wustl.edu/gsc/Overview/finrules/hgfinrules.html

National Human Genome Research Institute (NHGRI). NHGRI Standard for Quality of Human Genomic Sequence http://www.nhgri.nih.gov:80/Grant_info/Funding/Statements/RFA/quality_standard.html

National Center for Biotechnology Information: NCBI News http://www.ncbi.nlm.nih.gov/Web/Newsltr/feb98.html#GenBank

Project Ensembl. Ensembl Genome Browser http://www.ensembl.org/

Summary of the Report of the Second International Strategy Meeting on Human Genome Sequencing Bermuda, 27th February–2nd March 1997 sponsored by the Wellcome Trust http://www.gene.ucl.ac.uk/hugo/bermuda2.htm

The Phred/Phrap/Consed System home page http://www.phrap.org

The Wellcome Trust Sanger Institute Human Blast Server http://www.sanger.ac.uk/HGP/blast_server.shtml

The Wellcome Trust Sanger Institute: software http://www.sanger.ac.uk/Software/

UCSC Human Genome Project Working Draft http://genome.ucsc.edu/

How to Cite

Beck, Stephan (Sep 2005) Sequence Accuracy and Verification. In: eLS. John Wiley & Sons Ltd, Chichester. http://www.els.net [doi: 10.1038/npg.els.0005390]