Smith–Waterman Algorithm

Abstract

The Smith–Waterman algorithm is a dynamic programming algorithm that finds regions of local similarity between DNA or protein sequences.

Keywords: DNA; protein; sequence alignment

Figure 1.

Local alignment of the DNA sequences TTACCGGCCAACTAA and ACCGTGTCACTAAC. Aligned portions are shown in upper case and unaligned portions in lower case.

Figure 2.

Global alignment for the example shown in Figure 1.

Figure 3.

Dot matrix for the optimal local alignment between sequences, showing cells in which the corresponding row and column letters match; cells lying on the optimal alignment that match; and where the path through the cell is a mismatch or gap.
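
The local alignment illustrated in the figures can be sketched in a few lines of Python. This is a minimal illustration only, not the article's own implementation: the scoring values (match +2, mismatch -1, gap -2) and the use of a simple linear gap penalty are assumptions chosen for readability, and the function name smith_waterman is hypothetical. The sketch fills a score matrix in which every cell is floored at zero, then traces back from the highest-scoring cell until a zero is reached.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one optimal local alignment.

    Scoring parameters are illustrative assumptions, not values from the article.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment here
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # The two DNA sequences from the Figure 1 caption.
    score, top, bottom = smith_waterman("TTACCGGCCAACTAA", "ACCGTGTCACTAAC")
    print(score)
    print(top)
    print(bottom)
```

Different scoring parameters can produce a different optimal alignment, so the output of this sketch need not coincide with the alignment drawn in Figure 1.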

Further Reading

Waterman MS (1995) Introduction to Computational Biology: Maps, Sequences and Genomes. Boca Raton, FL: CRC Press.

Web Links

Pfam (protein families database of alignments and HMMs). Updated May 2002 http://www.sanger.ac.uk/Pfam

SMART (simple modular architecture research tool). Updated May 2002 http://smart.embl-heidelberg.de

How to Cite

Mott, Richard (Sep 2005) Smith–Waterman Algorithm. In: eLS. John Wiley & Sons Ltd, Chichester. http://www.els.net [doi: 10.1038/npg.els.0005263]