The Contribution of Frameshift Translation to the Generation of Novel Human Proteins


The creation of novel proteins cannot be ascribed solely to the duplication of homologous sequences; much of it can be explained by frameshift translation. A deficiency of the TpA dinucleotide in protein‐coding deoxyribonucleic acid (DNA) sequences makes them tolerant of frameshift mutations by reducing the chance of premature stop codons, and the involvement of both strands can increase genomic complexity. This supports the suggestion that new coding sequences evolve from existing or ancestral exons rather than from nonexonic sequences.

Keywords: protein‐coding sequence; frameshift; TpA dinucleotide; termination codon; intrastrand parity in DNA
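The link between TpA deficiency and stop codons in the abstract can be illustrated with a short sketch. Under the standard genetic code, two of the three stop codons (TAA and TAG) contain TpA, and two of the three antisense‐strand stop codons, when written on the sense strand (TTA and CTA), contain it as well. The function names below and the observed/expected measure (dinucleotide count relative to the product of the base frequencies) are assumptions chosen for illustration, not the authors' method.

```python
# Illustrative sketch only: names and the obs/exp measure are assumptions.
STOP_CODONS = ("TAA", "TAG", "TGA")          # standard genetic code
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def stops_containing_tpa():
    """List stop codons that contain TpA, for both strands.

    Antisense stops are written as they appear on the sense strand,
    i.e. as the reverse complements of TAA, TAG and TGA.
    """
    sense = [c for c in STOP_CODONS if "TA" in c]
    antisense = [reverse_complement(c) for c in STOP_CODONS]
    antisense = [c for c in antisense if "TA" in c]
    return sense, antisense


def tpa_obs_exp(seq):
    """Observed/expected TpA ratio, assuming independent base frequencies."""
    n = len(seq)
    if n < 2:
        return 0.0
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == "TA")
    expected = (seq.count("T") / n) * (seq.count("A") / n) * (n - 1)
    return observed / expected if expected else 0.0


if __name__ == "__main__":
    print(stops_containing_tpa())
    print(tpa_obs_exp("ATGGCCTTAGCATGA"))  # hypothetical toy sequence
```

A ratio below 1 indicates fewer TpA dinucleotides than expected from base composition alone, which in turn leaves fewer opportunities for a stop codon to appear in a shifted or opposite frame.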

Figure 1.

Designation of the six reading frames, viz. RF0, RF1, RF2, RF3, RF4 and RF5. Although the peptide sequences of RF0 are the actual gene products, the others are virtual sequences translated in silico to search for frameshift translation. Termination codons in the virtual sequences are transliterated to ‘X’ for subsequent BLASTP analysis. Note that RF5 is the opposite‐strand frame that reads the same three nucleotide positions as the original frame, and that RF0 and RF3 share the first two nucleotide positions.
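The six‐frame translation described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the standard genetic code, uppercase A/C/G/T input, RF1 and RF2 as the +1 and +2 shifts on the sense strand, and an RF3/RF4 assignment inferred from the caption (RF5 opposite RF0, RF4 opposite RF1, RF3 opposite RF2). The names `translate` and `six_frames` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq, stop="X"):
    """Translate complete codons only; stop codons become `stop`."""
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        peptide.append(stop if residue == "*" else residue)
    return "".join(peptide)


def six_frames(cds):
    """Return virtual peptides for RF0..RF5 (frame assignment is assumed).

    RF0-RF2: sense strand, offsets 0, 1, 2.
    RF5, RF4, RF3: antisense frames whose codons cover the same
    nucleotide triplets as RF0, RF1 and RF2, respectively.
    """
    length = len(cds)
    rc = reverse_complement(cds)
    frames = {}
    for k in range(3):
        frames[f"RF{k}"] = translate(cds[k:])
        # Offset on the reverse strand that aligns with sense offset k.
        frames[f"RF{5 - k}"] = translate(rc[(length - k) % 3:])
    return frames


if __name__ == "__main__":
    for name, peptide in sorted(six_frames("ATGGCCTGA").items()):
        print(name, peptide)
```

Writing stops as ‘X’ keeps each virtual peptide as one uninterrupted string, so it can be passed to a protein similarity search such as BLASTP without being split at every termination codon.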

Figure 2.

Classification of duplications followed by frameshift events that have led to structural divergence. In all depictions, hatched marks highlight protein‐coding regions that have undergone a frameshift creating diverged protein sequences. Regions of homology within the same frame are marked in grey, while unique sequence is marked in black. Arrows indicate transcription orientation. (a) Tandem duplications. (b) Large gene families arising through multiple duplications. (c) Interspersed duplications. (d) Retrotranspositions. (e) Sense overlapping transcripts. (f) Antisense overlapping transcripts. (g) Internal frameshifts. (h) Alternatively spliced variants.



Further Reading

Amrani N, Sachs MS and Jacobson A (2006) Early nonsense: mRNA decay solves a translational problem. Nature Reviews Molecular Cell Biology 7: 415–425.

Baisnée PF, Hampson S and Baldi P (2002) Why are complementary DNA strands symmetric? Bioinformatics 18: 1021–1033.

Forsdyke DR and Mortimer JR (2000) Chargaff's legacy. Gene 261: 127–137.

Gesteland RF and Atkins JF (1996) Recoding: dynamic reprogramming of translation. Annual Review of Biochemistry 65: 741–768.

Kashiwagi K, Isogai Y, Nishiguchi K and Shiba K (2006) Frame shuffling: a novel method for in vitro protein evolution. Protein Engineering, Design & Selection 19: 135–140.

Ohno S (1970) Evolution by Gene Duplication. New York: Springer.

Okamura K, Feuk L, Marquès‐Bonet T, Navarro A and Scherer SW (2006) Frequent appearance of novel protein‐coding sequences by frameshift translation. Genomics 88: 690–697.

Osawa S, Jukes TH, Watanabe K and Muto A (1992) Recent evidence for evolution of the genetic code. Microbiological Reviews 56: 229–264.

Ueda M, Fujimoto M, Arimura S, Tsutsumi N and Kadowaki K (2006) Evidence for transit peptide acquisition through duplication and subsequent frameshift mutation of a preexisting protein gene in rice. Molecular Biology and Evolution 23: 2405–2412.

Veeramachaneni V, Makalowski W, Galdzicki M, Sood R and Makalowska I (2004) Mammalian overlapping genes: the comparative perspective. Genome Research 14: 280–286.


How to Cite
Okamura, Kohji, and Scherer, Stephen W (Dec 2007) The Contribution of Frameshift Translation to the Generation of Novel Human Proteins. In: eLS. John Wiley & Sons Ltd, Chichester. [doi: 10.1002/9780470015902.a0020792]