Gene Synthesis for Protein Production


Synthetic gene production is an enabling technology for improved protein production because it can generate entirely novel gene sequences that are optimized for codon usage and for other sequence features expected to improve protein expression in a defined expression system. Computer‐aided gene design algorithms of increasing sophistication allow researchers to exploit the degeneracy of the genetic code to engineer expression‐optimized synthetic gene sequences, together with the overlapping oligonucleotides needed to manufacture the desired synthetic gene by polymerase chain reaction (PCR) methods. Low‐cost synthetic gene production is ushering in a new era of synthetic biology for improved protein production.

Keywords: gene synthesis; codon usage; genetic code; protein production; gene design
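
As a minimal sketch of how the degeneracy of the genetic code can be exploited in gene design, the Python below back‐translates a protein into DNA by choosing one codon per amino acid from a preference table. The function names and the PREFERRED table are assumptions made for illustration: the table holds placeholder choices rather than measured codon preferences, and real design software weighs many more sequence features.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def translate(dna):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def back_translate(protein, preferred):
    """Encode a protein using one chosen codon per amino acid."""
    codons = []
    for aa in protein:
        codon = preferred[aa]
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
        codons.append(codon)
    return "".join(codons)


# Hypothetical preference table (illustrative placeholders, not measured data).
PREFERRED = {"M": "ATG", "G": "GGT", "K": "AAA"}

gene = back_translate("MGGK", PREFERRED)
print(gene)                    # ATGGGTGGTAAA
print(translate(gene))         # MGGK
print(synonymous_codons("G"))  # ['GGA', 'GGC', 'GGG', 'GGT']
```

Because glycine has four synonymous codons, the same tetrapeptide could be encoded in many other ways; the preference table is where a design algorithm would express the codon bias of the chosen expression host.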

Figure 1.

A ribosome in translation and glycine aminoacyl‐tRNAs in E. coli. An mRNA sequence is shown with a single ribosome particle engaged in translation. In this snapshot, the growing polypeptide chain is attached at its C‐terminus to the tRNA in the P‐site, adjacent to the next aminoacyl‐tRNA in the A‐site. The K12 strain of E. coli has six different genes encoding glycine‐tRNAs, illustrated here in their aminoacylated form with the anticodon sequence shown paired with the glycine codon that each tRNA recognizes. Four genes in E. coli encode copies of the Gly tRNA with a 3′‐CCG‐5′ anticodon loop, coloured green (the number of genes is indicated as a superscript above each tRNA shown). The Gly tRNAs with 3′‐CCG‐5′ anticodons can read both the GGC (pink) and GGU (gold) mRNA codons, because the G in the wobble position of the anticodon loop can pair with either C or U in the complementary position of the codon (star). The frequencies with which the four glycine codons occur in the open reading frames (ORFs) of E. coli are listed below each codon, both for all genomic ORFs and for only the ORFs of highly expressed proteins (HX). Website sources of information used in this figure are located at: and
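
The wobble pairing described in this caption can be sketched in a few lines of Python. The pairing table follows Crick's original wobble rules (Crick, 1966); the function name and the 5′→3′ input convention are assumptions made here for illustration, and the sketch ignores the modified anticodon bases that can alter decoding in real tRNAs.

```python
# Standard Watson-Crick pairs for codon positions 1 and 2.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Crick's wobble rules: the 5' base of the anticodon (key) can pair with
# any of the listed bases at codon position 3. "I" is inosine.
WOBBLE = {"G": "CU", "C": "G", "A": "U", "U": "AG", "I": "UCA"}


def codons_read(anticodon_5to3):
    """Return the mRNA codons (5'->3') that an anticodon (5'->3') can read."""
    first, middle, last = anticodon_5to3
    # Pairing is antiparallel: anticodon base 3 pairs with codon base 1,
    # and anticodon base 1 (the wobble base) pairs with codon base 3.
    stem = WATSON_CRICK[last] + WATSON_CRICK[middle]
    return sorted(stem + third for third in WOBBLE[first])


# The caption writes the anticodon 3'->5' (3'-CCG-5'), so reverse it first.
print(codons_read("CCG"[::-1]))  # ['GGC', 'GGU']
```

Under these rules the single 5′‐GCC‐3′ anticodon accounts for both the GGC and GGU codons, which is the relationship the figure marks with a star.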

Figure 2.

Effects of synonymous sequence differences on protein production. Top Panel: Effect of synonymous codon usage on the kinetics of protein production. Two synonymous mRNAs are shown with multiple ribosome complexes in the elongation phase of translation. One mRNA (upper) contains two rarely used glycine codons (GGA and GGG), while the other (lower) contains two commonly used glycine codons (GGU and GGC). The differential codon usage is anticipated to affect the kinetics of translation such that the mRNA with preferred codon usage yields a higher overall steady‐state level of protein production. Bottom Panel: Effect of synonymous codon usage on protein function. Two synonymous mRNAs are shown with multiple ribosome complexes in the elongation phase of translation. Synonymous changes in Gly and Ile codons do not affect the overall level of protein production from the mRNA, but they do affect the ligand specificity of the resulting protein. In this case, differential codon usage is anticipated to alter the kinetics of translation at crucial points in the folding of the protein, resulting in a different folding pattern and ligand specificity of the final protein product (Kimchi‐Sarfaty et al., 2007).
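
A simple way to see the difference between the two synonymous mRNAs in the top panel is to flag the codons that fall in a rare‐codon set. The sketch below is an assumption‐laden illustration: the helper names are invented here, the rare set contains only the two glycine codons named in the caption, and counting rare codons does not model translation kinetics.

```python
def split_codons(orf):
    """Split an in-frame ORF (DNA or RNA) into upper-case DNA codons."""
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3:
        raise ValueError("ORF length must be a multiple of 3")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


def rare_codon_positions(orf, rare_codons):
    """Return 0-based codon indices whose codon is in the rare set."""
    return [i for i, codon in enumerate(split_codons(orf)) if codon in rare_codons]


# Rare glycine codons as named in the caption (GGA and GGG), in DNA form.
RARE_GLY = {"GGA", "GGG"}

# Two hypothetical synonymous ORFs, both encoding Met-Gly-Gly-Lys.
print(rare_codon_positions("ATGGGAGGGAAA", RARE_GLY))  # [1, 2]
print(rare_codon_positions("AUGGGUGGCAAA", RARE_GLY))  # []
```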

Figure 3.

Engineering synonymous ‘silent’ changes in gene sequences for improved protein production. Top Panel: One possible DNA sequence encoding an expressed protein is shown as the top sequence. The bottom sequence is an alternative DNA sequence with synonymous codon changes (lower case) that achieve the desired engineering features in either the DNA (restriction site removal or introduction) or the RNA (RNase removal, ACA removal, cryptic Shine–Dalgarno removal, ambush stop codon introduction and hairpin removal). DNA/RNA elements removed from the gene are highlighted in red. Sequence changes that introduce a new sequence element are shown in green, while changes that eliminate sequence elements are shown in blue. Bottom Panel: An mRNA is shown with multiple ribosome complexes in the elongation phase of translation. When the ribosome translates through a ‘slippery’ nucleotide repeat sequence, it has an increased propensity to shift into the –1 reading frame, where it adds a mistranslated amino acid sequence to the C‐terminus until it reaches a stop codon in the –1 frame. Engineering silent (synonymous) changes into the gene sequence can eliminate the slippery nucleotide repeat and also introduce a stop codon in the –1 reading frame, which a frameshifted ribosome would encounter sooner than it would in the native RNA sequence. The combined effect of eliminating the ‘slippery’ repeat site and introducing an ‘ambush’ stop signal should improve overall production of the correct protein.
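
Two of the checks implied by this caption, locating a restriction site and locating stop codons in the –1 reading frame, can be sketched as follows. The function names and both example sequences are hypothetical; the two sequences were written by hand to encode the same peptide (Met‐Gly‐Ser‐Leu‐Lys), and the sketch assumes the frame‐0 codons start at index 0.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sites(seq, site):
    """Return 0-based start positions of every occurrence of a site."""
    positions, start = [], seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)  # allow overlapping matches
    return positions


def minus_one_frame_stops(orf):
    """Return 0-based start positions of stop codons in the -1 frame.

    A ribosome that slips back one nucleotide reads triplets that begin
    one base upstream of the frame-0 codons, i.e. at indices 2, 5, 8, ...
    """
    return [i for i in range(2, len(orf) - 2, 3) if orf[i:i + 3] in STOP_CODONS]


# Hypothetical native sequence:   ATG GGA TCC CTG AAA TAA
native = "ATGGGATCCCTGAAATAA"
# Hypothetical redesigned version: ATG GGT TCC CTT AAA TAA (same peptide)
designed = "ATGGGTTCCCTTAAATAA"

BAMHI = "GGATCC"  # BamHI recognition sequence

print(find_sites(native, BAMHI))        # [3]
print(find_sites(designed, BAMHI))      # []
print(minus_one_frame_stops(native))    # []
print(minus_one_frame_stops(designed))  # [11]
```

In the redesigned sequence the GGA to GGT change removes the restriction site, and the CTG to CTT change creates a TAA triplet in the –1 frame that could act as an ambush stop.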

Figure 4.

Gene synthesis by PCR from overlapping oligonucleotides in combination with mismatch specific endonuclease elimination of mutant strands. This figure illustrates a generic PCR‐based gene synthesis protocol starting with overlapping complementary oligonucleotides and following the steps described in the figure itself and in the body text of this article.
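
The oligonucleotide design step that precedes PCR assembly can be sketched as a simple tiling problem. The code below is a minimal illustration under stated assumptions: the function names, the fixed 40‐nt oligo length and the 20‐nt overlap are choices made here, not values from the article; real design software typically also balances melting temperatures across the overlaps, and the mismatch‐specific endonuclease step is not modelled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_oligos(gene, oligo_len=40, overlap=20):
    """Tile a gene with oligos on alternating strands.

    Neighbouring oligos share `overlap` nucleotides so that they can
    anneal to each other and be extended by a polymerase.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be between 0 and oligo_len")
    step = oligo_len - overlap
    oligos = []
    for n, start in enumerate(range(0, max(len(gene) - overlap, 1), step)):
        segment = gene[start:start + oligo_len]
        # Even-numbered oligos are top strand, odd-numbered bottom strand.
        oligos.append(segment if n % 2 == 0 else reverse_complement(segment))
    return oligos


def assemble(oligos, overlap=20):
    """Rebuild the top strand in silico by merging oligos at their overlaps."""
    gene = oligos[0]
    for n, oligo in enumerate(oligos[1:], start=1):
        top = oligo if n % 2 == 0 else reverse_complement(oligo)
        if gene[-overlap:] != top[:overlap]:
            raise ValueError(f"oligo {n} does not overlap its neighbour")
        gene += top[overlap:]
    return gene


# Hypothetical 90-nt target sequence used only to exercise the functions.
target = "ATGGGTTCCCTTAAATAA" * 5
oligos = design_oligos(target)
print(len(oligos))                 # 4
print(assemble(oligos) == target)  # True
```

The in silico reassembly is only a consistency check on the design; it confirms that the oligos tile the target with the intended overlaps, not that they would assemble efficiently in a real reaction.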



References

Ban N, Nissen P, Hansen J et al. (2000) The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289: 905–920.

Boycheva S, Chkodrov G and Ivanov I (2003) Codon pairs in the genome of Escherichia coli. Bioinformatics 19: 987–998.

Bregeon D, Colot V, Radman M and Taddei F (2001) Translational misreading: a tRNA modification counteracts a +2 ribosomal frameshift. Genes & Development 15: 2295–2306.

Carr‐Schmid A and Kinzy TG (2001) Messenger RNA: interaction with ribosomes. Encyclopedia of Life Sciences 1–6.

Caruthers MH, Barone AD, Beaucage SL et al. (1987) Chemical synthesis of deoxyoligonucleotides by the phosphoramidite method. Methods in Enzymology 154: 287–313.

Crick FH (1966) Codon–anticodon pairing: the wobble hypothesis. Journal of Molecular Biology 19: 548–555.

Dillon PJ and Rosen CA (1990) A rapid method for the construction of synthetic genes using the polymerase chain reaction. Biotechniques 9: 298–300.

Fiers W, Contreras R, Duerinck F et al. (1976) Complete nucleotide sequence of bacteriophage MS2 RNA: primary and secondary structure of the replicase gene. Nature 260: 500–507.

Gao X, Yo P, Keith A et al. (2003) Thermodynamically balanced inside‐out (TBIO) PCR‐based gene synthesis: a novel method of primer design for high‐fidelity assembly of longer gene sequences. Nucleic Acids Research 31: e143.

Gupta NK, Ohtsuka E, Sgaramella V et al. (1968) Studies on polynucleotides, 88. Enzymatic joining of chemically synthesized segments corresponding to the gene for alanine‐tRNA. Proceedings of the National Academy of Sciences of the USA 60: 1338–1344.

Hofacker IL (2003) Vienna RNA secondary structure server. Nucleic Acids Research 31: 3429–3431.

Hoover DM and Lubkowski J (2002) DNAWorks: an automated method for designing oligonucleotides for PCR‐based gene synthesis. Nucleic Acids Research 30: e43.

Inouye M (2006) The discovery of mRNA interferases: implication in bacterial physiology and application to biotechnology. Journal of Cellular Physiology 209: 670–676.

Itakura K, Hirose T, Crea R et al. (1977) Expression in E. coli of a chemically synthesized gene for the hormone somatostatin. Science 198: 1056–1063.

Ito K, Uno M and Nakamura Y (2000) A tripeptide ‘anticodon’ deciphers stop codons in messenger RNA. Nature 403: 680–684.

Karlin S, Mrazek J, Campbell A et al. (2001) Characterizations of highly expressed genes of four fast‐growing bacteria. Journal of Bacteriology 183: 5025–5040.

Kimchi‐Sarfaty C, Oh JM, Kim IW et al. (2007) A “silent” polymorphism in the MDR1 gene changes substrate specificity. Science 315: 525–528.

Letsinger RL and Mahadevan V (1965) Oligonucleotide synthesis on a polymer support. Journal of the American Chemical Society 87: 3526–3527.

Lin Y, Cheng G, Wang X and Clark TG (2002) The use of synthetic genes for the expression of ciliate proteins in heterologous systems. Gene 288: 85–94.

Mullis KB and Faloona FA (1987) Specific synthesis of DNA in vitro via a polymerase‐catalyzed chain reaction. Methods in Enzymology 155: 335–350.

Nackley AG, Shabalina SA, Tchivileva IE et al. (2006) Human catechol‐O‐methyltransferase haplotypes modulate protein expression by altering mRNA secondary structure. Science 314: 1930–1933.

Ogle JM and Ramakrishnan V (2005) Structural insights into translational fidelity. Annual Review of Biochemistry 74: 129–177.

Plant EP, Jacobs KL, Harger JW et al. (2003) The 9‐Å solution: how mRNA pseudoknots promote efficient programmed –1 ribosomal frameshifting. RNA 9: 168–174.

Plotkin JB, Robins H and Levine AJ (2004) Tissue‐specific codon usage and the expression of human genes. Proceedings of the National Academy of Sciences of the USA 101: 12588–12591.

Rouillard JM, Lee W, Truan G et al. (2004) Gene2Oligo: oligonucleotide design for in vitro gene synthesis. Nucleic Acids Research 32: W176–180.

Sandhu GS, Aleff RA and Kline BC (1992) Dual asymmetric PCR: one‐step construction of synthetic genes. Biotechniques 12: 14–16.

Seligmann H and Pollock DD (2004) The ambush hypothesis: hidden stop codons prevent off‐frame gene reading. DNA and Cell Biology 23: 701–705.

Selmer M, Dunham CM, Murphy FV et al. (2006) Structure of the 70S ribosome complexed with mRNA and tRNA. Science 313: 1935–1942.

Sharp PM and Matassi G (1994) Codon usage and genome evolution. Current Opinion in Genetics & Development 4: 851–860.

Shine J and Dalgarno L (1974) The 3′‐terminal sequence of E. coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proceedings of the National Academy of Sciences of the USA 71: 1342–1346.

Somogyi P, Jenner AJ, Brierley I et al. (1993) Ribosomal pausing during translation of an RNA pseudoknot. Molecular and Cellular Biology 13: 6931–6940.

Stewart L and Burgin AB (2005) Whole gene synthesis: a gene‐o‐matic future. In: Atta‐ur‐Rahman B, Springer A and Caldwell GW (eds) Frontiers in Drug Design and Discovery, pp. 297–341. San Francisco, CA: Bentham Science Publishers.

Watanabe K (2002) Genetic code: introduction. Encyclopedia of Life Sciences 1–10.

Winter PC (2005) Polymerase chain reaction. Encyclopedia of Life Sciences 1–5.

Withers‐Martinez C, Carpenter EP, Hackett F et al. (1999) PCR‐based gene synthesis as an efficient approach for expression of the A+T‐rich malaria genome. Protein Engineering 12: 1113–1120.

Young L and Dong Q (2004) Two‐step total gene synthesis method. Nucleic Acids Research 32: e59.

How to Cite
Stewart, Lance (Sep 2007) Gene Synthesis for Protein Production. In: eLS. John Wiley & Sons Ltd, Chichester. [doi: 10.1002/9780470015902.a0020211]