Expression Tags for Protein Production


Fusing a protein of interest to one or more expression tags, known as the fusion‐protein approach, is widely used in modern biology and medicine for structural and functional analysis.

Keywords: expression tags; fusion‐protein approach; parallel cloning

Figure 1.

Expression tags are the key component of the fusion‐protein approach. Three different expression tags are indicated as A (oval), B (oval) or C (circle). They can be fused to either the N‐ or the C‐terminus of a native target protein. Since different expression tags confer different functions, dual or multiple tags can be fused to a single target protein to provide more than one function.

Figure 2.

Green fluorescent protein (GFP) is used as a reporter tag for proper folding of fusion proteins. GFP is fused to the C‐terminus of a native target protein. When the target protein is properly folded (i.e. becomes soluble or is inserted into the lipid bilayer), GFP also folds properly and emits bright green fluorescence under ultraviolet (UV) light. GFP‐tagged fusion proteins were visualized by taking fluorescence images of living E. coli cells with a fluorescence microscope.

Figure 3.

Schematic representation of trans and cis cleavage of fusion proteins by tobacco etch virus protease (TEVP). The nucleotide sequence encoding the amino acid sequence of the TEVP cleavage site is shown here. (a) The trans method. For in vitro cleavage, purified TEVP is mixed with the purified fusion protein at a molar ratio of approximately 1/50–1/200. Trans cleavage can also be carried out in vivo by expressing the fusion protein and TEVP as two separate polypeptides in the same cell. (b) The cis TEVP method. A fusion protein containing an expression tag, TEVP, a TEVP cleavage site and the target protein is expressed in a host cell. In this case, intracellular self‐processing of the fusion protein occurs and yields a native target protein.
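The molar ratio quoted for in vitro trans cleavage can be turned into a mass of protease with simple arithmetic. The sketch below is only an illustration: the function name, the 50 kDa fusion protein and the roughly 27 kDa figure used for TEVP are assumptions chosen for the example, not values taken from the figure.

```python
def protease_mass_mg(substrate_mass_mg: float,
                     substrate_mw_kda: float,
                     protease_mw_kda: float,
                     molar_ratio: float) -> float:
    """Mass of protease (mg) needed for a given protease:substrate molar ratio.

    molar_ratio is mol protease per mol fusion protein, e.g. 1/100.
    Because 1 mg of a 1 kDa protein is 1 micromole, mg / kDa gives micromoles.
    """
    substrate_umol = substrate_mass_mg / substrate_mw_kda
    protease_umol = substrate_umol * molar_ratio
    return protease_umol * protease_mw_kda


# Hypothetical example: 10 mg of a 50 kDa fusion protein, with TEVP assumed
# to be about 27 kDa, across the 1/50 to 1/200 range given in the caption.
for ratio in (1 / 50, 1 / 100, 1 / 200):
    mass = protease_mass_mg(10.0, 50.0, 27.0, ratio)
    print(f"ratio 1/{round(1 / ratio)}: {mass:.3f} mg protease")
```

In practice the ratio is usually tuned empirically, since cleavage efficiency also depends on temperature, incubation time and how accessible the cleavage site is.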

Figure 4.

Schematic representation of four different strategies for parallel cloning of fusion protein expression vectors. (a) TA‐TOPO cloning. Taq polymerase has a nontemplate‐dependent terminal transferase activity that adds a single deoxyadenosine (A) to the 3′ ends of PCR products. The linearized T vector has single, overhanging 3′ deoxythymidine (T) residues. This allows PCR inserts to ligate efficiently with the vector. (b) Sticky‐end PCR cloning method for cloning a His6‐SUMO dual fusion protein. Two separate PCR reactions were carried out using one forward primer and two reverse primers. The two PCR products were mixed and 5′‐end phosphorylated by T4 polynucleotide kinase. After denaturing at 95°C and annealing at 65°C, ~50% of the products carried a 5′ blunt end and a 3′ EcoRI end, and were ready for ligation into vectors. The vector shown here was engineered to carry an SfoI site spanning the end of SUMO. The final expression plasmid can be used to produce a recombinant His6‐SUMO‐fused target protein, which can be cleaved by Ulp1 (SUMO protease) to release the native target protein, with methionine at its N‐terminus immediately following the Gly‐Gly end of SUMO. (c) Ligation‐independent cloning (LIC) method. The PCR product of the target gene was treated with T4 DNA polymerase (3′ to 5′ exonuclease activity) in the presence of dATP to generate single‐stranded overhangs. These overhangs allowed the target gene DNA to anneal with the LIC vector, which carries complementary overhangs. The annealed product was transformed directly into E. coli without a ligation reaction; covalent bonds formed at the vector‐insert junctions in the cell to yield circular plasmids. This expression vector was designed to produce fusion proteins with an enterokinase cleavage site. After enterokinase cleavage, the fusion protein yielded a protein with the native N‐terminal amino acid sequence. (d) Recombinational cloning (RC). A donor vector with attL sites recombines with a destination vector with attR sites to form a new expression clone with attB sites and a byproduct with attP sites.
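The chew‐back step in panel (c) can be illustrated with a toy model. The sketch below assumes the usual simplification that, with only dATP present, the 3′ to 5′ exonuclease removes bases from each 3′ end until it reaches the first A on that strand, where the polymerase activity can balance further removal. The function names and test sequences are hypothetical and are not taken from any real LIC vector.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def chew_back_3prime(strand: str, stop_base: str = "A") -> str:
    """Trim bases from the 3' end until the terminal base equals stop_base.

    If the strand contains no stop_base it is digested completely and an
    empty string is returned (a limitation of this toy model).
    """
    end = len(strand)
    while end > 0 and strand[end - 1] != stop_base:
        end -= 1
    return strand[:end]


def lic_overhangs(top: str, stop_base: str = "A") -> tuple[str, str]:
    """Return the 5' single-stranded overhangs (left, right) of a duplex.

    `top` is the top strand, 5'->3'. The left overhang belongs to the top
    strand and is exposed by trimming the bottom strand's 3' end; the right
    overhang belongs to the bottom strand (also written 5'->3').
    """
    bottom = reverse_complement(top)
    removed_from_top = len(top) - len(chew_back_3prime(top, stop_base))
    removed_from_bottom = len(bottom) - len(chew_back_3prime(bottom, stop_base))
    return top[:removed_from_bottom], bottom[:removed_from_top]


# Hypothetical insert: print the overhangs left after treatment with dATP.
print(lic_overhangs("GGGATGTAACC"))
```

Real LIC primers are designed so that the 5′ extensions lack the stop base over a defined stretch, which fixes the overhang length and makes it complementary to the similarly treated vector.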




How to Cite
Hu, Su‐Ming, Wang, Andrew H‐J, and Wang, Ting‐Fang (Sep 2007) Expression Tags for Protein Production. In: eLS. John Wiley & Sons Ltd, Chichester. [doi: 10.1002/9780470015902.a0020210]