Image Analysis Tools in Proteomics

Abstract

Image analysis tools have matured into a number of established commercial packages and freely available programs that underpin research in expression proteomics. With reference to these tools, we describe the challenges and methods for image analysis in two‐dimensional gel electrophoresis and in the emerging high‐throughput ‘shotgun’ proteomics platform of liquid chromatography coupled to mass spectrometry. Two workflows can be identified: the established pipeline of spot detection followed by spot matching, and a newer alternative pipeline of image alignment followed by consensus spot detection. In practice, image analysis is often viewed as a major bottleneck in proteomics. We therefore also touch on emerging research that aims to statistically model and integrate data earlier in the pipeline, minimising manual interaction and the subjective bias and restrictions that it brings.

Key Concepts:

  • Precise quantification and differential analysis of protein expression rely on semiautomated image analysis tools.

  • Liquid chromatography/mass spectrometry (LC/MS) datasets can be processed in a similar manner to 2‐D gel electrophoresis (2‐DE) by conversion to images with the mass to charge ratio on one axis and retention time on the other.

  • The conventional image analysis workflow involves detection of protein/peptide spots on each dataset followed by matching of corresponding spots between each dataset and a designated reference.

  • An alternative workflow has emerged, which warps the images into alignment before determining a consensus spot detection on the set of images as a whole.

  • Researchers need to understand the distinct challenges that 2‐DE and LC/MS each pose for analysis tools, such as varying operating conditions, artefacts and contaminants.

  • LC/MS analyses also include peptide deisotoping and charge state estimation steps, plus computation of normalised retention times for improved peptide identification.

  • Sound statistical treatment of the results is key, including the use of false discovery rate correction to address the multiple hypothesis testing problem.

  • With existing approaches, errors at each stage propagate and amplify as the pipeline progresses; therefore, emerging techniques are aiming to provide an integrated consensus analysis that borrows strength across the image set.

  • Consensus pinnacle detection and image‐based alignment and deconvolution are novel methods that can provide automated, reliable and robust quantification.

  • Functional data analysis presents a new statistical paradigm for image‐based differential detection able to compensate for multiple experimental factors and discover protein regulation in comigrated regions missed by spot‐centric approaches.
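
The conversion of an LC/MS dataset to an image, noted in the Key Concepts above, can be sketched as a simple binning step. This is a minimal illustration only and not the method of any package named in this article: the function name `rasterise_lcms`, the (retention time, m/z, intensity) tuple layout and the uniform bin grid are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# Hypothetical peak record: (retention_time, mz, intensity).
Peak = Tuple[float, float, float]


def rasterise_lcms(
    peaks: Sequence[Peak],
    rt_range: Tuple[float, float],
    mz_range: Tuple[float, float],
    n_rt_bins: int,
    n_mz_bins: int,
) -> List[List[float]]:
    """Sum peak intensities into a 2-D grid.

    Rows index retention time and columns index m/z, so the result can be
    treated like a gel image. Uniform bins are an assumption of this sketch.
    """
    rt_min, rt_max = rt_range
    mz_min, mz_max = mz_range
    rt_step = (rt_max - rt_min) / n_rt_bins
    mz_step = (mz_max - mz_min) / n_mz_bins
    image = [[0.0] * n_mz_bins for _ in range(n_rt_bins)]
    for rt, mz, intensity in peaks:
        # Ignore peaks that fall outside the requested window.
        if not (rt_min <= rt < rt_max and mz_min <= mz < mz_max):
            continue
        # Clamp guards against floating-point rounding at the upper edge.
        row = min(int((rt - rt_min) / rt_step), n_rt_bins - 1)
        col = min(int((mz - mz_min) / mz_step), n_mz_bins - 1)
        image[row][col] += intensity
    return image


# Illustrative input only; these values are made up for the example.
example_peaks = [(10.0, 500.2, 100.0), (10.4, 500.3, 50.0), (25.0, 799.9, 7.0)]
example_image = rasterise_lcms(example_peaks, (0.0, 30.0), (400.0, 800.0), 3, 4)
```

Real instruments sample m/z non-uniformly, so a practical tool would likely choose bin widths or interpolation to suit the analyser; the uniform grid here only conveys the idea of the two axes.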
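
The false discovery rate correction mentioned in the Key Concepts above can be illustrated with the Benjamini-Hochberg step-up procedure. The article does not name a specific procedure, so this choice is an assumption, as is the function name `benjamini_hochberg`; the sketch takes one p-value per spot or peptide and returns adjusted values in the same order.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumes one p-value per tested spot/peptide. A feature is then called
    significant at FDR level alpha if its adjusted value is <= alpha.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # keeping a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative p-values only; they are not taken from any study.
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
significant = [q <= 0.05 for q in example_q]
```

With thousands of spots tested per experiment, thresholding the raw p-values at 0.05 would be expected to yield many false positives, which is the problem this kind of adjustment is meant to control.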

Keywords: proteomics; two‐dimensional electrophoresis; liquid chromatography; mass spectrometry; image analysis; proteome informatics; spot detection; spot matching; chromatogram alignment; image registration

Figure 1.

Comparison of the spot detection and matching workflow of Melanie with the image alignment and consensus spot detection workflow of Progenesis SameSpots. Reproduced with permission from Dowsey et al. . Copyright Wiley‐VCH Verlag GmbH & Co. KGaA.

Figure 2.

Illustration of the profile (left) and 3D view (right) features in Melanie 4.

Figure 3.

Qualitative and quantitative comparisons in Melanie 4. Images can be shown in stacked and/or tiled modes. The dual‐channel display is shown at the lower left corner of the program window. Selected objects (highlighted in green on the screen) can be viewed on gels, histograms and reports.

Figure 4.

MSight workflow. (a) The 3D view highlights aid quality control of the input data and results. The alignment procedure is based on the use of landmarks to compensate for differences in elution time or migration distance. Lowercase letters ‘a’ to ‘e’ are potential landmarks. (b) The peak detection algorithm looks for areas of high‐intensity peaks to delineate their shapes. The deisotoping step then looks for the monoisotopic peaks of the same molecule, links them together (dashed lines connect isotopes) and determines ion charge states. Reproduced with permission from Dowsey et al. . Copyright Wiley‐VCH Verlag GmbH & Co. KGaA.

References

Almeida JS, Stanislaus R, Krug E and Arthur JM (2005) Normalization and analysis of residual variation in two‐dimensional gel electrophoresis for quantitative differential proteomics. Proteomics 5(5): 1242–1249.

America AHP and Cordewener JHG (2008) Comparative LC‐MS: a landscape of peaks and valleys. Proteomics 8(4): 731–749.

Anderle M, Roy S, Lin H, Becker C and Joho K (2004) Quantifying reproducibility for differential proteomics: noise analysis for protein liquid chromatography‐mass spectrometry of human serum. Bioinformatics 20(18): 3575–3582.

Andreev VP, Li L, Cao L et al. (2007) A new algorithm using cross‐assignment for label‐free quantitation with LC/LTQ‐FT MS. Journal of Proteome Research 6(6): 2186–2194.

Appel RD, Palagi PM, Walther D et al. (1997a) Melanie II – a third‐generation software package for analysis of two‐dimensional electrophoresis images: I. Features and user interface. Electrophoresis 18(15): 2724–2734.

Appel RD, Vargas JR, Palagi PM, Walther D and Hochstrasser DF (1997b) Melanie II – a third‐generation software package for analysis of two‐dimensional electrophoresis images: II. Algorithms. Electrophoresis 18(15): 2735–2748.

Berth M, Moser F, Kolbe M and Bernhardt J (2007) The state of the art in the analysis of two‐dimensional gel electrophoresis images. Applied Microbiology and Biotechnology 76(6): 1223–1243.

Biron DG, Brun C, Lefevre T et al. (2006) The pitfalls of proteomics experiments without the correct use of bioinformatics tools. Proteomics 6(20): 5577–5596.

Coombes KR, Tsavachidis S, Morris JS et al. (2005) Improved peak detection and quantification of mass spectrometry data acquired from surface‐enhanced laser desorption and ionization by denoising spectra with the undecimated discrete wavelet transform. Proteomics 5(16): 4107–4117.

Dijkstra M, Vonk RJ and Jansen RC (2007) SELDI‐TOF mass spectra: a view on sources of variation. Journal of Chromatography. B. Analytical Technologies in the Biomedical and Life Sciences 847(1): 12–23.

Dowsey AW, Dunn MJ and Yang G (2004) ProteomeGRID: towards a high‐throughput proteomics pipeline through opportunistic cluster image computing for two‐dimensional gel electrophoresis. Proteomics 4(12): 3800–3812.

Dowsey AW, English J, Lisacek F et al. (2010) Image analysis tools and emerging algorithms for expression proteomics. Proteomics 10(23): 4226–4257.

Dowsey AW, English J, Pennington K et al. (2006) Examination of 2‐DE in the human proteome organisation brain proteome project pilot studies with the new RAIN gel matching technique. Proteomics 6(18): 5030–5047.

Dowsey AW, Dunn MJ and Yang GZ (2008b) Automated image alignment for 2D gel electrophoresis in a high‐throughput proteomics pipeline. Bioinformatics 24(7): 950–957.

Dowsey AW and Yang G (2009) Automatic alignment, statistical restoration and quantification of raw LC/MS and 2‐DE data. Proceedings of 8th Annual World Congress of the Human Proteome Organisation (HUPO), Toronto, Canada, p. C523.

Dowsey AW and Yang GZ (2008a) The future of large‐scale collaborative proteomics. Proceedings of the IEEE 96(8): 1292–1309.

Efrat A, Hoffmann F, Kriegel K, Schultz C and Wenk C (2002) Geometric algorithms for the analysis of 2D‐electrophoresis gels. Journal of Computational Biology 9(2): 299–315.

Fischer B, Roth V and Buhmann JM (2009) Adaptive bandwidth selection for biomarker discovery in mass spectrometry. Artificial Intelligence in Medicine 45(2–3): 207–214.

Gibson F, Anderson L, Babnigg G et al. (2008) Guidelines for reporting the use of gel electrophoresis in proteomics. Nature Biotechnology 26(8): 863–864.

Gustafsson JS, Ceasar R, Glasbey CA, Blomberg A and Rudemo M (2004) Statistical exploration of variation in quantitative two‐dimensional gel electrophoresis data. Proteomics 4(12): 3791–3799.

Julka S and Regnier FE (2005) Recent advancements in differential proteomics based on stable isotope coding. Briefings in Functional Genomics & Proteomics 4(2): 158–177.

Karp NA, McCormick PS, Russell MR and Lilley KS (2007) Experimental and statistical considerations to avoid false conclusions in proteomics studies using differential in‐gel electrophoresis. Molecular & Cellular Proteomics 6(8): 1354–1364.

Klose J (1975) Protein mapping by combined isoelectric focusing and electrophoresis of mouse tissues. Humangenetik 26(3): 231–243.

Levänen B and Wheelock AM (2009) Troubleshooting image analysis in 2DE. Methods in Molecular Biology (Clifton, N.J.) 519: 113–129.

Li J (2002) Comparison of the capability of peak functions in describing real chromatographic peaks. Journal of Chromatography A 952(1–2): 63–70.

Li X, Yi EC, Kemp CJ, Zhang H and Aebersold R (2005) A software suite for the generation and comparison of peptide arrays from sets of data collected by liquid chromatography‐mass spectrometry. Molecular & Cellular Proteomics 4(9): 1328–1340.

Martens L, Chambers M, Sturm M et al. (in press) mzML – a community standard for mass spectrometry data. Molecular & Cellular Proteomics.

Miguel AC, Kearney‐Fischer M, Keane J et al. (2007) Near‐lossless compression of mass spectra for proteomics. Proceedings of 32nd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Honolulu, Hawaii, pp. 369–372.

Miura K (2003) Imaging technologies for the detection of multiple stains in proteomics. Proteomics 3(7): 1097–1108.

Morris JS, Baladandayuthapani JS, Clark BN, Wei W and Gutstein HB (2010) Evaluating the performance of new approaches to spot quantification and differential expression in 2‐dimensional gel electrophoresis studies. Journal of Proteome Research 9(1): 595–604.

Morris JS, Baladandayuthapani VB, Herrick RC, Sanna P and Gutstein HB (in press) Automated analysis of quantitative image data using isomorphic functional mixed models with applications to proteomics data. Annals of Applied Statistics.

Morris JS, Brown PJ, Herrick RC, Baggerly KA and Coombes KR (2008b) Bayesian analysis of mass spectrometry proteomic data using wavelet‐based functional mixed models. Biometrics 64(2): 479–489.

Morris JS and Carroll RJ (2006) Wavelet‐based functional mixed models. Journal of the Royal Statistical Society. Series B, Statistical Methodology 68(2): 179–199.

Morris JS, Clark BN and Gutstein HB (2008a) Pinnacle: a fast, automatic and accurate method for detecting and quantifying protein spots in 2‐dimensional gel electrophoresis data. Bioinformatics 24(4): 529–536.

Morris JS, Coombes KR, Koomen J, Baggerly KA and Kobayashi R (2005) Feature extraction and quantification for mass spectrometry in biomedical applications using the mean spectrum. Bioinformatics 21: 1764–1775.

Mueller LN, Rinner O, Schmidt A et al. (2007) SuperHirn – a novel tool for high resolution LC‐MS‐based peptide/protein profiling. Proteomics 7(19): 3470–3480.

O'Farrell PH (1975) High resolution two‐dimensional electrophoresis of proteins. Journal of Biological Chemistry 250(10): 4007–4021.

Palagi PM, Walther D, Quadroni M et al. (2005) MSight: an image analysis software for liquid chromatography‐mass spectrometry. Proteomics 5(9): 2381–2384.

Pleissner K, Hoffmann F, Kriegel K et al. (1999) New algorithmic approaches to protein spot detection and pattern matching in two‐dimensional electrophoresis gel databases. Electrophoresis 20(4–5): 755–765.

Prakash A, Mallick P, Whiteaker J et al. (2006) Signal maps for mass spectrometry‐based comparative proteomics. Molecular & Cellular Proteomics 5(3): 423–432.

Rogers M and Graham J (2007) Robust and accurate registration of 2‐D electrophoresis gels using point‐matching. IEEE Transactions on Image Processing 16(3): 624–635.

Rogers M, Graham J and Tonge RP (2003) Statistical models of shape for the analysis of protein spots in two‐dimensional electrophoresis gel images. Proteomics 3(6): 887–896.

Senko MW, Beu SC and McLafferty FW (1995) Determination of monoisotopic masses and ion populations for large biomolecules from resolved isotopic distributions. Journal of the American Society for Mass Spectrometry 6(4): 229–233.

Shin H, Mutlu M, Koomen JM and Markey MK (2007) Parametric power spectral density analysis of noise from instrumentation in MALDI TOF mass spectrometry. Cancer Informatics 3: 317–328.

Ünlü M, Morgan ME and Minden JS (1997) Difference gel electrophoresis. A single gel method for detecting changes in protein extracts. Electrophoresis 18(11): 2071–2077.

Vandenbogaert M, Li‐Thiao‐Té S, Kaltenbach H et al. (2008) Alignment of LC‐MS images, with applications to biomarker discovery and protein identification. Proteomics 8(4): 650–672.

Veeser S, Dunn MJ and Yang G (2001) Multiresolution image registration for two‐dimensional gel electrophoresis. Proteomics 1(7): 856–870.

Vestal ML (2009) Modern MALDI time‐of‐flight mass spectrometry. Journal of Mass Spectrometry 44(3): 303–317.

Wang P, Tang H, Fitzgibbon MP et al. (2007) A statistical method for chromatographic alignment of LC‐MS data. Biostatistics 8(2): 357–367.

Wheelock AM and Buckpitt AR (2005) Software‐induced variance in two‐dimensional gel electrophoresis image analysis. Electrophoresis 26(23): 4508–4520.

Wolters DA, Washburn MP and Yates JR (2001) An automated multidimensional protein identification technology for shotgun proteomics. Analytical Chemistry 73(23): 5683–5690.

Zimmer JS, Monroe ME, Qian W and Smith RD (2006) Advances in proteomics data analysis and display using an accurate mass and time tag approach. Mass Spectrometry Reviews 25(3): 450–482.

Further Reading

Clark BN and Gutstein HB (2008) The myth of automated, high‐throughput two‐dimensional gel analysis. Proteomics 8(6): 1197–1203.

Dowsey AW, Dunn MJ and Yang GZ (2003) The role of bioinformatics in two‐dimensional gel electrophoresis. Proteomics 3(8): 1567–1596.

Dowsey AW, Morris JS, Gutstein HB and Yang G (2010) Informatics and statistics for analyzing 2‐D gel electrophoresis images. In: Hubbard SJ and Jones AR (eds) Methods in Molecular Biology, vol. 604, pp. 239–255. New York: Humana Press.

Gutstein HB and Morris JS (2007) Laser capture sampling and analytical issues in proteomics. Expert Review of Proteomics 4(5): 627–637.

Gutstein HB, Morris JS, Annangudi SP and Sweedler JV (2008) Microproteomics: analysis of protein diversity in small samples. Mass Spectrometry Reviews 27: 316–330.

Morris JS, Baggerly KA, Gutstein HB and Coombes KR (2010) Statistical contributions to proteomic research. Methods in Molecular Biology 641: 143–166.

Mueller LN, Brusniak M, Mani DR and Aebersold R (2008) An assessment of software solutions for the analysis of mass spectrometry based quantitative proteomics data. Journal of Proteome Research 7(1): 51–61.

Wilkins MR, Williams KL, Appel RD and Hochstrasser DF (eds) (1997) Proteome Research: New Frontiers in Functional Genomics. Heidelberg: Springer‐Verlag.

How to Cite

Dowsey, Andrew W, English, Jane A, Lisacek, Frederique, Morris, Jeffrey S, Yang, Guang‐Zhong, and Dunn, Michael J (Jan 2011) Image Analysis Tools in Proteomics. In: eLS. John Wiley & Sons Ltd, Chichester. http://www.els.net [doi: 10.1002/9780470015902.a0006216.pub2]