Microarray Bioinformatics


Scientists deal with massive datasets that must be interpreted in the context of other databases of biomedical knowledge. Repositories such as HGMD and NCBI GEO accumulate vast amounts of data generated by arrays and next‐generation sequencing (NGS) techniques. Although there is no universal way to analyse microarray data, best practices have been highlighted for exploring the data obtained from multiplex measurements using microarrays.

Variant analysis, copy number aberration analysis, and RNA and proteome expression analysis using Somalogic aptamers on microarrays involve the processing of large numeric data tables linked to literature/text, categorical sequence data and, last but not least, unstructured clinical data of patients. The aim is to rank a list of variants, genes (RNA transcripts) and/or proteins that are differentially expressed (including coregulated and/or antiregulated molecules) in a biologically relevant and statistically significant manner under many different experimental conditions or disease states. Visual data‐mining approaches are used alongside advanced statistical methods in R to identify unexpected relationships beyond the original hypotheses, in support of clinical decision‐making in research or therapeutic settings. Software tools such as Instem/OmniViz, Tibco/Spotfire, Partek, Biodiscovery Nexus, Qiagen/Ingenuity Pathway Analysis and/or Qiagen/Ingenuity Variant Analysis help scientists to put their findings in the right context to stratify patients for targeted treatment.
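As a minimal illustration of the ranking step described above, the following Python sketch orders genes by the absolute value of Welch's t statistic between two groups of samples and reports the log2 fold change alongside it. The gene names, sample labels and intensities are invented toy values, and the functions `welch_t` and `rank_genes` are hypothetical helpers, not part of any tool named in this article. A real analysis would also derive p‐values and correct them for multiple testing before calling any gene significant.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for two groups of log2 expression values."""
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    if se == 0:
        return 0.0  # no variation in either group: treat as uninformative
    return (statistics.mean(a) - statistics.mean(b)) / se


def rank_genes(expression, group_a, group_b):
    """Return (gene, log2_fold_change, t) tuples sorted by |t|, largest first.

    expression maps gene -> {sample: log2 intensity}.
    """
    rows = []
    for gene, values in expression.items():
        a = [values[s] for s in group_a]
        b = [values[s] for s in group_b]
        # On the log2 scale the fold change is a difference of means.
        log2_fc = statistics.mean(a) - statistics.mean(b)
        rows.append((gene, log2_fc, welch_t(a, b)))
    return sorted(rows, key=lambda r: abs(r[2]), reverse=True)


# Toy data (assumed values): three disease (d) and three control (c) samples.
expression = {
    "GENE_UP": {"d1": 9.1, "d2": 9.4, "d3": 9.0, "c1": 6.0, "c2": 6.3, "c3": 5.9},
    "GENE_DOWN": {"d1": 4.0, "d2": 4.2, "d3": 3.9, "c1": 6.1, "c2": 6.0, "c3": 6.2},
    "GENE_FLAT": {"d1": 7.0, "d2": 7.3, "d3": 6.8, "c1": 7.1, "c2": 6.9, "c3": 7.2},
}

ranked = rank_genes(expression, ["d1", "d2", "d3"], ["c1", "c2", "c3"])
for gene, log2_fc, t in ranked:
    print(f"{gene}\tlog2FC={log2_fc:+.2f}\tt={t:+.2f}")
```

Ranking on the absolute statistic places up‐regulated and down‐regulated molecules in the same list, which matches the aim of capturing both coregulated and antiregulated candidates.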

Key Concepts

  • Bioinformatics uses computational tools to integrate and analyse biological and medical data, as discussed in this article.
  • Bioinformatics covers software to handle, acquire, store, integrate, archive, analyse and visualise OMIC data. As the field matures, the focus shifts from single experiments towards cross‐OMIC data integration of DNA, RNA and protein array and sequencing data.
  • Detection of mutations, deletions and amplifications.
  • Patient stratification in Primary Immunodeficiencies.
  • Targeted therapy in AML.
  • Integrative analysis of DNA and RNA OMICS data.
  • Web‐based resources for genome analysis.

Keywords: bioinformatics; gene expression analysis; patient stratification; molecular diagnostics; microarray; data analysis

Figure 1. Linking clinical data with molecular data for personalised treatment. Adapted correlation view of specimens from 285 patients with AML (1444 probe sets). The view displays pairwise correlations between the samples. The colours of the cells relate to Pearson correlation coefficient values, with deeper colours indicating higher positive (red) or negative (blue) correlations.
Figure 2. The Instem/OmniViz data‐mining software is used here to integrate dynamic analyses of multiple data sources. The microarray analysis (upper left) identifies novel, unforeseen relationships, but deciding which of the behaviours is worth pursuing requires further data. In this case, the microarray analysis is linked with investigation of protein expression (upper right), analysis of structures and high‐throughput screening of compounds known to interact with related targets (lower left) and contextual analysis of the vast scientific literature (lower right). By visualising and automatically linking these analyses simultaneously, it is possible to confirm known relationships and discover novel ones in the data.
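
The pairwise sample correlations shown in Figure 1 can be sketched in a few lines of Python. This is an illustrative sketch only: the three "patient" profiles below are invented toy values rather than data from the 285 AML specimens, and `pearson` and `correlation_matrix` are hypothetical helper names. In a real correlation view each profile would hold one value per probe set, and the matrix would usually be reordered by clustering before colouring.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return float("nan")  # correlation is undefined for a constant profile
    return cov / (norm_x * norm_y)


def correlation_matrix(profiles):
    """Return sample names and the square matrix of pairwise correlations."""
    names = list(profiles)
    matrix = [[pearson(profiles[a], profiles[b]) for b in names] for a in names]
    return names, matrix


# Toy expression profiles (assumed values), one list entry per probe set.
profiles = {
    "patient_A": [1.0, 2.0, 3.0, 4.0],
    "patient_B": [1.1, 2.0, 3.2, 3.9],  # similar to A: strong positive r
    "patient_C": [4.0, 3.0, 2.0, 1.0],  # mirror image of A: r close to -1
}

names, matrix = correlation_matrix(profiles)
for name, row in zip(names, matrix):
    print(name, " ".join(f"{r:+.2f}" for r in row))
```

Mapping each coefficient to a colour scale (red for positive, blue for negative, deeper for larger magnitude) then gives a display of the kind described in the caption.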


Further Reading

Brazma A, Hingamp P, Quackenbush J, et al. (2001) Minimum information about a microarray experiment (MIAME) – toward standards for microarray data. Nature Genetics 29: 365–371.

Eisen MB, Spellman PT, Brown PO and Botstein D (1998) Cluster analysis and display of genome‐wide expression patterns. Proceedings of the National Academy of Sciences of the United States of America 95: 14863–14868.

Gewin V (2005) A golden age of brain exploration. PLoS Biology 3 (1): e24. doi:10.1371/journal.pbio.0030024.

Lee Y, Sultana R, Pertea G, et al. (2002) Cross‐referencing eukaryotic genomes: TIGR orthologous gene alignments (TOGA). Genome Research 12: 493–502.

Monks A, Scudiero DA, Johnson GS, Paull KD and Sausville EA (1997) The NCI anti‐cancer drug screen: a smart screen to identify effectors of novel targets. Anti‐cancer Drug Design 12: 533–541.

Musumarra G, Condorelli DF, Scire S and Costa AS (2001) Shortcuts in genome‐scale cancer pharmacology research from multivariate analysis of the National Cancer Institute gene expression database. Biochemical Pharmacology 62: 547–553.

Stevens R, Tipney HJ, Wroe C, et al. (2004) Exploring Williams–Beuren syndrome using myGrid. Bioinformatics 20: i303–i310.

Wen X, Fuhrman S, Michaels GS, et al. (1998) Large‐scale temporal gene expression mapping of central nervous system development. Proceedings of the National Academy of Sciences of the United States of America 95: 334–339.

Web Links

Affymetrix https://www.thermofisher.com/us/en/home/life‐science/microarray‐analysis.html.

Allen Brain Atlas http://www.brain‐map.org/.

ArrayExpress https://www.ebi.ac.uk/arrayexpress/.

Bioconductor http://www.bioconductor.org/.

Biodiscovery Nexus http://www.biodiscovery.com/nexus‐copy‐number/.

BioMoby http://biomoby.open‐bio.org/.

BRB ArrayTools http://linus.nci.nih.gov/pilot/index.htm.

CBIIT https://cbiit.cancer.gov/.

DAVID http://david.abcc.ncifcrf.gov/.

Developmental Therapy Program (DTP) NCI/NIH https://dtp.cancer.gov/.

Ensembl http://www.ensembl.org/index.html.

European Bioinformatics Institute (EBI) http://www.ebi.ac.uk/.

Galaxy https://usegalaxy.org/workflow/list_published.

GeneCards http://www.genecards.org/.

Gene Ontology Consortium http://www.geneontology.org/.

GeneGo Metacore http://lsresearch.thomsonreuters.com/.

GenMAPP http://www.genmapp.org/.

GEO http://www.ncbi.nlm.nih.gov/geo/.

GoMiner http://discover.nci.nih.gov/gominer/.

iHOP http://www.ihop‐net.org/UniPub/iHOP/.

IMMgen http://www.immgen.com.

Ingenuity/Qiagen https://www.qiagenbioinformatics.com/.

Kyoto Encyclopedia of Genes and Genomes (KEGG) http://www.genome.ad.jp/kegg/.

limmaGUI http://bioinf.wehi.edu.au/limmaGUI/.

Mouse Genome Informatics (MGI) http://www.informatics.jax.org/.

myGrid http://www.mygrid.org.uk/?&MMN_position=1:1.

National Center for Biotechnology Information (NCBI) http://www.ncbi.nlm.nih.gov/.

OMIM http://www.ncbi.nlm.nih.gov/sites/entrez?db=OMIM.

OmniViz/Instem http://www.instem.com/.

Partek http://www.partek.com/.

R project http://www.r‐project.org/.

Reactome http://reactome.org.

Seahawk applet http://biomoby.open‐bio.org/CVS_CONTENT/moby‐live/Java/docs/Seahawk.html.

Spotfire https://spotfire.tibco.com/.

Taverna project http://www.taverna.org.uk/.

The Institute for Genomic Research (TIGR) http://www.jcvi.org/cms/home/.

How to Cite
Suratannon, Narissara, van Hagen, Martin, and van der Spek, Peter J (Jan 2018) Microarray Bioinformatics. In: eLS. John Wiley & Sons Ltd, Chichester. http://www.els.net [doi: 10.1002/9780470015902.a0005957.pub3]