Computational Prediction of Genetic Drivers in Cancer


Cancer is a complex genetic disease driven by somatic mutations in the genomes of cancer cells. Distinguishing pathogenic ‘driver’ mutations from non‐pathogenic ‘passenger’ mutations is a central task in applying cancer genomics to patient care. With the flood of genomic data from next‐generation sequencing, predictive algorithms are needed to separate the few pathogenic driver mutations from the far more numerous passenger mutations. Computational approaches are available for predicting cancer drivers at the mutation, gene and pathway levels. These algorithms rest on different statistical approaches, each with its own advantages and drawbacks. The current trend is to combine multiple, complementary methods to prioritise more accurately the driver candidates that could be targeted by therapy in the clinic.
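As an illustration of combining complementary methods, the following minimal sketch averages the ranks that several predictors assign to the same candidates. It is only one simple way to build a consensus, not the procedure of any particular published tool; the function names, method names, gene labels and scores are all hypothetical.

```python
from statistics import mean


def ranks(scores):
    """Rank candidates by descending score (1 = most driver-like).

    Tied candidates share the average of the ranks they span.
    """
    ordered = sorted(scores.items(), key=lambda item: -item[1])
    result = {}
    i = 0
    while i < len(ordered):
        j = i
        # Advance j past every candidate tied with position i.
        while j < len(ordered) and ordered[j][1] == ordered[i][1]:
            j += 1
        shared_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for name, _ in ordered[i:j]:
            result[name] = shared_rank
        i = j
    return result


def consensus(methods):
    """Order candidates by their mean rank across all methods.

    Only candidates scored by every method are kept, which is an
    assumption made here for simplicity.
    """
    per_method = [ranks(scores) for scores in methods.values()]
    shared = set.intersection(*(set(r) for r in per_method))
    mean_rank = {c: mean(r[c] for r in per_method) for c in shared}
    return sorted(mean_rank, key=lambda c: (mean_rank[c], c))


# Hypothetical scores (higher = more likely a driver), not real data.
toy_scores = {
    "method_1": {"GENE_A": 0.90, "GENE_B": 0.40, "GENE_C": 0.10},
    "method_2": {"GENE_A": 0.70, "GENE_B": 0.80, "GENE_C": 0.20},
    "method_3": {"GENE_A": 0.95, "GENE_B": 0.30, "GENE_C": 0.50},
}
print(consensus(toy_scores))
```

Working on ranks rather than raw scores sidesteps the fact that different predictors report scores on different scales, at the cost of discarding how far apart two candidates are within a single method.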

Key Concepts

  • Cancer is a disease driven by mutations in the genome.
  • Only a small fraction of mutations are drivers that are responsible for cancer initiation and progression.
  • Distinguishing drivers from passengers is essential for genomic medicine.
  • Computational prediction of drivers is challenging due to the complexity of biology and genomics.
  • Statistical and machine learning approaches have been applied to discover the signature of drivers.
  • A mutation can affect the function of multiple genes and pathways.
  • The function of a mutation is context‐dependent and can vary in different diseases.
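To make the statistical idea behind gene-level prediction concrete, the sketch below asks whether a gene is mutated in more tumour samples than chance alone would explain. It assumes a single uniform background mutation rate per base and independent samples, which is a deliberate simplification: background rates in real cancer genomes are heterogeneous, and practical methods model that. All function names, parameters and numbers are hypothetical.

```python
from math import comb


def prob_sample_mutated(mu, gene_length):
    """Chance that one sample carries at least one mutation in the gene,
    given a per-base background rate mu (assumed uniform)."""
    return 1.0 - (1.0 - mu) ** gene_length


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def recurrence_p_value(k_mutated, n_samples, mu, gene_length):
    """P-value for seeing k_mutated or more mutated samples by chance."""
    p = prob_sample_mutated(mu, gene_length)
    return binomial_upper_tail(k_mutated, n_samples, p)


# Hypothetical inputs: 200 samples, a 1500-base gene, background rate 1e-6.
for k in (1, 3, 10):
    print(k, recurrence_p_value(k, n_samples=200, mu=1e-6, gene_length=1500))
```

A small p-value only indicates that the gene is mutated more often than this simple background model predicts. Because context matters, as the key concepts note, a significant gene remains a candidate for follow-up rather than a confirmed driver.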

Keywords: cancer; driver; genomics; function; predictor; computational; mutation; tools; bioinformatics

Figure 1. An overview of the three broad approaches for predicting candidate drivers (mutations, genes and pathways) from somatic mutations.


Further Reading

Cooper GM and Shendure J (2011) Needles in stacks of needles: finding disease‐causal variants in a wealth of genomic data. Nature Reviews Genetics 12: 628–640.

Gonzalez‐Perez A, Deu‐Pons J and Lopez‐Bigas N (2012) Improving the prediction of the functional impact of cancer mutations by baseline tolerance transformation. Genome Medicine 4: 89. doi: 10.1186/gm390.

Grimm DG, Azencott CA, Aicheler F, et al. (2015) The evaluation of tools used to predict the impact of missense variants is hindered by two types of circularity. Human Mutation 36: 513–523.

Hou JP and Ma J (2013) Identifying driver mutations in cancer. In: Shen B (ed) Bioinformatics for Diagnosis, Prognosis and Treatment of Complex Diseases, vol. 4. pp. 33–56. Springer Netherlands: Springer Science+Business Media Dordrecht.

Kaminker JS, Zhang Y, Waugh A, et al. (2007) Distinguishing cancer‐associated missense mutations from common polymorphisms. Cancer Research 67: 465–473.

Shihab HA, Rogers MF, Gough J, et al. (2015) An integrative approach to predicting the functional effects of non‐coding and coding sequence variation. Bioinformatics 31: 1536–1543.

Zhang J, Liu J, Sun J, et al. (2014) Identifying driver mutations from sequencing data of heterogeneous tumors in the era of personalized genome sequencing. Briefings in Bioinformatics 15: 244–255.

How to Cite

Djotsa Nono, Alice B, Chen, Ken, and Liu, Xiaoming (Feb 2016) Computational Prediction of Genetic Drivers in Cancer. In: eLS. John Wiley & Sons Ltd, Chichester. [doi: 10.1002/9780470015902.a0025331]