Assessment of Disease‐Associated Sequence Variants and Considerations for Functional Validation using Mouse Models


Whole‐exome and whole‐genome sequencing approaches are rapidly becoming mainstream tools accessible to both basic researchers and clinical teams. Likewise, technological advances in genome editing, such as the CRISPR/Cas system, are poised to revolutionise model system research, making it more feasible to create animal models that truly recapitulate the human condition. However, procedures for identifying disease‐associated sequence variants are still far from robust, and many biological variables need to be considered when attempting to functionally validate such variants. In this article, we highlight the limitations and issues that should be considered at each stage of this process – from the filtering of sequencing data to the selection of variants, and from the selection of the model organism to the appropriate means of phenotyping.

Key Concepts

  • Researchers must appreciate the limitations of exome sequencing when considering candidate gene variants.
  • Researchers using sequencing services should ensure they receive the original BAM files of their sequencing data.
  • Common bioinformatic algorithms used to process sequencing data are predictive tools only.
  • Bioinformatic tools should not be used in isolation, nor should their outputs be taken as proof that a variant causes disease.
  • Gene expression in a tissue consistent with that affected in patients can be used to help prioritise candidate genes but is not evidence for causation.
  • Demonstration of a functional impact of a given variant in an in vitro assay is useful but does not necessarily mean it is responsible for the disease of interest.
  • The genetic background of the mouse strain(s) can significantly influence phenotypic presentation.
  • Researchers using animal models should consider the composition of animal chow when modelling a disease with considerable phenotypic variability.
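The point that bioinformatic and expression evidence helps prioritise candidates without proving causation can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Variant` fields, the gene names, and the thresholds (`max_af`, `min_cadd`) are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str                   # placeholder gene label (hypothetical)
    population_af: float        # allele frequency in a reference population
    cadd_phred: float           # CADD-style predicted deleteriousness score
    expressed_in_tissue: bool   # expressed in the tissue affected in patients
    functional_assay_hit: bool  # an in vitro assay showed a functional effect


def evidence_tally(v, max_af=0.001, min_cadd=20.0):
    """Count supporting lines of evidence; thresholds are illustrative assumptions."""
    checks = [
        v.population_af <= max_af,
        v.cadd_phred >= min_cadd,
        v.expressed_in_tissue,
        v.functional_assay_hit,
    ]
    return sum(checks)


def prioritise(variants):
    """Rank candidates by tally, highest first. A high rank is not proof of causation."""
    return sorted(variants, key=evidence_tally, reverse=True)


# Hypothetical candidate list for demonstration only.
candidates = [
    Variant("GENE_A", 0.0002, 27.5, True, False),
    Variant("GENE_B", 0.0300, 31.0, True, True),
    Variant("GENE_C", 0.0001, 12.0, False, False),
]

for v in prioritise(candidates):
    print(v.gene, evidence_tally(v))
```

In this sketch a variant with a positive in vitro assay but a high population frequency ties with a rare variant lacking assay data, which is one way to see why no single input should be read as evidence of causation.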

Keywords: mouse model; CRISPR; exome sequencing; SNP; genetic background

Figure 1. Variables that can impact success in identifying disease‐associated variants. (a) Overview of the methodology to generate a list of possible disease‐associated gene variants. Many inputs of data and knowledge, as well as assumptions, help create and prioritise a ‘manageable’ list of variants to be considered. None of these data or knowledge inputs is proof of causation of any given variant but, collectively, they provide evidence to support a role of the selected gene. Demonstration of a functional impact of the actual variant, and that this impact on function gives rise to the condition, is necessary. RVIS, residual variation intolerance score; GERP, Genomic Evolutionary Rate Profiling; CADD, Combined Annotation‐Dependent Depletion. (b) The detail and thoroughness of the clinical phenotyping (i.e. the strictness of the criteria and the objectivity of the phenotyping) can often be inversely correlated with the level of genetic heterogeneity expected in the cohort. (c) Schematic representation of the volume of exome variants ‘lost’ in the process of exome sequencing. When no clear candidate gene is found in exome studies, recheck your data (BAM files). The technical limitations of many exome sequencing platforms and the bioinformatics thresholds for data cleanup may be responsible: as many as 10–20% of actual variants (red asterisk) are excluded by current procedures.
Figure 2. Considerations for the functional validation of potential disease‐associated variants using mouse models. (a) Multiple factors should be considered when seeking to functionally validate variants using the mouse as the model system. All these factors can influence the phenotypic presentation of disease in mice and thus the interpretation of the variant being causal. (b) If creating a new mouse model, a number of genetic modification strategies should be considered. The choice will depend on many factors, including the predicted functional impact of the variant to be assessed, the tissue distribution of expression and the mode of inheritance expected. Each approach has pros and cons of which investigators should be aware and control for as necessary when assessing phenotypes. HR, homologous recombination; CRISPR, clustered regularly interspaced short palindromic repeats; Cas, CRISPR‐associated; TALEN, transcription‐activator‐like effector nuclease; KO, knockout; SNP, single nucleotide polymorphism.
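Figure 1(c) recommends rechecking the BAM files when no clear candidate gene emerges. One possible form of that check is sketched below in Python, assuming that mean read depth per capture target has already been summarised from the BAM file by a separate tool. The target names, the depth values, and the `min_depth` cut-off are hypothetical and serve only to show the idea.

```python
def low_coverage_targets(depth_by_target, min_depth=20):
    """Return (target, depth) pairs below min_depth, lowest depth first.

    min_depth is an illustrative assumption, not a recommended threshold.
    """
    flagged = [
        (name, depth)
        for name, depth in depth_by_target.items()
        if depth < min_depth
    ]
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical per-exon mean depths, e.g. summarised from a BAM file.
mean_depth = {
    "GENE_A:exon1": 4.2,
    "GENE_A:exon2": 85.0,
    "GENE_B:exon5": 17.9,
    "GENE_C:exon3": 60.3,
}

gaps = low_coverage_targets(mean_depth)
fraction_flagged = len(gaps) / len(mean_depth)

for name, depth in gaps:
    print(f"{name}\t{depth:.1f}x")
print(f"Fraction of targets below threshold: {fraction_flagged:.2f}")
```

Targets flagged this way are regions where a real variant could have been missed or filtered out, so they are reasonable places to look again before concluding that no candidate exists.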


Further Reading

Campino S, Forton J, Raj S, et al. (2008) Validating discovered Cis‐acting regulatory genetic variants: application of an allele specific expression approach to HapMap populations. PLoS One 3 (12): e4105.

Huang Q, Whitington T, Gao P, et al. (2014) A prostate cancer susceptibility allele at 6q22 increases RFX6 expression by modulating HOXB13 chromatin binding. Nature Genetics 46 (2): 126–135.

Lanthaler B, Wieser S, Deutschmann A, et al. (2014) Genotype‐based databases for variants causing rare diseases. Gene 550 (1): 136–140.

Prokunina L and Alarcón‐Riquelme ME (2004) Regulatory SNPs in complex diseases: their identification and functional validation. Expert Reviews in Molecular Medicine 6 (10): 1–15.

Richards S, Aziz N, Bale S, et al. (2015) Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine 17 (5): 405–424.

Spies N, Zook JM, Salit M and Sidow A (2015) svviz: a read viewer for validating structural variants. Bioinformatics 31 (24): 3994–3996.

Tucker T, Montpetit A, Chai D, et al. (2011) Comparison of genome‐wide array genomic hybridization platforms for the detection of copy number variants in idiopathic mental retardation. BMC Medical Genomics 4: 25.

Vallania FL, Druley TE, Ramos E, et al. (2010) High‐throughput discovery of rare insertions and deletions in large cohorts. Genome Research 20 (12): 1711–1718.

Vona B, Lechno S, Hofrichter MA, et al. (2016) Confirmation of PDZD7 as a nonsyndromic hearing loss gene. Ear and Hearing 37 (4): e238–e246.

How to Cite
Cox, Timothy C, and Cox, Liza L (Aug 2016) Assessment of Disease‐Associated Sequence Variants and Considerations for Functional Validation using Mouse Models. In: eLS. John Wiley & Sons Ltd, Chichester. [doi: 10.1002/9780470015902.a0026656]