Computational Approach to Annotating Variants of Unknown Significance in Clinical Next Generation Sequencing

Review

Wade L. Schulz, MD, PhD,1* Christopher A. Tormey, MD,1,2 Richard Torres, MD1

Lab Med Fall 2015;46:285-289
DOI: 10.1309/LMWZH57BRWOPR5RQ

ABSTRACT

Next generation sequencing (NGS) has become a common technology in the clinical laboratory, particularly for the analysis of malignant neoplasms. However, most mutations identified by NGS are variants of unknown clinical significance (VOUS). Although the approach to defining these variants differs by institution, software algorithms that predict the effect of a variant on protein function may be used. These algorithms commonly generate conflicting results, however, potentially adding uncertainty to interpretation. In this review, we examine several computational tools used to predict whether a variant has clinical significance. In addition to describing the role of these tools in clinical diagnostics, we assess their efficacy in analyzing known pathogenic and benign variants in hematologic malignancies.

Next generation sequencing (NGS) has gained significant traction in the clinical laboratory. Although many institutions can now sequence large regions of the genome, the ability to provide a clinical interpretation is a far more daunting task. It is predicted that every individual has approximately 10,000 genomic variants that result in an amino acid change and approximately 450 insertions and deletions compared with the human reference genome.1 Although some variants can be easily categorized as known pathogenic or benign mutations, only a few variants have been sufficiently characterized as such. Far more often, the healthcare professional is presented with a variant of unknown significance (VOUS), typically with no clinical or experimental data to aid in interpretation. The only alternatives for a healthcare provider are to evaluate significance based on available sequence, structural, and biologic data from disparate databases and to apply one of several computational tools that predict the biologic effect of novel variants.2-4

Abbreviations
NGS, next generation sequencing; VOUS, variant of unknown significance; SIFT, Sorting Intolerant from Tolerant; PANTHER, Protein Analysis through Evolutionary Relationships; dbSNP, Single Nucleotide Polymorphism Database; CF, cystic fibrosis; DNMT3A, DNA (cytosine-5)methyltransferase 3A; FLT3, Fms-like tyrosine kinase 3; IDH1, isocitrate dehydrogenase 1; IDH2, isocitrate dehydrogenase 2; KIT, v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog; TP53, tumor protein p53 gene

1Department of Laboratory Medicine, Yale University School of Medicine and 2Pathology and Laboratory Medicine Service, VA Connecticut Healthcare System, West Haven, CT

*To whom correspondence should be addressed. [email protected]

www.labmedicine.com

Early algorithms used to predict the effect of missense variants focused primarily on the similarity between amino acid properties such as charge and size.5 Although these predictions are easy to generate, their usefulness in predicting clinically meaningful phenotypes is limited.6 With advances in computing, several algorithms were developed that account for the evolutionary conservation of a sequence or the effect of a variant on predicted protein structure.6-8 The goal of such algorithms was to provide more accurate predictions of how novel variants affect protein function. Although it has been demonstrated that these tools have a higher degree of biochemical accuracy, their effectiveness in predicting clinical relevance is more limited.4,6,9,10 Because many sequence analysis packages include these predictions as part of the variant annotation pipeline, understanding how predictions are generated is important for their use in clinical interpretation. In this review, we present some of the most commonly used algorithms (Table 1) and their potential limitations for clinical variant interpretation, particularly within hematopathology.

Fall 2015  |  Volume 46, Number 4  Lab Medicine 


Downloaded from http://labmed.oxfordjournals.org/ by guest on January 21, 2017


Table 1. Summary of Algorithms Commonly Used for Variant Effect Prediction

Algorithm/Application | Strengths | Limitations
Grantham Score | Easily calculated, based on amino acid chemical composition; does not require any precomputed protein information (structure, domain) | Does not account for position; protein structure does not affect prediction
SIFT | Integrates protein homology into prediction; strong negative predictive value | Poor positive predictive value; protein structure does not directly impact prediction
PANTHER | Prediction based on evolutionary conservation; shown to have most accurate prediction model for human disease in several studies | Requires a large sequence database with significant curation
PolyPhen-2 | Prediction based on evolutionary conservation; integrates structural information when available; uses machine learning algorithm based on human protein data sets | Choice of training proteins can drastically alter prediction; requires precomputation and alignment of sequence/structural data

SIFT indicates Sorting Intolerant from Tolerant; PANTHER, Protein Analysis through Evolutionary Relationships.

Algorithms

Grantham Score

One of the first algorithms developed to predict variant effect was the Grantham score.5 This prediction, based on physicochemical similarity between amino acids, produces a score from 0 to 215, with lower scores indicating a more similar substitution. The concept of this algorithm is that substitutions with chemically similar amino acids (those with similar composition, polarity, and molecular volume) are more likely to be tolerated.5 This approach does not account for structural conformation or location within the peptide sequence. For example, a change from isoleucine to leucine anywhere within the sequence will generate the same score, even though some locations within the protein will logically be less tolerant to change than others (ie, enzymatic, binding, and other important structural domains).7,8 Because of the limited clinical usefulness of this approach, algorithms that also take evolutionary conservation of the sequence into account have gained popularity.1,3,11
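To make the position-independence of this score concrete, the following minimal Python sketch computes a Grantham-style distance from the three properties named above. The weighting constants and the per-residue property values are assumptions: they are the figures commonly reproduced from Grantham's 1974 table and should be verified against the original paper before any real use. The function name `grantham_distance` is hypothetical.

```python
import math

# Side-chain properties (composition, polarity, molecular volume) for a few
# residues. ASSUMPTION: values as commonly reproduced from Grantham (1974);
# verify against the original table before relying on them.
PROPERTIES = {
    "I": (0.00, 5.2, 111.0),   # isoleucine
    "L": (0.00, 4.9, 111.0),   # leucine
    "R": (0.65, 10.5, 124.0),  # arginine
    "C": (2.75, 5.5, 55.0),    # cysteine
    "W": (0.13, 5.4, 170.0),   # tryptophan
}

# ASSUMPTION: weights and scale factor as commonly quoted for the formula.
ALPHA, BETA, GAMMA = 1.833, 0.1018, 0.000399
SCALE = 50.723


def grantham_distance(ref: str, alt: str) -> float:
    """Weighted Euclidean distance between two residues' properties.

    Note that the sequence position is not an argument: the same
    substitution scores identically wherever it occurs in the protein.
    """
    c1, p1, v1 = PROPERTIES[ref]
    c2, p2, v2 = PROPERTIES[alt]
    return SCALE * math.sqrt(
        ALPHA * (c1 - c2) ** 2 + BETA * (p1 - p2) ** 2 + GAMMA * (v1 - v2) ** 2
    )


if __name__ == "__main__":
    for ref, alt in [("I", "L"), ("R", "C"), ("C", "W")]:
        print(f"{ref}>{alt}: {grantham_distance(ref, alt):.1f}")
```

Under these assumed values, the conservative isoleucine-to-leucine change yields a distance near the bottom of the scale, whereas arginine-to-cysteine and cysteine-to-tryptophan fall near the top. Because the function receives only the two residues, it cannot distinguish a substitution in a catalytic site from the same substitution in a tolerant region, which is the limitation described in the text.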

Sorting Intolerant from Tolerant (SIFT) and Protein Analysis through Evolutionary Relationships (PANTHER)

Evolutionary conservation refers to the sequence homology among members of the same protein family (ie, proteins with similar sequence and structure in different organisms). Several studies7,12,14 have demonstrated that mutations at evolutionarily conserved sites are more likely to be disease-related. On this basis, the SIFT application was developed to predict the likelihood that a variant is damaging based on sequence conservation (http://sift.jcvi.org/).7 The SIFT algorithm generates scores ranging from 0 to 1, with scores of less than 0.05 being suggestive of a damaging mutation. When applied to mutations annotated in the Swiss-Prot group database and the Single Nucleotide Polymorphism Database (dbSNP) as causing disease, SIFT has a sensitivity of approximately 69% and a false-positive rate of approximately 20%.7 Although these rates of accuracy are similar to those of other tools, this finding highlights the number of incorrect predictions that are possible, even with a small number of variants.

A second algorithm, Protein Analysis through Evolutionary Relationships (PANTHER), also generates predictions based on evolutionary conservation (http://www.pantherdb.org/). However, PANTHER also accounts for whether the amino acid position is known to be essential for function.8,13 PANTHER scores range from -12 to 0, with more negative scores having an increased probability of being damaging. The PANTHER score correlates to the probability that a mutation will be damaging, with a score of -3 indicating a 50% chance that a variant is deleterious.8 In a direct comparison study, SIFT and PANTHER were found to have significant limitations in their ability to predict human disease in patients with cystic fibrosis (CF).9 The comparison showed that PANTHER outperforms SIFT in distinguishing mutations that are benign and those that lead to symptomatic CF; however, the positive predictive value was less than 65% for either algorithm.9
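The score conventions described above can be illustrated with a short Python sketch. The SIFT cutoff of 0.05 and the sensitivity and false-positive figures come from the text. The logistic form used to convert a PANTHER score to a probability is an assumption, chosen because it reproduces the stated 50% probability at a score of -3; the function names and the "tolerated" label are hypothetical and do not represent either tool's interface.

```python
import math

SIFT_DAMAGING_CUTOFF = 0.05  # from the text: scores below this suggest damage


def sift_call(score: float) -> str:
    """Label a SIFT score; lower scores are more likely damaging."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT scores range from 0 to 1")
    return "damaging" if score < SIFT_DAMAGING_CUTOFF else "tolerated"


def panther_probability(score: float) -> float:
    """Map a PANTHER score to a probability of being deleterious.

    ASSUMPTION: a logistic curve centred on -3, so that a score of -3
    gives exactly 0.5 and more negative scores approach 1.
    """
    return 1.0 / (1.0 + math.exp(score + 3.0))


def expected_errors(n_pathogenic: int, n_benign: int,
                    sensitivity: float = 0.69,
                    false_positive_rate: float = 0.20):
    """Expected missed pathogenic variants and falsely flagged benign ones."""
    missed = n_pathogenic * (1.0 - sensitivity)
    false_alarms = n_benign * false_positive_rate
    return missed, false_alarms


if __name__ == "__main__":
    print(sift_call(0.01), sift_call(0.40))
    print(round(panther_probability(-3.0), 2))
    print(expected_errors(10, 10))
```

With the reported SIFT figures, a panel of only 10 pathogenic and 10 benign variants would be expected to contain roughly 3 missed pathogenic variants and 2 falsely flagged benign ones, which illustrates how quickly incorrect calls accumulate even for a small variant list.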

PolyPhen-2

Although PANTHER and SIFT can generate a prediction for proteins without a known or predicted secondary structure, one might expect the use of structural information to improve the accuracy of such algorithms.2,14,15 One of the most commonly used tools that accounts for protein structure is PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/). PolyPhen-2 uses sequence alignment and structural predictions, when available, and generates a final prediction based on an underlying machine-learning algorithm.15 Scores for this utility range from 0 to 1, with higher scores more likely to be damaging. A qualitative result is also assigned, with the thresholds for “probably damaging,” “possibly damaging,” and “benign” varying by version and training set.15 Two training sets exist: HumDiv, which is used to evaluate variants involved in complex traits, and HumVar, which is used to evaluate Mendelian variants. Although additional training sets could be developed, neither of the existing sources contains cancer-specific mutations. Two studies1,15 have shown that training set selection in structurally sensitive predictive tools such as PolyPhen-2 can significantly affect variant predictions. Notably, the comparison study that assessed the ability of SIFT and PANTHER to predict disease-causing mutations in CF found that PolyPhen-2 had worse predictive performance than either tool when used to assess these variants.9

Some of the algorithms mentioned previously, such as PANTHER, have shown promise for population-based genomics. In particular, they have shown satisfactory performance in screening CF mutations for clinical impact.9 However, their usefulness in predicting the effect of cancer-related mutations likely is more limited. Even detrimental mutations, such as nonsense and frameshift mutations, may be passenger mutations that have no effect in malignant cells, although they would likely be deleterious to normal cells.

Results

We compared the efficacy of these algorithms in predicting cancer-related mutations by examining several variants known to be associated with acute myeloid leukemia, including mutations in the DNA (cytosine-5)methyltransferase 3A (DNMT3A), Fms-like tyrosine kinase 3 (FLT3), isocitrate dehydrogenase 1 (IDH1), isocitrate dehydrogenase 2 (IDH2), and v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT) genes (Table 2).16-20 Several known benign polymorphisms in FLT3, IDH1, KIT, and tumor protein p53 (TP53) were also tested (Table 2). Similar to the findings reported by other studies, each of the tools had significant variation in the predicted scores for known pathogenic and known benign mutations (Figure 1). These results demonstrate that computational data by themselves are likely to misclassify novel variants. In this data set, PANTHER shows the most practical potential, with a high positive predictive value. Despite only moderate sensitivity, PANTHER values higher than 0.3, expressed as the probability that a variant is deleterious, correspond to clinically significant abnormalities. However, larger studies that include outcomes from targeted therapeutics in variants of unknown significance are needed to fully evaluate the usefulness of these tools in cancer variant prediction.

Table 2. Pathogenic Mutations and Known Polymorphisms Used to Assess Algorithm Performance

Variable | Protein | Transcript | Variants | PMID/dbSNP ID
Pathogenic | DNMT3A | NM_175629.2 | p.882R>C, p.882R>G, p.882R>S, p.882R>H, p.882R>L, p.882R>P | 22160010, 22898539
Pathogenic | FLT3 | NM_004119.2 | p.835D>H, p.835D>N, p.835D>Y, p.835D>A, p.835D>V, p.835D>E, p.836I>F, p.836I>L, p.836I>V, p.836I>M | 23261068
Pathogenic | IDH1 | NM_005896.3 | p.132R>C, p.132R>G, p.132R>S, p.132R>H, p.132R>L, p.132R>P | 22160010, 22898539
Pathogenic | IDH2 | NM_002168.3 | p.140R>G, p.140R>W, p.140R>L, p.140R>Q, p.140R>K, p.140R>M, p.140R>S | 22160010, 22898539
Pathogenic | KIT | NM_000222.2 | p.816D>H, p.816D>Y, p.816D>V | 19164557, 23714533
Polymorphism | FLT3 | NM_004119.2 | p.7D>G, p.178V>I, p.227T>M, p.557V>I | rs12872889, rs34218846, rs19333437, rs35958982
Polymorphism | IDH1 | NM_005896.3 | p.178V>I | rs34218846
Polymorphism | KIT | NM_000222.2 | p.541M>V, p.541M>L | rs3822214
Polymorphism | TP53 | NM_001276760.1 | p.33P>R | rs1042522

Discussion

The use of clinical genomics to diagnose and predict the outcome of disease is playing an increasingly integral role in laboratory diagnostics.

Figure 1. Comparison of Grantham, Sorting Intolerant from Tolerant (SIFT), Protein Analysis through Evolutionary Relationships (PANTHER), and PolyPhen-2 predictions for known pathogenic and benign variants. PANTHER scores are displayed as the probability that a variant is deleterious. Larger values for each predictive algorithm, except for SIFT, indicate a higher likelihood that a variant is deleterious. For SIFT predictions, values of less than 0.05, indicated by the line, are likely deleterious. [Figure not reproduced. Panels: Grantham (score), SIFT (score), PANTHER (probability deleterious), PolyPhen-2 HumDiv (score), and PolyPhen-2 HumVar (score), each comparing known pathogenic with known benign variants.]
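Comparisons such as the one in Figure 1 are usually summarized with sensitivity, specificity, and positive predictive value, and the Discussion notes that combining tools is expected to raise specificity at the cost of sensitivity. The Python sketch below shows that trade-off under a strict consensus rule. All calls and labels are made-up illustrative data, not results from this review, and the function names are hypothetical.

```python
def metrics(calls, truth):
    """Sensitivity, specificity, and PPV for boolean calls against truth."""
    pairs = list(zip(calls, truth))
    tp = sum(1 for c, t in pairs if c and t)
    fp = sum(1 for c, t in pairs if c and not t)
    fn = sum(1 for c, t in pairs if not c and t)
    tn = sum(1 for c, t in pairs if not c and not t)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
    }


def consensus(*per_tool_calls):
    """Flag a variant as damaging only when every tool flags it."""
    return [all(calls) for calls in zip(*per_tool_calls)]


# HYPOTHETICAL data: four known pathogenic and four known benign variants,
# with "damaging" calls (True) from two imaginary tools.
truth = [True, True, True, True, False, False, False, False]
tool_a = [True, True, True, False, True, False, False, False]
tool_b = [True, True, False, True, False, True, False, False]
both = consensus(tool_a, tool_b)

if __name__ == "__main__":
    for name, calls in [("tool A", tool_a), ("tool B", tool_b), ("consensus", both)]:
        print(name, metrics(calls, truth))
```

In this invented example each tool alone has a sensitivity and specificity of 0.75, whereas the consensus call reaches a specificity of 1.0 while its sensitivity falls to 0.5. Requiring agreement can only shrink the set of flagged variants, so the direction of this trade-off holds for any data, although its size depends on how correlated the tools' errors are.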

Our understanding of genetic variation has rapidly expanded, yet we still identify a significant number of variants of unknown significance. Despite the push to annotate and provide information about every variant obtained by sequencing, particular care is needed when predictive tools are used to annotate novel mutations. As reported for several variants related to cystic fibrosis, these tools have limited positive predictive value and fail to predict disease severity.9 Based on these data, none of these tools performs well by itself in a clinical setting. However, some studies have suggested that a combined score may improve the specificity of prediction. One such study4 has demonstrated that combining approaches such as evolutionary conservation and structural effect prediction leads to improved predictive accuracy. Using an approach that incorporates the predictions of multiple tools is reasonably expected to improve the specificity of detection of damaging variants, albeit at the cost of overall sensitivity. Although scores from these predictive algorithms are frequently included in reports from sequencing and alignment systems, these findings highlight the need to carefully assess the usefulness of these algorithms in the clinical setting.

Well-controlled studies to define every novel variant will not be possible in the short term. Hence, computational approaches will continue to play an important role in the evaluation of these mutations. Based on our clinical experience and published studies, PANTHER and other tools show potential for the screening of population-based or genome-wide variants.1,9,13 However, significant limitations remain in their use for evaluating variants in individual patients. Nevertheless, understanding the origin and performance of the various computational approaches available can be helpful in the interpretation of the large number of novel variants that are identified with next generation sequencing.

References

1. Peterson TA, Doughty E, Kann MG. Towards precision medicine: advances in computational approaches for the analysis of human variants. J Mol Biol. 2013;425(21):4047-4063.
2. Chun S, Fay JC. Identification of deleterious mutations within three human genomes. Genome Res. 2009;19(9):1553-1561.
3. Pabinger S, Dander A, Fischer M, et al. A survey of tools for variant analysis of next-generation genome sequencing data. Brief Bioinform. 2014;15(2):256-278.

4. Tavtigian SV, Greenblatt MS, Lesueur F, Byrnes GB, IARC Unclassified Genetic Variants Working Group. In silico analysis of missense substitutions using sequence-alignment based methods. Hum Mutat. 2008;29(11):1327-1336.
5. Grantham R. Amino acid difference formula to help explain protein evolution. Science. 1974;185(4154):862-864.
6. Marini NJ, Thomas PD, Rine J. The use of orthologous sequences to predict the impact of amino acid substitutions on protein function. PLoS Genet. 2010;6(5):e1000968. doi: 10.1371/journal.pgen.1000968.
7. Ng P, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003;31(13):3812-3814.
8. Thomas PD, Campbell MJ, Kejariwal A, et al. PANTHER: a library of protein families and subfamilies indexed by function. Genome Res. 2003;13(9):2129-2141.
9. Dorfman R, Nalpathamkalam T, Taylor C, et al. Do common in silico tools predict the clinical consequences of amino-acid substitutions in the CFTR gene? Clin Genet. 2010;77(5):464-473.
10. Brunham LR, Singaraja RR, Pape TD, Kejariwal A, Thomas PD, Hayden MR. Accurate prediction of the functional significance of single nucleotide polymorphisms and mutations in the ABCA1 gene. PLoS Genet. 2005;1(6):e83.
11. Chang X, Wang K. wANNOVAR: annotating genetic variants for personal genomes via the web. J Med Genet. 2012;49(7):433-436.
12. Ng PC, Henikoff S. Predicting deleterious amino acid substitutions. Genome Res. 2001;11(5):863-874.
13. Mi H, Dong Q, Muruganujan A, Gaudet P, Lewis S, Thomas PD. PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium. Nucleic Acids Res. 2010;38(Database issue):D204-210.
14. Sunyaev S, Ramensky V, Koch I, Lathe W 3rd, Kondrashov AS, Bork P. Prediction of deleterious human alleles. Hum Mol Genet. 2001;10(6):591-597.
15. Adzhubei IA, Schmidt S, Peshkin L, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7(4):248-249.
16. Smith CC, Shah NP. The role of kinase inhibitors in the treatment of patients with acute myeloid leukemia. Am Soc Clin Oncol Educ Book. 2013:313-318.
17. Gajiwala KS, Wu JC, Christensen J, et al. KIT kinase mutants show unique mechanisms of drug resistance to imatinib and sunitinib in gastrointestinal stromal tumor patients. Proc Natl Acad Sci U S A. 2009;106(5):1542-1547.
18. Döhner H, Gaidzik VI. Impact of genetic features on treatment decisions in AML. Hematology Am Soc Hematol Educ Program. 2011;2011:36-42.
19. Martelli MP, Sportoletti P, Tiacci E, Martelli MF, Falini B. Mutational landscape of AML with normal cytogenetics: biological and clinical implications. Blood Rev. 2013;27(1):13-22.
20. Shih AH, Abdel-Wahab O, Patel JP, Levine RL. The role of mutations in epigenetic regulators in myeloid malignancies. Nat Rev Cancer. 2012;12(9):599-612.
