THOMPSON & THOMPSON

GENETICS IN MEDICINE EIGHTH EDITION

Robert L. Nussbaum, MD, FACP, FACMG Holly Smith Chair of Medicine and Science Professor of Medicine, Neurology, Pediatrics and Pathology Department of Medicine and Institute for Human Genetics University of California San Francisco San Francisco, California

Roderick R. McInnes, CM, MD, PhD, FRS(C), FCAHS, FCCMG Alva Chair in Human Genetics Canada Research Chair in Neurogenetics Professor of Human Genetics and Biochemistry Director, Lady Davis Institute Jewish General Hospital McGill University Montreal, Quebec, Canada

Huntington F. Willard, PhD President and Director The Marine Biological Laboratory Woods Hole, Massachusetts and Professor of Human Genetics University of Chicago Chicago, Illinois

With Clinical Case Studies updated by:

Ada Hamosh, MD, MPH Professor of Pediatrics McKusick-Nathans Institute of Genetic Medicine Scientific Director, OMIM Johns Hopkins University School of Medicine Baltimore, Maryland


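[Figure 4-5 diagram. Top: the factor VIII gene, with repeat A upstream of the gene and repeat B in the intron between exons 22 and 23. Middle: mispairing and recombination between A and B. Bottom: the hemophilia A mutation, in which exons 1 through 22 are inverted and separated from exon 23 and the remainder of the gene.]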

Figure 4-5 Inverted homologous sequences, labeled A and B, located 500 kb apart on the X chromosome, one upstream of the factor VIII gene, the other in an intron between exons 22 and 23 of the gene. Intrachromosomal mispairing and recombination results in inversion of exons 1 through 22 of the gene, thereby disrupting the gene and causing severe hemophilia.


located in the coding portion of an exon, in an untranslated region of an exon, or even in an intron may expand during gametogenesis, in what is referred to as a dynamic mutation, and interfere with normal gene expression or protein function. An expanded repeat in the coding region will generate an abnormal protein product, whereas repeat expansion in the untranslated regions or introns of a gene may interfere with transcription, mRNA processing, or translation. How dynamic mutations occur is not completely understood; they are conceptually similar to microsatellite polymorphisms but expand at a rate much higher than typically seen for microsatellite loci. The involvement of simple nucleotide repeat expansions in disease is discussed further in Chapters 7 and 12. In disorders caused by dynamic mutations, marked parent-of-origin effects are well known and appear characteristic of the specific disease and/or the particular simple nucleotide repeat involved (see Chapter 12). Such differences may be due to fundamental biological differences between oogenesis and spermatogenesis but may also result from selection against gametes carrying certain repeat expansions.

VARIATION IN INDIVIDUAL GENOMES

The most extensive current inventory of the amount and type of variation to be expected in any given genome comes from the direct analysis of individual diploid human genomes. The first of such genome sequences, that of a male individual, was reported in 2007. Now, tens of thousands of individual genomes have been sequenced, some as part of large international research consortia exploring human genetic diversity in health and disease, and others in the context of clinical


sequencing to determine the underlying basis of a disorder in particular patients. What degree of genome variation does one detect in such studies? Individual human genomes typically carry 5 to 10 million SNPs, of which—depending in part on the population—as many as a quarter to a third are novel (see Box). This suggests that the number of SNPs described for our species is still incomplete, although presumably the fraction of such novel SNPs will decrease as more and more genomes from more and more populations are sequenced. Within this variation lie variants with known, likely, or suspected clinical impact. Based on studies to date, each genome carries 50 to 100 variants that have previously been implicated in known inherited conditions. In addition, each genome carries thousands of nonsynonymous SNPs in protein-coding genes around the genome, some of which would be predicted to alter protein function. Each genome also carries approximately 200 to 300 likely loss-of-function mutations, some of which are present at both alleles of genes in that individual. Within the clinical setting, this realization has important implications for the interpretation of genome sequence data from patients, particularly when trying to predict the impact of mutations in genes of currently unknown function (see Chapter 16). An interesting and unanticipated aspect of individual genome sequencing is that the reference human genome assembly still lacks considerable amounts of undocumented and unannotated DNA that are discovered in literally every individual genome being sequenced. These “new” sequences are revealed only as additional genomes are sequenced. Thus the complete collection of all human genome sequences to be found in our current population of 7 billion individuals, estimated to be 20

CHAPTER 4 — HUMAN GENETIC DIVERSITY: MUTATION AND POLYMORPHISM

to 40 Mb larger than the extant reference assembly, still remains to be fully elucidated. As impressive as the current inventory of human genetic diversity is, it is clear that we are still in a mode of discovery; no doubt millions of additional SNPs and other variants remain to be uncovered, as does the degree to which any of them might affect an individual’s clinical status in the context of wellness and health care.


VARIATION DETECTED IN A TYPICAL HUMAN GENOME

Individuals vary greatly in a wide range of biological functions, determined in part by variation among their genomes. Any individual genome will contain the following:
• ≈5-10 million SNPs (varies by population)
• 25,000-50,000 rare variants (private mutations or seen previously in <0.5% of individuals tested)
• ≈75 new base pair mutations not detected in parental genomes
• 3-7 new CNVs involving ≈500 kb of DNA
• 200,000-500,000 indels (1-50 bp) (varies by population)
• 500-1000 deletions 1-45 kb, overlapping ≈200 genes
• ≈150 in-frame indels
• ≈200-250 shifts in reading frame
• 10,000-12,000 synonymous SNPs
• 8,000-11,000 nonsynonymous SNPs in 4,000-5,000 genes
• 175-500 rare nonsynonymous variants
• 1 new nonsynonymous mutation
• ≈100 premature stop codons
• 40-50 splice site-disrupting variants
• 250-300 genes with likely loss-of-function variants
• ≈25 genes predicted to be completely inactivated

Clinical Sequencing Studies

In the context of genomic medicine, a key question is to what extent variation in the sequence and/or expression of one’s genome influences the likelihood of disease onset, determines or signals the natural history of disease, and/or provides clues relevant to the management of disease. As just discussed, variation in one’s constitutional genome can have a number of different direct or indirect effects on gene function. Sequencing of entire genomes (so-called whole-genome sequencing) or of the subset of genomes that include all of the known coding exons (so-called whole-exome sequencing) has been introduced in a number of clinical settings, as will be discussed in greater detail in Chapter 16. Both whole-exome and whole-genome sequencing have been used to detect de novo mutations (both point mutations and CNVs) in a variety of conditions of complex and/or unknown etiology, including, for example, various neurodevelopmental or neuropsychiatric conditions, such as autism, schizophrenia, epilepsy, or intellectual disability and developmental delay. Clinical sequencing studies can target either germline or somatic variants. In cancer, especially, various strategies have been used to search for somatic mutations in tumor tissue to identify genes potentially relevant to cancer progression (see Chapter 15).

PERSONAL GENOMICS AND THE ROLE OF THE CONSUMER

The increasing ability to sequence individual genomes is not only enabling research and clinical laboratories, but also spawning a social and information revolution among consumers in the context of direct-to-consumer (DTC) genomics, in which testing of polymorphisms genome-wide and even sequencing of entire genomes is offered directly to potential customers, bypassing health professionals. It is still largely unclear what degree of genome surveillance will be most useful for routine clinical practice, and this is likely to evolve rapidly in the case of specific conditions, as our knowledge increases, as professional practice guidelines are adopted, and as insurance companies react. Some groups have raised substantial concerns about privacy and about the need to regulate the industry. At the same time, however, other individuals are willing to make genome sequence data (and even medical information) available more or less publicly. Attitudes in this area vary widely among professionals and the general public alike, depending on whether one views knowing the sequence of one’s genome to be a fundamentally medical or personal activity. Critics of DTC testing and policymakers, in both the health industry and government, focus on issues of clinical utility, regulatory standards, medical oversight, availability of genetic counseling, and privacy. Proponents of DTC testing and even consumers themselves, on the other hand, focus more on freedom of information, individual rights, social and personal awareness, public education, and consumer empowerment. The availability of individual genome information is increasingly a commercial commodity and a personal reality. In that sense, and notwithstanding or minimizing the significant scientific, ethical, and clinical issues that lie ahead, it is certain that individual genome sequences will be an active part of medical practice for today’s students.

IMPACT OF MUTATION AND POLYMORPHISM

Although it will be self-evident to students of human genetics that new deleterious mutations or rare variants in the population may have clinical consequences, it may appear less obvious that common polymorphic variants can be medically relevant. For the proportion of polymorphic variation that occurs in the genes themselves, such loci can be studied by examining variation in the proteins encoded by the different alleles. It has long been estimated that any one individual is likely to carry two distinct alleles determining structurally differing polypeptides at approximately 20% of all

CHAPTER 10 — IDENTIFYING THE GENETIC BASIS FOR HUMAN DISEASE

critical and perhaps most fundamental finding of GWAS: the genetic architecture of some of the most common complex diseases studied to date may involve hundreds to thousands of loci harboring variants of small effect in many genes and pathways. These genes and pathways are important to our understanding of how complex diseases occur, even if each allele exerts only subtle effects on gene regulation or protein function and has only a modest effect on disease susceptibility on a per allele basis. Thus GWAS remain an important human genetics research tool for dissecting the many contributions to complex disease, regardless of whether or not the individual variants found to be associated with the disease substantially raise the risk for the disease in individuals carrying those alleles (see Chapter 16). We expect that many more genetic variants responsible for complex diseases will be successfully identified by genome-wide association and that deep sequencing of the regions showing disease associations should uncover the variants or collections of variants functionally responsible for disease associations. Such findings should provide us with powerful insights and potential therapeutic targets for many of the common diseases that cause so much morbidity and mortality in the population.

FINDING GENES RESPONSIBLE FOR DISEASE BY GENOME SEQUENCING

Thus far in this chapter, we have focused on two approaches to map and then identify genes involved in disease, linkage analysis and GWAS. Now we turn to a third approach, involving direct genome sequencing of affected individuals and their parents and/or other individuals in the family or population. The development of vastly improved methods of DNA sequencing, which has cut the cost of sequencing six orders of magnitude from what was spent generating the Human Genome Project’s reference sequence, has opened up new possibilities for discovering the genes and mutations responsible for disease, particularly in the case of rare mendelian disorders. As introduced in Chapter 4, these new technologies make it possible to generate a whole-genome sequence (WGS) or, in what may be a cost-effective compromise, the sequence of only the approximately 2% of the genome containing the exons of genes, referred to as a whole-exome sequence (WES).

FROM GWAS TO PHEWAS

In genome-wide association studies (GWAS), one explores the genetic basis for a given phenotype, disease, or trait by searching for associations with large, unbiased collections of DNA markers from the entire genome. But can one do the reverse? Can one uncover the potential phenotypic links associated with genome variants by searching for associations with large, unbiased collections of phenotypes from the entire “phenome?” Thus far, the results of this approach appear to be highly promising. In an approach dubbed phenome-wide association studies (PheWAS), genetic variants are tested for association, not just with a particular phenotype of interest (say, rheumatoid arthritis or systolic blood pressure above 160 mm Hg), but with all medically relevant phenotypes and laboratory values found in electronic medical records (EMRs). In this way, one can seek novel and unanticipated associations in an unbiased manner, using search algorithms, billing codes, and open text mining to query all electronic entries, which are fast becoming available for health records in many countries. As an illustration of this approach, SNPs for a major class II HLA-DRB1 haplotype (as described in Chapter 8) were screened against over 4800 phenotypes in EMRs from over 4000 patients; this PheWAS detected association not only with multiple sclerosis (as expected from previous studies), but also with alcohol-induced cirrhosis of the liver, erythematous conditions such as rosacea, various benign neoplasms, and several dozen other phenotypes. Although the potential of PheWAS is just being realized, such unbiased interrogation of vast clinical data sets may allow discovery of previously unappreciated comorbidities and/or less common side effects or drug-drug interactions in patients receiving prescribed drugs.

Filtering Whole-Genome Sequence or Whole-Exome Sequence Data to Find Potential Causative Variants

As an example of what is now possible, consider a family “trio” consisting of a child affected with a rare disorder and his parents. WGS is performed for all three, yielding typically over 4 million differences compared to the human genome reference sequence (see Chapter 4). Which of these variants is responsible for the disease? Extracting useful information from this massive amount of data relies on creating a variant filtering scheme based on a variety of reasonable assumptions about which variants are more likely to be responsible for the disease. One example of a filtering scheme that can be used to sort through these variants is shown in Figure 10-12.

1. Location with respect to protein-coding genes. Keep variants that are within or near exons of protein-coding genes, and discard variants deep within introns or intergenic regions. It is possible, of course, that the responsible mutation might lie in a noncoding RNA gene or in regulatory sequences located some distance from a gene, as introduced in Chapter 3. However, these are currently more difficult to assess, and thus, as a simplifying assumption, it is reasonable to focus initially on protein-coding genes.


[Figure 10-12 diagram. Filtering funnel: 4,000,000 variants → (removed: not located within or near an exon) → ~80,000 variants → (removed: too frequent in public databases) → ~1,500 variants → (removed: synonymous change with no effect on mRNA splicing) → ~200 variants. The remaining variants are then checked against "Fits AR inheritance pattern" and "Fits new mutation model", leaving ~4 variants and ~2 variants. Final questions: Which genes make biological sense? Same gene mutated in other affected individuals?]

Figure 10-12 Representative filtering scheme for reducing the millions of variants detected in whole-genome sequencing of a family consisting of two unaffected parents and an affected child to a small number that can be assessed for biological and disease relevance. The initial enormous collection of variants is winnowed down into smaller and smaller bins by applying filters that remove variants that are unlikely to be causative based on assuming that variants of interest are likely to be located near a gene, will disrupt its function, and are rare. Each remaining candidate gene is then assessed for whether the variants in that gene are inherited in a manner that fits the most likely inheritance pattern of the disease, whether a variant occurs in a candidate gene that makes biological sense given the phenotype in the affected child, and whether other affected individuals also have mutations in that gene. AR, Autosomal recessive; mRNA, messenger RNA.

2. Population frequency. Keep rare variants from step 1, and discard common variants with allele frequencies greater than 0.05 (or some other arbitrary number between 0.01 and 0.1), because common variants are highly unlikely to be responsible for a disease whose population prevalence is much less than the q² predicted by Hardy-Weinberg equilibrium (see Chapter 9).

3. Deleterious nature of the mutation. Keep variants from step 2 that cause nonsense or nonsynonymous changes in codons within exons, cause frameshift mutations, or alter highly conserved splice sites, and discard synonymous changes that have no predicted effect on gene function.

4. Consistency with likely inheritance pattern. If the disorder is considered most likely to be autosomal recessive, keep any variants from step 3 that are found in both copies of a gene in an affected child. The child need not be homozygous for the same deleterious variant but could be a compound heterozygote for two different deleterious mutations in the same gene (see Chapter 7). If the hypothesized mode of inheritance is correct, then the parents should both be heterozygous for the variants. If there were consanguinity in the parents, the candidate genes and variants might be further filtered by requiring that the child be a true homozygote for the same mutation derived from a single common ancestor (see Chapter 9). If the disorder is severe and seems more likely to be a new dominant mutation, because unaffected parents rarely if ever have more than one affected child, keep variants from step 3 that are de novo changes in the child and are not present in either parent.

In the end, millions of variants can be filtered down to a handful occurring in a small number of genes. Once the filtering reduces the number of genes and alleles to a manageable number, they can be assessed for other characteristics. First, do any of the genes have a known function or tissue expression pattern that would be expected if it were the potential disease gene? Is the gene involved in other disease phenotypes, or does it have a role in pathways with other genes in which mutations can cause similar or different phenotypes? Finally, is this same gene mutated in other patients with the disease? Finding mutations in one of these genes in other patients would then confirm this was the responsible gene in the original trio.
In some cases, one gene from the list in step 4 may rise to the top as a candidate because its involvement makes biological or genetic sense or it is known to be mutated in other affected individuals. In other cases, however, the gene responsible may turn out to be entirely unanticipated on biological grounds or may not be mutated in other affected individuals because of locus heterogeneity (i.e., mutations in other as yet undiscovered genes can cause a similar disease). Such variant assessments require extensive use of public genomic databases and software tools. These include the human genome reference sequence, databases of allele frequencies, software that assesses how deleterious an amino acid substitution might be to gene function, collections of known disease-causing mutations, and databases of functional networks and biological pathways. The enormous expansion of this information over the past few years has played a crucial role in facilitating gene discovery of rare mendelian disorders.
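The four filtering steps described above can be illustrated with a minimal Python sketch. It is not the method of any particular analysis pipeline: the `Variant` record, its field names, the consequence labels, and the gene names are all hypothetical, genotypes are simplified to a count of variant alleles (0, 1, or 2) in each member of the trio, and the 0.05 frequency cutoff is simply the example threshold quoted in step 2.

```python
from dataclasses import dataclass

# Hypothetical consequence labels standing in for step 3's "deleterious" classes.
DAMAGING = {"nonsense", "nonsynonymous", "frameshift", "splice_site"}


@dataclass(frozen=True)
class Variant:
    gene: str           # illustrative gene name
    near_exon: bool     # step 1: within or near a protein-coding exon
    allele_freq: float  # step 2: frequency in a public database
    consequence: str    # step 3: predicted effect on the gene product
    child: int          # variant allele count (0, 1, 2) in the affected child
    mother: int         # variant allele count in the mother
    father: int         # variant allele count in the father


def basic_filters(variants, max_freq=0.05):
    """Steps 1-3: location, population frequency, deleterious nature."""
    return [v for v in variants
            if v.near_exon
            and v.allele_freq <= max_freq
            and v.consequence in DAMAGING]


def recessive_candidates(variants):
    """Step 4, autosomal recessive model: both copies of a gene hit in the child."""
    by_gene = {}
    for v in variants:
        by_gene.setdefault(v.gene, []).append(v)
    hits = {}
    for gene, vs in by_gene.items():
        # Homozygous child with two heterozygous parents.
        homozygous = [v for v in vs
                      if v.child == 2 and v.mother == 1 and v.father == 1]
        # Compound heterozygote: one variant inherited from each parent.
        from_mother = [v for v in vs
                       if v.child == 1 and v.mother == 1 and v.father == 0]
        from_father = [v for v in vs
                       if v.child == 1 and v.father == 1 and v.mother == 0]
        if homozygous:
            hits[gene] = homozygous
        elif from_mother and from_father:
            hits[gene] = from_mother + from_father
    return hits


def de_novo_candidates(variants):
    """Step 4, new dominant mutation model: present in child, absent in both parents."""
    return [v for v in variants
            if v.child >= 1 and v.mother == 0 and v.father == 0]


# Invented example data, one variant per filtering outcome.
variants = [
    Variant("GENE_A", True, 0.001, "nonsense", 2, 1, 1),       # homozygous
    Variant("GENE_B", True, 0.002, "nonsynonymous", 1, 1, 0),  # maternal allele
    Variant("GENE_B", True, 0.0005, "frameshift", 1, 0, 1),    # paternal allele
    Variant("GENE_C", True, 0.0, "nonsynonymous", 1, 0, 0),    # de novo
    Variant("GENE_D", True, 0.30, "nonsynonymous", 2, 1, 1),   # too common
    Variant("GENE_E", False, 0.0, "nonsense", 1, 0, 0),        # not near an exon
    Variant("GENE_F", True, 0.001, "synonymous", 2, 1, 1),     # synonymous
]

kept = basic_filters(variants)
recessive = recessive_candidates(kept)
de_novo = de_novo_candidates(kept)
```

In this toy data set, seven variants are reduced to four by steps 1 through 3; the recessive model then retains GENE_A (homozygous) and GENE_B (compound heterozygous), and the new mutation model retains GENE_C. The surviving genes would still need the biological and cross-patient assessment described in the text.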

CHAPTER 16 — RISK ASSESSMENT AND GENETIC COUNSELING

large deletion involving one or more exons, attempts to sequence PCR products made from primers that fall into the deleted region are highly problematic. The sequencing will simply fail if the deletion is in an X-linked gene in a male or, even worse, can be misleading because it will yield only the sequence from the other copy of the gene on the homologous autosome. Duplications are even more challenging because they may yield a perfectly normal sequence unless the primers used for amplification happen to straddle the junction of a duplicated segment. For deletions and duplications, a variety of other methods are available that detect deletions or duplications by providing a quantitative measure of the copy number of the deleted or duplicated region. For most genetic conditions, the majority of pathogenic mutations are single nucleotide or small insertion/deletion mutations that are well detected by sequencing. One major exception is DMD, in which point mutations or small insertions or deletions account for only approximately 34% of mutations, whereas large deletions and insertions account for 60% and 6%, respectively, of the mutations in patients with DMD. In a patient with DMD, one might start with measuring the copy number of segments of DNA across the entire gene to look for deletion or duplication and, if normal, consider sequencing.
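The reasoning behind testing copy number first in DMD can be made concrete with a small Python sketch. The percentages are those quoted above; everything else is an assumption made for illustration: each assay is treated as detecting its mutation classes perfectly, and the large insertions are assumed to be detectable by a copy-number assay in the same way as duplications.

```python
# Mutation spectrum in DMD, in percent, as quoted in the text.
DMD_SPECTRUM = {
    "point_or_small_indel": 34,
    "large_deletion": 60,
    "large_insertion": 6,
}

# Assumption: idealized assays with complete sensitivity for the listed classes.
DETECTS = {
    "copy_number": {"large_deletion", "large_insertion"},
    "sequencing": {"point_or_small_indel"},
}


def cumulative_yield(order):
    """Return (assay, cumulative % of mutations detected) for a testing order."""
    found = set()
    result = []
    for assay in order:
        found |= DETECTS[assay]
        result.append((assay, sum(DMD_SPECTRUM[c] for c in found)))
    return result


copy_number_first = cumulative_yield(["copy_number", "sequencing"])
sequencing_first = cumulative_yield(["sequencing", "copy_number"])
```

Under these assumptions, starting with copy-number analysis identifies the mutation in 66% of patients at the first step, whereas starting with sequencing identifies it in only 34%; both orders reach the same total once the second test is added.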

Gene Panels and “Clinical Whole Exomes”

For many hereditary disorders (including hereditary retinal degeneration, deafness, hereditary breast and ovarian cancer, congenital myopathy, mitochondrial disorders, familial thoracic aortic aneurysm syndrome, and hypertrophic or dilated cardiomyopathies), there is substantial locus heterogeneity, that is, a large number of genes are known to be mutated in different families with these disorders. When faced with an individual patient with one of these highly heterogeneous disorders in whom the particular gene and mutations responsible for the disorder are not known, recent advances in DNA sequencing make it possible to analyze large panels of dozens to well over 100 genes simultaneously and cost-effectively for mutations in every gene in which mutations have been seen previously to cause the disorder. In disorders for which even a large panel of relevant genes cannot be formulated for a particular phenotypically defined disorder, diagnosis might still be possible by analyzing the coding exons of every gene (i.e., by whole-exome sequencing) or by sequencing the entire genome in a search for disease-causing mutations (see Chapter 4). For example, two reported series of so-called clinical whole exome testing, one from the United States and one from Canada, showed substantial success. In a 2013 study from the United States, 250 patients with primarily undiagnosed neurological disorders underwent whole-exome sequencing and 62 (≈25%) received a diagnosis. Interestingly, among the patients receiving a diagnosis, four were likely to have had two disorders at the same time, which made a clinical diagnosis very difficult because the patients’ phenotype did not match any single known disorder. In another study in 2014 by the Canadian FORGE Consortium, approximately 1300 patients representing 264 disorders known or suspected of being hereditary, but for which the genes involved were unknown, underwent whole-exome sequencing. Mutations highly likely to explain the disorders were found in 60%; at least half of the genes had not been previously known to be involved in human disease. Of great interest in both studies was that a large number of patients carried de novo disease-causing mutations in genes not previously suspected of causing disorders. These mutations, because they are de novo, are extremely difficult to find by standard gene discovery methods as described in Chapter 10, such as linkage or association, and therefore pose particular challenges for genetic counseling and risk assessment.

Variant Interpretation and “Variants of Unknown Significance”

The use of large gene panels and, even more so, whole-exome or whole-genome sequencing raises special issues for sequence interpretation and risk assessment. As the number of genes being studied increases, the number of differences between an individual’s sequence and that of an arbitrary reference sequence also increases; consequently many previously undescribed variants will be found whose pathogenetic significance is unknown. These are referred to as “variants of unknown significance” (VUSs). This is particularly the case for missense mutations that result in the substitution of one amino acid for another in the encoded protein. Interpreting variants is a challenging and demanding area for all professional geneticists engaged in providing molecular diagnostic services. The American College of Medical Genetics and Genomics has recommended that variants be assigned into one of five categories, ranging from definitely pathogenic to definitely benign (see Box). Only those variants with a high probability of being disease-causing are communicated to the medical provider and patient. It is a matter of debate whether a record of all VUSs should be retained by the testing laboratory and attached to a patient’s record, thereby remaining available for updating as new information becomes available to allow reclassification as either benign or pathogenic. Thus risk assessment and genetic counseling in this context are ongoing and iterative processes, continually evaluating newly available information and communicating this to medical providers and patients as appropriate.
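The five-category scheme and the iterative reclassification described above can be sketched as a small data structure. This is an illustration only, not a laboratory's actual reporting system: the class and function names are hypothetical, the five labels follow the box on assessing clinical significance, and treating "pathogenic" and "likely pathogenic" as the categories with "a high probability of being disease-causing" is an assumption made for this sketch.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Classification(IntEnum):
    """Five-level scale, ordered from benign to pathogenic."""
    BENIGN = 1
    LIKELY_BENIGN = 2
    UNCERTAIN_SIGNIFICANCE = 3
    LIKELY_PATHOGENIC = 4
    PATHOGENIC = 5


def is_reported(classification):
    """Assumed reporting rule: only the two most pathogenic tiers are communicated."""
    return classification >= Classification.LIKELY_PATHOGENIC


@dataclass
class VariantRecord:
    """A retained variant whose classification can be revised as evidence accrues."""
    name: str
    classification: Classification
    history: list = field(default_factory=list)

    def reclassify(self, new_classification, reason):
        # Keep an audit trail of (old, new, reason) so updates stay traceable.
        self.history.append((self.classification, new_classification, reason))
        self.classification = new_classification


# Invented example: a VUS that is later upgraded on new evidence.
record = VariantRecord("hypothetical missense variant",
                       Classification.UNCERTAIN_SIGNIFICANCE)
reported_before = is_reported(record.classification)
record.reclassify(Classification.LIKELY_PATHOGENIC,
                  "seen in other affected patients")
reported_after = is_reported(record.classification)
```

Retaining the record together with its history is what makes the later update possible; a laboratory that discarded VUSs after testing could not revisit them when new information appeared.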


ASSESSING THE CLINICAL SIGNIFICANCE OF A GENE VARIANT • The American College of Genetics and Genomics recommends that all variants detected during gene sequencing (whether from targeted, whole-exome, or whole-genome sequencing) be classified on a five-level scale, spanning pathogenic, likely pathogenic, of uncertain significance, likely benign, and benign variants. Specialists in molecular diagnostics, human genomics, and bioinformatics have developed a series of criteria for assessing where a mutation sits among these five categories. In the vast majority of cases, none of these criteria is absolutely definitive but must be considered together to provide an overall assessment of how likely any variant is to be pathogenic. These criteria include the following: • Population frequency—If a variant has been seen frequently in a sizeable fraction of normal individuals (>2% of the population), it is considered less likely to be disease causing. Being frequent is, however, no guarantee a variant is benign because autosomal recessive conditions or disorders with low penetrance may be due to a disease-causing variant that may be surprisingly common among unaffected individuals because most carriers will be asymptomatic. Conversely, the vast majority of variants (>98%) found when sequencing a large gene panel or in a whole-exome or whole-genome sequence are rare (occur in 1% of the population or less), so being rare is no guarantee it is disease causing! • In silico assessment—There are many software tools designed to evaluate how likely a missense variant is to be pathogenic by determining if the amino acid at that position is highly conserved or not in orthologous proteins in other species and how likely it is that a particular amino acid substitution would be tolerated. Such tools are less than precise and are generally never used by themselves for categorizing variants for clinical use. They are, however, improving with time and are playing a role in variant assessment. 
A comparable set of bioinformatic tools is being developed to assess the pathogenicity of other types of variants, such as potential splice site variants or even noncoding sequence variants. • Functional data—If a particular variant has been shown to affect in vitro biochemical activity, a function in cultured cells, or the health of a model organism, then it is less likely to be benign. However, it remains possible that a particular variant will appear benign by these criteria and still be disease-causing in humans because of a prolonged human life span, environmental triggers, or

Another important aspect of how to use molecular and genome-based diagnostic testing in families is the selection of the best person(s) to test. If the consultand is also the affected proband, then molecular testing is appropriate. If, however, the consultand is an unaffected, at-risk individual, with an affected relative serving as the indication for having genetic counseling, it is best to test the affected person rather than the consultand, if logistically possible. This is because a negative mutation test in the consultand is a so-called







compensatory genes in the model organism not present in humans. Segregation data—If a particular variant has been seen to be coinherited with a disease in one or more families, or, conversely, does not track with a disease in the family under investigation, then it is more or less likely to be pathogenic. Of course, when only a few individuals are affected, the variant and disease may appear to track by random chance; the number of times a variant and disease must be coinherited to be considered not by chance alone is not firmly fixed but is generally accepted to be at least 5, if not 10. Finding affected individuals in the family who do not carry the variant would be strong evidence against the variant being pathogenic, but finding unaffected individuals who do carry the variant is less persuasive if the disorder is known to have reduced penetrance. De novo mutation—The appearance of a severe disorder in a child along with a new mutation in a coding exon that neither parent carries (de novo mutation) is additional evidence the variant is likely to be pathogenic. However, between 1 and 2 new mutations occur in the coding regions of genes in every child (see Chapter 4), and so the fact that a mutation is de novo is not definitive for the mutation being pathogenic. Variant characterization—A variant may be a synonymous change, a missense mutation, a nonsense mutation, a frameshift with a premature termination downstream, or a highly conserved splice site mutation. The impact on the function of the gene can be inferred but, once again, is not definitive. For example, a synonymous change that does not change an amino acid codon might be thought to be benign but may have deleterious effects on normal splicing and be pathogenic (see examples in Chapter 12). Conversely, premature termination or frameshift mutations might be considered to be always deleterious and disease causing. 
However, such mutations occurring at the far 3′ end of a gene may result in a truncated protein that is still quite capable of functioning and therefore be a benign change. Prior occurrence—A variant that has been seen before multiple times in affected patients, as recorded in collections of variants found in patients with a similar disorder, is important additional evidence for the variant being pathogenic. Even if a missense variant is novel, that is, has never been described before, it is more likely to be pathogenic if it occurs at the same position in the protein where other known pathogenic missense mutations have occurred.

uninformative negative; that is, we do not know if the test was negative because (1) the gene or mutation responsible for disease in the proband was not covered by the test, or (2) the consultand in fact did not inherit a variant that we could have detected had we found the disease-causing variant in the affected proband in the family. Once the mutation or mutations responsible for a particular disorder are found in the proband, then the other members of the family no longer need comprehensive gene sequencing. The DNA of family members can

CHAPTER 16 — RISK ASSESSMENT AND GENETIC COUNSELING

be assessed with less expensive testing only for the presence or absence of the specific mutations already found in the family. If a family member tests negative under these circumstances, the test is a “true” negative that eliminates any elevated risk due to his or her having an affected relative.



PROBLEMS

1. You are consulted by a couple, Dorothy and Steven, who tell the following story. Dorothy’s maternal grandfather, Bruce, had congenital stationary night blindness, which also affected Bruce’s maternal uncle, Arthur; the family history appears to fit an X-linked inheritance pattern. (There is also an autosomal dominant form.) Whether Bruce’s mother was affected is unknown. Dorothy and Steven have three unaffected children: a daughter, Elsie, and two sons, Zack and Peter. Elsie is planning to have children in the near future. Dorothy wonders whether she should warn Elsie about the risk that she might be a carrier of a serious eye disorder. Sketch the pedigree, and answer the following.
   a. What is the chance that Elsie is heterozygous?
   b. An ophthalmologist traces the family history in further detail and finds evidence that in this pedigree, the disorder is not X-linked but autosomal dominant. There is no evidence that Dorothy’s mother Rosemary was affected. On this basis, what is the chance that Elsie is heterozygous?

2. A deceased boy, Nathan, was the only member of his family with Duchenne muscular dystrophy (DMD). He is survived by two sisters, Norma (who has a daughter, Olive) and Nancy (who has a daughter, Odette). His mother, Molly, has two sisters, Maud and Martha. Martha has two unaffected sons and two daughters, Nora and Nellie. Maud has one daughter, Naomi. No carrier tests are available because the mutation in the affected boy remains unknown.
   a. Sketch the pedigree, and calculate the posterior risks for all these females, using information provided in this chapter.
   b. Suppose prenatal diagnosis by DNA analysis is available only to women with more than a 2% risk that a pregnancy will result in a son with DMD. Which of these women would not qualify?

3. In a village in Wales in 1984, 13 boys were born in succession before a girl was born. What is the probability of 13 successive male births? What is the probability of 13 successive births of a single sex? What is the probability that after 13 male births, the 14th child will be a boy?

4. Let H be the population frequency of carriers of hemophilia A. The incidence of hemophilia A in males (I) equals the chance that a maternal F8 gene has a new mutation (μ) from a noncarrier mother plus the chance it was inherited as a preexisting mutation from a carrier mother (1/2 × H). Adding these two terms gives I = μ + (1/2 × H). H is the chance a carrier inherits the mutation from a surviving, reproducing affected father (I × f) (where f is the fitness of hemophilia) plus the chance of a new paternal mutation (μ) plus the chance of a new maternal mutation (μ) plus the chance of inheriting it from a carrier mother (1/2 × H). Adding these four terms gives H = (I × f) + μ + μ + (1/2 × H).
   a. If hemophilia A has a fitness (f) of approximately 0.70, that is, hemophiliacs have approximately 70% as many offspring as do controls, then what is the incidence of affected males? Of carrier females? (Answer in terms of multiples of the mutation rate.) If a woman has a son with an isolated case of hemophilia A, what is the risk that she is a carrier? What is the chance that her next son will be affected?
   b. For DMD, f = 0. What is the population frequency of affected males? Of carrier females?
   c. Color blindness is thought to have normal fitness (f = 1). What is the incidence of carrier females if the frequency of color blind males is 8%?

5. Ira and Margie each have a sibling affected with cystic fibrosis.
   a. What are their prior risks for being carriers?
   b. What is the risk for their having an affected child in their first pregnancy?
   c. They have had three unaffected children and now wish to know their risk for having an affected child. Using Bayesian analysis to take into consideration
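The two equilibrium equations in Problem 4 form a linear system in I and H, and a short sketch may help in checking the algebra. Substituting H = 2(I × f) + 4μ (from the second equation) into the first gives I = 3μ/(1 − f) and H = (4 + 2f)μ/(1 − f). The function below, whose name and return layout are illustrative rather than taken from the text, reports both as multiples of μ, along with H/(2I), the share of affected males whose mutation came from a carrier mother. It assumes 0 ≤ f < 1, because these equations have no finite solution at f = 1.

```python
from typing import Tuple


def x_linked_equilibrium(fitness: float) -> Tuple[float, float, float]:
    """Solve I = mu + H/2 and H = I*f + 2*mu + H/2 for an X-linked disorder.

    Returns (I/mu, H/mu, H/(2*I)): incidence in males and carrier frequency
    as multiples of the mutation rate, and the fraction of affected males
    who inherited the mutation from a carrier mother.
    """
    if not 0.0 <= fitness < 1.0:
        raise ValueError("this closed form requires 0 <= fitness < 1")
    incidence = 3.0 / (1.0 - fitness)                    # I in units of mu
    carriers = (4.0 + 2.0 * fitness) / (1.0 - fitness)   # H in units of mu
    inherited_fraction = carriers / (2.0 * incidence)    # (H/2) / I
    return incidence, carriers, inherited_fraction


# Fitness values taken from parts (a) and (b) of Problem 4.
for f in (0.70, 0.0):
    i, h, share = x_linked_equilibrium(f)
    print(f"f = {f:.2f}: I = {i:.1f} mu, H = {h:.1f} mu, carrier mother share = {share:.3f}")
```

For f = 0 the last value reduces to 2/3, the familiar prior chance that the mother of an isolated case of a genetic lethal X-linked disorder is a carrier.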


in African Americans. One may anticipate that as the cost of comprehensive sequencing falls, allele-specific methods with less than 100% sensitivity may be superseded, but, for the near future, the cost-effectiveness of allele-specific methods remains a reasonable argument for their continued use in the appropriate setting.

As the cost of mutation detection using allele-specific detection methods has fallen, it is becoming much less compelling that carrier screening needs to be restricted to a small number of mutant alleles common in certain ethnic groups in genes that are known to be associated with disease. It is possible now to obtain expanded carrier screening beyond disorders common to particular ethnic groups, such as cystic fibrosis, sickle cell trait, or thalassemia, to include carrier status for more than 100 additional autosomal recessive and X-linked disorders. With the use of sequencing instead of allele-specific detection methods, there is no longer any limit on which genes and which mutant alleles in these genes can theoretically be detected. Rare mutant alleles in genes associated with known disease will be found, thereby raising the sensitivity of carrier detection methods. Sequencing, however, also has the ability to uncover variants, particularly missense changes, of unknown pathogenicity in disease genes as well as in genes whose role in the disease is unknown (see Chapter 16). Unless great care is taken in assessing the clinical validity of rare variants detected by sequencing, the frequency of false-positive carrier test results will increase.

The impact of carrier screening in lowering the incidence of a genetic disease can be dramatic. Carrier screening for Tay-Sachs disease in the Ashkenazi Jewish population has been carried out since 1969. Screening followed by prenatal diagnosis, when indicated, has already lowered the incidence of Tay-Sachs disease by 65% to 85% in this ethnic group. In contrast, attempts to screen for carriers of sickle cell disease in the U.S. African American community have been less effective and have had little impact on the incidence of the disease so far. The success of carrier screening programs for Tay-Sachs disease, as well as the relative failure for sickle cell anemia, underscores the importance of community consultation, community engagement, and the availability of genetic counseling and prenatal diagnosis as critical requirements for an effective program.

PERSONALIZED GENOMIC MEDICINE

More than a century ago, the British physician-scientist Archibald Garrod proposed the concept of chemical individuality, in which each of us differs in our health status and susceptibility to various illnesses because of our individual genetic makeup. Indeed, in 1902, he wrote:

…the factors which confer upon us our predisposition and immunities from disease are inherent in our very chemical structure, and even in the molecular groupings which went to the making of the chromosomes from which we sprang.

The goal of personalized genomic medicine is to use knowledge of an individual’s genetic variants relevant to maintaining health or treating illness as a routine part of medical care. Now, more than a hundred years after Garrod’s visionary pronouncement, in the era of human genomics, we have the means to assess an individual’s genotype at all relevant loci by whole-genome sequencing (WGS) or, less comprehensively, by whole-exome sequencing (WES) to characterize the genetic underpinnings of each person’s unique “chemical individuality.”

In addition to genomic approaches to prenatal screening of the fetus for aneuploidy by maternal cell-free DNA, as described in Chapter 17, WGS and WES are being studied for analyzing fetal DNA obtained by invasive procedures, for newborn screening, for screening asymptomatic adults for increased predisposition to various diseases, for identifying, before conception, couples who are heterozygotes for autosomal recessive or X-linked diseases that could affect their children, and for finding pharmacogenetic variants relevant to drug therapy.

The National Health Service of the United Kingdom is preparing to sequence the genomes of 100,000 people by 2017, with the eventual aim of having the sequence of every individual in the country in a database to use for developing personalized prevention and treatment. Hospitals, pharmaceutical companies, and the U.S. Department of Veterans Affairs are also beginning large-scale sequencing of hundreds of thousands of individuals. Although these efforts are focusing initially on mining the data for genetic variants that contribute to disease or for finding novel drug targets, they are also proposing to study how to use the genomic information to design personalized prevention and treatment strategies.

The application of WGS and WES to personalized medicine is not without controversy, however. One issue is cost. Although sequencing per se is many orders of magnitude less expensive now than when the original Human Genome Project was being carried out, the interpretation of such sequences remains very time consuming and expensive. Despite the time and effort put into interpretation, we are still unable to assign any clinical significance to the vast majority of all variants found through sequencing. There is widespread concern that individuals and their health care providers, when confronted with variants of uncertain significance (see Chapter 16), will seek additional expensive and unnecessary testing, with all the attendant expense and potential for complications that result from any medical test. There is the additional concern that even when a variant is known to be pathogenic and shown to be highly penetrant in families with multiple affected individuals,

CHAPTER 18 — APPLICATION OF GENOMICS TO MEDICINE AND PERSONALIZED HEALTH CARE

the actual penetrance when the variant is found through population screening in individuals with a negative family history may be much less. Personalized genomic medicine is only one component of precision medicine, which, in its broadest sense, requires health care providers to merge genomic information with other kinds of information, such as physiological or biochemical measures, developmental history, environmental exposures, and social experiences. The ultimate goal is to provide more precise diagnosis, counseling, preventive intervention, management, and therapy. This effort has begun, but a lot of work remains before personalized genomic medicine becomes a part of mainstream medicine.



PROBLEMS

1. In a population sample of 1,000,000 Europeans, idiopathic cerebral vein thrombosis (iCVT) occurred in 18, consistent with an expected rate of 1 to 2 per 100,000. All the women were tested for factor V Leiden (FVL). Assuming an allele frequency of 2.5% for FVL, how many homozygotes and how many heterozygotes for FVL would you expect in this sample of 1,000,000 people, assuming Hardy-Weinberg equilibrium? Among the affected individuals, two were heterozygotes for FVL and one was homozygous for FVL. Set up a 3 × 2 table for the association of the homozygous FVL genotype, the heterozygous FVL genotype, and the wild-type genotype for iCVT. What is the relative risk for iCVT in a FVL heterozygote versus the wild-type genotype? What is the risk in a FVL homozygote versus wild-type? What is the sensitivity of testing positive for either one or two FVL alleles for iCVT? Finally, what is the positive predictive value of being homozygous for FVL? Heterozygous?

2. In a population sample of 100,000 European women taking oral contraceptives, deep venous thrombosis (DVT) of the lower extremities occurred in 100, consistent with an expected rate of 1 per 1,000. Assuming an allele frequency of 2.5% for factor V Leiden (FVL), how many homozygotes and how many heterozygotes for FVL would you expect in this sample of 100,000 women, assuming Hardy-Weinberg equilibrium? Among the affected individuals, 58 were heterozygotes for FVL and three were homozygous for FVL. Set up a 3 × 2 table for the association of the homozygous FVL genotype, the heterozygous FVL genotype, and the wild-type genotype for DVT of the lower extremity. What is the relative risk for DVT in a FVL heterozygote using oral contraceptives versus women with the wild-type genotype taking oral contraceptives? What is the risk in a FVL homozygote versus wild-type? What is the sensitivity of testing positive for either one or two FVL alleles for DVT while taking oral contraceptives? Finally, what is the positive predictive value for DVT of being homozygous for FVL while taking oral contraceptives? Heterozygous?

3. What steps should be taken when a phenylketonuria (PKU) screening test comes back positive?

4. Newborn screening for sickle cell disease can be performed by hemoglobin electrophoresis, which separates hemoglobin A and S, thereby identifying individuals who are heterozygotes as well as those who are homozygotes for the sickle cell mutation. What potential benefits might accrue from such testing? What harms?

5. Toxic epidermal necrolysis (TEN) and the Stevens-Johnson syndrome (SJS) are two related, life-threatening skin reactions that occur in approximately 1 per 100,000 individuals in China, most commonly as a result of exposure to the antiepileptic drug carbamazepine. These conditions carry a significant mortality rate of 30% to 35% (TEN) and 5% to 15% (SJS). It was observed that individuals who suffered this severe immunological reaction carried a particular major histocompatibility complex class 1 allele, HLA-B*1502, as do 8.6% of the Chinese population. In a retrospective cohort study of 145 patients who received carbamazepine therapy, 44 developed either
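The Hardy-Weinberg and risk calculations asked for in Problems 1 and 2 follow a single pattern, sketched here with the figures given in Problem 2. The function names and dictionary keys are illustrative, and the sketch assumes that the expected Hardy-Weinberg genotype counts can stand in for the actual number of women with each genotype in the sample.

```python
def hardy_weinberg_counts(sample_size: float, q: float) -> dict:
    """Expected genotype counts for a variant allele of frequency q."""
    p = 1.0 - q
    return {
        "wild_type": p * p * sample_size,
        "heterozygote": 2.0 * p * q * sample_size,
        "homozygote": q * q * sample_size,
    }


def risk_summary(counts: dict, affected: dict) -> dict:
    """Relative risks, sensitivity, and positive predictive values."""
    # Absolute risk per genotype = affected / all individuals with that genotype.
    risk = {genotype: affected[genotype] / counts[genotype] for genotype in counts}
    total_affected = sum(affected.values())
    affected_with_allele = affected["heterozygote"] + affected["homozygote"]
    return {
        "rr_heterozygote": risk["heterozygote"] / risk["wild_type"],
        "rr_homozygote": risk["homozygote"] / risk["wild_type"],
        # Sensitivity: share of affected individuals with one or two variant alleles.
        "sensitivity": affected_with_allele / total_affected,
        # PPV: share of individuals with the genotype who are affected.
        "ppv_heterozygote": risk["heterozygote"],
        "ppv_homozygote": risk["homozygote"],
    }


# Figures from Problem 2: 100,000 women, allele frequency 2.5%, 100 with DVT.
counts = hardy_weinberg_counts(100_000, 0.025)
affected = {"wild_type": 100 - 58 - 3, "heterozygote": 58, "homozygote": 3}
summary = risk_summary(counts, affected)

for genotype, n in counts.items():
    print(f"expected {genotype}: {n:.1f}")
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The same two functions can be reused for Problem 1 by changing the sample size and the affected counts.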