Hardy Weinberg Equilibrium

Author: Emil Manning
9/20/16

Lectures 4-11: Mechanisms of Evolution (Microevolution)

Hardy Weinberg Equilibrium

Gregor Mendel (1822-1884)

Wilhelm Weinberg (1862-1937)

G. H. Hardy (1877-1947)

Recall from Previous Lectures Darwin’s Observation

• Hardy Weinberg Principle (Mendelian Inheritance)
• Genetic Drift
• Mutation
• Recombination
• Epigenetic Inheritance
• Natural Selection

These are mechanisms acting WITHIN populations, hence called “population genetics”—EXCEPT for epigenetic modifications, which act on individuals in a Lamarckian manner

Recall from Lecture on History of Evolutionary Thought Darwin’s Observation

Evolution acts through changes in allele frequency at each generation Leads to average change in characteristic of the population

Gregor Mendel, “Father of Modern Genetics” http://www.biography.com/people/gregor-mendel-39282#synopsis

HOWEVER, Darwin did not understand how genetic variation was passed on from generation to generation

• Mendel presented a mechanism for how traits got passed on

Mendel’s Laws of Inheritance

• Law of Segregation
– only one allele passes from each parent on to an offspring

• Law of Independent Assortment
– different pairs of alleles are passed to offspring independently of each other

“Individuals pass alleles on to their offspring intact” (the idea of particulate (genes) inheritance)


Using 29,000 pea plants, Mendel discovered the 3:1 ratio of phenotypes, due to dominant vs. recessive alleles

• In cross-pollinating plants with either yellow or green peas, Mendel found that the first generation (F1) always had yellow seeds (dominance). However, the following generation (F2) consistently had a 3:1 ratio of yellow to green.

• Mendel uncovered the underlying mechanism, that there are dominant and recessive alleles
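The 3:1 ratio can be reproduced by enumerating the gametes of an F1 x F1 cross. This is a minimal sketch, assuming one locus with a dominant yellow allele written `Y` and a recessive green allele written `y`; the allele symbols and the function name `cross` are illustrative and not from the slides.

```python
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Return offspring genotype probabilities for a one-locus cross.

    Each parent is a two-character string such as "Yy". Every gamete
    carries one of the parent's two alleles with probability 1/2
    (Law of Segregation), so all four gamete pairings are equally likely.
    """
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}

f2 = cross("Yy", "Yy")  # F1 x F1: both parents are heterozygous
yellow = sum(freq for genotype, freq in f2.items() if "Y" in genotype)
green = f2["yy"]
print(f2)               # genotype proportions in the F2
print(yellow / green)   # phenotype ratio, yellow : green
```

The genotypes split 1:2:1, and because `Y` masks `y` in heterozygotes the phenotypes split 3:1.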

Hardy-Weinberg Principle
• Mathematical description of Mendelian inheritance
• Testing for Hardy-Weinberg equilibrium can be used to assess whether a population is evolving

Godfrey Hardy (1877-1947)

Wilhelm Weinberg (1862-1937)

The Hardy-Weinberg Principle
• A population that is not evolving shows allele and genotypic frequencies that are in Hardy-Weinberg equilibrium
• If a population is not in Hardy-Weinberg equilibrium, it can be concluded that the population is evolving

Evolutionary Mechanisms (will put population out of HW Equilibrium):

• Genetic Drift
• Natural Selection
• Mutation
• Migration

*Epigenetic modifications change expression of alleles but not the frequency of alleles themselves, so they won’t affect the actual inheritance of alleles. However, if you count the phenotype frequencies, and not the genotype frequencies, you might see phenotypic frequencies out of HW Equilibrium due to epigenetic silencing of alleles. (Epigenetic modifications can change phenotype, not genotype.)


Requirements of HW        Violation (Evolution)
Large population size     Genetic drift
Random mating             Inbreeding & other
No mutations              Mutations
No natural selection      Natural selection
No migration              Migration

• What is a “population?” A group of individuals within a species that is capable of interbreeding and producing fertile offspring (definition for sexual species)

[Fig. 23-5a: map area showing the Porcupine herd range and the Fortymile herd range, near the Beaufort Sea]

An evolving population is one that violates Hardy-Weinberg Assumptions

In the absence of Evolution…

Patterns of inheritance should always be in “Hardy Weinberg Equilibrium” Following the transmission rules of Mendel

Hardy-Weinberg Equilibrium • According to the Hardy-Weinberg principle, frequencies of alleles and genotypes in a population remain constant from generation to generation • Also, the genotype frequencies you see in a population should be the Hardy-Weinberg expectations, given the allele frequencies

“Null Model”
• No Evolution: Null Model to test if no evolution is happening should simply be a population in Hardy-Weinberg Equilibrium
• No Selection: Null Model to test whether Natural Selection is occurring should have no selection, but should include Genetic Drift
– This is because Genetic Drift is operating even when there is no Natural Selection

Example: Is this population in Hardy Weinberg Equilibrium?

      Generation 1   Generation 2   Generation 3
AA    0.25           0.20           0.10
Aa    0.50           0.60           0.80
aa    0.25           0.20           0.10


Hardy-Weinberg Theorem

In a non-evolving population, frequency of alleles and genotypes remain constant over generations

You should be able to predict the genotype frequencies, given the allele frequencies

Important concepts

• gene: A region of genome sequence (DNA or RNA) that is the unit of inheritance, the product of which contributes to phenotype

• locus: Location in a genome (used interchangeably with “gene,” if the location is at a gene… but locus can be anywhere, so the meaning is broader than gene)

• loci: Plural of locus

• allele: Variant forms of a gene (e.g. alleles for different eye colors, BRCA1 breast cancer allele, etc.)

• genotype: The combination of alleles at a locus (gene)

• phenotype: The expression of a trait, as a result of the genotype and regulation of genes (green eyes, brown hair, body size, finger length, cystic fibrosis, etc.)

Important concepts

• allele: Variant forms of a gene (e.g. alleles for different eye colors, BRCA1 breast cancer allele, etc.)

• We are diploid (2 sets of chromosomes), so we have 2 alleles at a locus (any location in the genome)
• However, there can be many alleles at a locus in a population.
– For example, you might have inherited a blue eye allele from your mom and a brown eye allele from your dad… you can’t have more alleles than that (only 2 sets of chromosomes, one from each parent)
– BUT, there could be many alleles at this locus in the population: blue, green, grey, brown, etc.

[Figure: alleles (A1, A2, A3, A4) in a population of diploid organisms are carried in eggs and sperm; random mating (sex) combines them into zygotes with genotypes such as A1A1, A1A3, A2A4, A3A1]

So then can we predict the % of alleles and genotypes in the population at each generation?


• By convention, if there are 2 alleles at a locus, p and q are used to represent their frequencies

Fig. 23-6: Alleles in the population, frequencies of alleles, gametes produced

p = frequency of C^R allele = 0.8
q = frequency of C^W allele = 0.2

Each egg: 80% chance of C^R, 20% chance of C^W
Each sperm: 80% chance of C^R, 20% chance of C^W

Hardy-Weinberg proportions indicate the expected allele and genotype frequencies, given the starting frequencies

If p and q represent the relative frequencies of the only two possible alleles in a population at a particular locus, then for a diploid organism (2 sets of chromosomes),

(p + q)^2 = p^2 + 2pq + q^2 = 1

– where p^2 and q^2 represent the frequencies of the homozygous genotypes and 2pq represents the frequency of the heterozygous genotype

What about for a triploid organism?
• (p + q)^3 = p^3 + 3p^2q + 3pq^2 + q^3 = 1
Potential offspring: ppp, ppq, pqp, qpp, qqp, pqq, qpq, qqq
How about tetraploid? You work it out.

Hardy Weinberg Theorem

ALLELES
Probability of A = p
Probability of a = q
p + q = 1

GENOTYPES
AA: p x p = p^2
Aa: p x q + q x p = 2pq
aa: q x q = q^2
p^2 + 2pq + q^2 = 1
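The diploid, triploid and tetraploid expansions are all binomial expansions of (p + q)^c, so one function can list the terms for any ploidy. The sketch below follows the slide's simplified (p + q)^c model; the function name `genotype_class_frequencies` and the example value p = 0.8 are illustrative assumptions.

```python
from math import comb

def genotype_class_frequencies(p, ploidy):
    """Expected frequency of each genotype class from expanding (p + q)**ploidy.

    Index k of the returned list is the number of copies of the allele
    with frequency q = 1 - p; the value is the binomial term
    C(ploidy, k) * p**(ploidy - k) * q**k.
    """
    q = 1 - p
    return [comb(ploidy, k) * p ** (ploidy - k) * q ** k for k in range(ploidy + 1)]

diploid = genotype_class_frequencies(0.8, 2)     # p^2, 2pq, q^2
triploid = genotype_class_frequencies(0.8, 3)    # p^3, 3p^2q, 3pq^2, q^3
tetraploid = genotype_class_frequencies(0.8, 4)  # p^4, 4p^3q, 6p^2q^2, 4pq^3, q^4

for name, freqs in [("diploid", diploid), ("triploid", triploid), ("tetraploid", tetraploid)]:
    print(name, [round(f, 4) for f in freqs], "sum =", round(sum(freqs), 4))
```

The coefficients 1, 3, 3, 1 for the triploid case match the eight orderings listed on the slide once orderings with the same allele counts are pooled (ppq, pqp and qpp together give 3p^2q).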


More General HW Equations
• One locus, three alleles: (p + q + r)^2 = p^2 + q^2 + r^2 + 2pq + 2pr + 2qr
• One locus, n alleles: (p_1 + p_2 + p_3 + … + p_n)^2 = p_1^2 + p_2^2 + p_3^2 + … + p_n^2 + 2p_1p_2 + 2p_1p_3 + 2p_2p_3 + … + 2p_{n-1}p_n
• For a polyploid (more than two sets of chromosomes): (p + q)^c, where c = number of chromosome sets
• If multiple loci (genes) code for a trait, each locus follows the HW principle independently, and then the alleles at each locus interact to influence the trait

Hardy Weinberg Theorem: worked example

ALLELE Frequencies
Frequency of A = p = 0.8
Frequency of a = q = 0.2
p + q = 1

Expected GENOTYPE Frequencies
AA: p x p = p^2 = 0.8 x 0.8 = 0.64
Aa: p x q + q x p = 2pq = 2 x (0.8 x 0.2) = 0.32
aa: q x q = q^2 = 0.2 x 0.2 = 0.04
p^2 + 2pq + q^2 = 0.64 + 0.32 + 0.04 = 1

Expected Allele Frequencies at 2nd Generation
p = AA + Aa/2 = 0.64 + (0.32/2) = 0.8
q = aa + Aa/2 = 0.04 + (0.32/2) = 0.2

Allele frequencies remain the same at next generation
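The same round trip (allele frequencies to genotype frequencies and back) can be repeated for several generations to see that nothing changes when no evolutionary force acts. A minimal sketch, assuming two alleles A and a with starting p = 0.8; the function names are illustrative.

```python
def genotype_frequencies(p):
    """Hardy-Weinberg genotype frequencies for allele frequency p of A."""
    q = 1 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

def allele_frequency_A(genotypes):
    """Frequency of A among gametes: all of AA plus half of Aa."""
    return genotypes["AA"] + genotypes["Aa"] / 2

p = 0.8
history = [p]
for generation in range(1, 4):
    genotypes = genotype_frequencies(p)   # random mating
    p = allele_frequency_A(genotypes)     # gametes for the next generation
    history.append(p)
    print(generation, {g: round(f, 4) for g, f in genotypes.items()}, round(p, 4))
```

Starting from any other value of p gives the same behavior: the genotype frequencies settle at p^2, 2pq, q^2 after one round of random mating and p stays put.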

Similar example, but with different starting allele frequencies

[Figure labels: p, q; p^2, 2pq, q^2]


Calculating Allele Frequencies from # of Individuals
• The frequency of an allele in a population can be calculated from # of individuals:
– For diploid organisms, the total number of alleles at a locus is the total number of individuals x 2
– The total number of dominant alleles at a locus is 2 alleles for each homozygous dominant individual plus 1 allele for each heterozygous individual; the same logic applies for recessive alleles

Calculating Allele and Genotype Frequencies from # of Individuals

Genotype            AA    Aa    aa
# of individuals    120   60    35    (total = 215 individuals, 430 alleles)

#A = (2 x AA) + Aa = 240 + 60 = 300
#a = (2 x aa) + Aa = 70 + 60 = 130
Proportion A = 300/total = 300/430 = 0.70
Proportion a = 130/total = 130/430 = 0.30
A + a = 0.70 + 0.30 = 1

Proportion AA = 120/215 = 0.56
Proportion Aa = 60/215 = 0.28
Proportion aa = 35/215 = 0.16
AA + Aa + aa = 0.56 + 0.28 + 0.16 = 1
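The counting rules above translate directly into a few lines of code. This is a sketch using the slide's counts (120, 60, 35); the function name `frequencies_from_counts` is an illustrative choice.

```python
def frequencies_from_counts(n_AA, n_Aa, n_aa):
    """Allele and genotype frequencies from genotype counts (diploid, 2 alleles)."""
    n_individuals = n_AA + n_Aa + n_aa
    n_alleles = 2 * n_individuals              # each individual carries 2 alleles
    p = (2 * n_AA + n_Aa) / n_alleles          # frequency of A
    q = (2 * n_aa + n_Aa) / n_alleles          # frequency of a
    genotype = {
        "AA": n_AA / n_individuals,
        "Aa": n_Aa / n_individuals,
        "aa": n_aa / n_individuals,
    }
    return p, q, genotype

p, q, genotype = frequencies_from_counts(120, 60, 35)
print(round(p, 2), round(q, 2))
print({g: round(f, 2) for g, f in genotype.items()})
```

Note that allele frequencies are divided by the number of alleles (430) while genotype frequencies are divided by the number of individuals (215).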

Applying the Hardy-Weinberg Principle
• Example: estimate frequency of a disease allele in a population
• Phenylketonuria (PKU) is a metabolic disorder that results from homozygosity for a recessive allele
• Individuals that are homozygous for the deleterious recessive allele cannot break down phenylalanine, which results in build up → mental retardation
• The occurrence of PKU is 1 per 10,000 births
• How many carriers of this disease are in the population?
– Rare deleterious recessives often remain in a population because they are hidden in the heterozygous state (the “carriers”)
– Natural selection can only act on the homozygous individuals where the phenotype is exposed (individuals who show symptoms of PKU)

So, let’s calculate HW frequencies
– We can assume HW equilibrium if:
• There is no migration from a population with different allele frequency
• Random mating
• No genetic drift
• Etc.

• The occurrence of PKU is 1 per 10,000 births; this is the frequency of the homozygous recessive genotype: q^2 = 0.0001
• The frequency of the disease allele is: q = sqrt(q^2) = sqrt(0.0001) = 0.01
• The frequency of normal alleles is: p = 1 – q = 1 – 0.01 = 0.99
• The frequency of carriers (heterozygotes) of the deleterious allele is: 2pq = 2 x 0.99 x 0.01 = 0.0198, or approximately 2% of the U.S. population
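The PKU calculation works backward from the one genotype that can be observed directly (affected individuals, q^2). A short sketch, assuming Hardy-Weinberg equilibrium as the slide does; the variable names are illustrative.

```python
from math import sqrt

incidence = 1 / 10_000        # frequency of affected (homozygous recessive) births = q^2
q = sqrt(incidence)           # frequency of the disease allele
p = 1 - q                     # frequency of the normal allele
carriers = 2 * p * q          # frequency of heterozygous carriers

print(f"q = {q:.4f}, p = {p:.4f}, carriers = {carriers:.4f}")
print(f"carriers per affected individual: {carriers / incidence:.0f}")
```

Carriers outnumber affected individuals by roughly two hundred to one, which is why a rare recessive allele is mostly hidden from selection.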


Conditions for Hardy-Weinberg Equilibrium
• The Hardy-Weinberg theorem describes a hypothetical population
• The five conditions for nonevolving populations are rarely met in nature:
– No mutations
– Random mating
– No natural selection
– Extremely large population size
– No gene flow
• So, in real populations, allele and genotype frequencies do change over time

DEVIATION from Hardy-Weinberg Equilibrium indicates that EVOLUTION is happening

Hardy-Weinberg across a Genome

Allele A1 Demo

• In natural populations, some loci might be out of Hardy-Weinberg equilibrium while other loci remain in equilibrium
• For example, some loci might be undergoing natural selection and go out of HW equilibrium, while the rest of the genome remains in HW equilibrium

How can you tell whether a population is out of HW Equilibrium?

• Perform HW calculations to see if it looks like the population is out of HW equilibrium • Then apply statistical tests to see if the deviation is significantly different from what you would expect by random chance


Example: Does this population remain in Hardy Weinberg Equilibrium across Generations?

      Generation 1   Generation 2   Generation 3
AA    0.25           0.20           0.10
Aa    0.50           0.60           0.80
aa    0.25           0.20           0.10

How can you tell whether a population is out of HW Equilibrium?
1. When allele frequencies are changing across generations
2. When you cannot predict genotype frequencies from allele frequencies (meaning there is an excess or deficit of genotypes relative to what would be expected given the allele frequencies)

Example
• Genotype Count: AA 30, Aa 55, aa 15
• Calculate the χ^2 value:

Genotype   Observed   Expected   (O-E)^2/E
AA         30         33         0.27
Aa         55         49         0.73
aa         15         18         0.50
Total      100        100        1.50

• Since χ^2 = 1.50 < 3.841 (from Chi-square table, alpha = 0.05), we conclude that the genotype frequencies in this population are not significantly different from what would be expected if the population is in Hardy-Weinberg equilibrium.
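The table can be recomputed from the raw counts. The sketch below estimates p from the sample, derives the expected counts and sums (O-E)^2/E; the function name `hw_chi_square` is illustrative, and the critical value 3.841 is taken from the slide. Because it uses unrounded expected counts, the statistic comes out slightly above the slide's 1.50, which was computed from expected counts rounded to whole numbers.

```python
def hw_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing genotype counts with HW expectations."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)      # allele frequency estimated from the sample
    q = 1 - p
    observed = [n_AA, n_Aa, n_aa]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected

CRITICAL_VALUE = 3.841                   # df = 1, alpha = 0.05 (from the slide)

chi2, expected = hw_chi_square(30, 55, 15)
reject_hw = chi2 > CRITICAL_VALUE
print([round(e, 2) for e in expected], round(chi2, 2), reject_hw)
```

The conclusion is the same either way: the statistic is well below the critical value, so Hardy-Weinberg equilibrium is not rejected.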

      Generation 1   Generation 2   Generation 3
AA    0.25           0.20           0.10
Aa    0.50           0.60           0.80
aa    0.25           0.20           0.10

In this case, allele frequencies (of A and a) did not change.

***However, the population did go out of HW equilibrium because you can no longer predict genotypic frequencies from allele frequencies

For example, p = 0.5, so the expected frequency of AA is p^2 = 0.25, but in Generation 3 the observed frequency of AA is 0.10
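Both checks (are allele frequencies changing, and do genotype frequencies match p^2, 2pq, q^2) can be run on the three generations in the table. A minimal sketch; the helper name `check` and the tolerance are illustrative assumptions.

```python
generations = {
    1: {"AA": 0.25, "Aa": 0.50, "aa": 0.25},
    2: {"AA": 0.20, "Aa": 0.60, "aa": 0.20},
    3: {"AA": 0.10, "Aa": 0.80, "aa": 0.10},
}

def check(genotypes, tolerance=1e-9):
    """Return (p, expected genotype frequencies, matches HW expectations?)."""
    p = genotypes["AA"] + genotypes["Aa"] / 2
    q = 1 - p
    expected = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}
    in_hw = all(abs(genotypes[g] - expected[g]) < tolerance for g in expected)
    return p, expected, in_hw

results = {gen: check(freqs) for gen, freqs in generations.items()}
for gen, (p, expected, in_hw) in results.items():
    print(gen, "p =", round(p, 3), "expected =", expected, "in HW:", in_hw)
```

The allele frequency stays at 0.5 in every generation, yet only Generation 1 matches the expected 0.25 / 0.50 / 0.25; Generations 2 and 3 show a growing excess of heterozygotes.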

Testing for Deviation from Hardy-Weinberg Expectations
• A χ^2 goodness-of-fit test can be used to determine if a population is significantly different from the expectations of Hardy-Weinberg equilibrium.
• If we have a series of genotype counts from a population, then we can compare these counts to the ones predicted by the Hardy-Weinberg model.
• χ^2 = sum across genotypes of (O-E)^2/E, where O = observed counts and E = expected counts


Testing for Deviation from Hardy-Weinberg Expectations
• O = observed counts, E = expected counts, sum across genotypes
• We test our χ^2 value against the Chi-square distribution (the distribution of a sum of squared standard normal variables), which represents the theoretical distribution of sample values under HW equilibrium → values far out in the tail are less likely to arise by chance
• And determine how likely it is to get our result simply by chance (e.g. due to sampling error); i.e., do our Observed values differ from our Expected values more than what we would expect by chance (= significantly different)?


Test for Deviation from HW equilibrium
• Genotype Count Generation 4: AA 65, Aa 31, aa 4
• Calculate the χ^2 value:

Genotype   Observed   Expected   (O-E)^2/E
AA         65         64.8       0.00062
Aa         31         31.4       0.0051
aa         4          3.8        0.0105
Total      100        100        0.016

• Since χ^2 = 0.016 < 3.841 (from Chi-square table for critical values, alpha = 0.05), we conclude that the genotype frequencies in this population are not significantly different from what would be expected if the population were in Hardy-Weinberg equilibrium.

• The chi-squared distribution is used because it is the distribution of a sum of squared standard normal variables

• Calculate Chi-squared test statistic
• Figure out degrees of freedom
• Select significance level (alpha, the P-value threshold)
• Compare your Chi-squared value to the theoretical distribution (from the table), and reject or fail to reject the null hypothesis.

Test for Significance of Deviation from HW Equilibrium

Degrees of Freedom is n – 1 = 2 alleles (p, q) – 1 = 1 (equivalently, 3 genotype classes – 1 – 1 allele frequency estimated from the data)

– If the test statistic is greater than the critical value, the null hypothesis (H0 = there is no difference between the distributions) can be rejected with the selected level of confidence, and the alternative hypothesis (H1 = there is a difference between the distributions) can be accepted.
– If the test statistic is less than the critical value, the null hypothesis cannot be rejected
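Instead of comparing against a table entry, the P-value for one degree of freedom can be computed directly. The sketch below relies on a standard identity that is not in the slides: with 1 degree of freedom the statistic is the square of a standard normal variable, so its upper-tail probability is erfc(sqrt(x/2)). The function name `p_value_df1` is illustrative, and the test statistics are the ones quoted in the worked examples.

```python
from math import erfc, sqrt

def p_value_df1(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))

ALPHA = 0.05

for chi2 in (0.016, 1.50, 3.841):
    p_value = p_value_df1(chi2)
    decision = "reject H0" if p_value < ALPHA else "cannot reject H0"
    print(f"chi2 = {chi2:<6} P = {p_value:.4f}  {decision}")
```

The critical value 3.841 sits almost exactly at P = 0.05, so comparing the statistic with 3.841 and comparing the P-value with alpha are the same decision rule.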

Testing for significance
• The results come out not significantly different from HW equilibrium
• This does not necessarily mean that genetic drift is not happening, but that we cannot conclude that genetic drift is happening
• Either we do not have enough power (not enough data, small sample size), or genetic drift is not happening
• Sometimes it is difficult to test whether evolution is happening, even when it is happening... The signal needs to be sufficiently large to be sure that you can’t get the results by chance (like by sampling error)

Test for Deviation from HW equilibrium
• Genotype Count Generation 4 → increase sample size: AA 65000, Aa 31000, aa 4000
• Calculate the χ^2 value:

Genotype   Observed   Expected   (O-E)^2/E
AA         65000      64800      0.617
Aa         31000      31400      5.10
aa         4000       3800       10.53
Total      100,000    100,000    16.24

• Since χ^2 = 16.24 > 3.841 (from Chi-square table for critical values, alpha = 0.05), we conclude that the genotype frequencies in this population ARE significantly different from what would be expected if the population were in Hardy-Weinberg equilibrium.
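The two Generation 4 tests differ only in sample size, which makes the effect of statistical power easy to show. A sketch under the same assumptions as the slide (two alleles, df = 1, critical value 3.841); the function name is illustrative. It uses unrounded expected counts, so the large-sample statistic comes out near 15.8 rather than the value obtained from the rounded expected counts in the table.

```python
def hw_chi_square(counts):
    """Chi-square statistic for (n_AA, n_Aa, n_aa) against HW expectations."""
    n_AA, n_Aa, n_aa = counts
    n = sum(counts)
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    return sum((o - e) ** 2 / e for o, e in zip(counts, expected))

CRITICAL_VALUE = 3.841  # df = 1, alpha = 0.05 (from the slide)

small = hw_chi_square((65, 31, 4))
large = hw_chi_square((65_000, 31_000, 4_000))

print(f"n = 100:     chi2 = {small:.4f}, significant: {small > CRITICAL_VALUE}")
print(f"n = 100,000: chi2 = {large:.2f}, significant: {large > CRITICAL_VALUE}")
print(f"ratio = {large / small:.1f}")
```

With identical genotype proportions the statistic grows in direct proportion to the sample size, so the same small deviation is invisible with 100 individuals and clearly significant with 100,000.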


• One generation of Random Mating could put a population back into Hardy Weinberg Equilibrium


What would Genetic Drift look like?

Examples of Deviation from Hardy-Weinberg Equilibrium

      Generation 1   Generation 2   Generation 3   Generation 4
AA    0.64           0.63           0.64           0.65
Aa    0.32           0.33           0.315          0.31
aa    0.04           0.04           0.045          0.04

Is this population in HW equilibrium? If not, how does it deviate? What could be the reason?

• Most populations are experiencing some level of genetic drift, unless they are incredibly large


This is a case of Genetic Drift, where allele frequencies are fluctuating randomly across generations
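Random fluctuation of this kind can be simulated by resampling alleles each generation. The sketch below uses the Wright-Fisher model, which is an assumption brought in for illustration and is not named in the slides: every generation, 2N allele copies are drawn at random from the previous generation's allele frequency. The function name, population sizes and seed are all illustrative.

```python
import random

def wright_fisher(p0, population_size, generations, seed=0):
    """Simulate allele frequency of A under drift alone (no selection, mutation, migration)."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    n_alleles = 2 * population_size        # diploid: 2 allele copies per individual
    trajectory = [p0]
    p = p0
    for _ in range(generations):
        # Each allele copy in the next generation is A with probability p.
        copies_of_A = sum(rng.random() < p for _ in range(n_alleles))
        p = copies_of_A / n_alleles
        trajectory.append(p)
    return trajectory

small_pop = wright_fisher(0.8, population_size=50, generations=20)
large_pop = wright_fisher(0.8, population_size=5000, generations=20)

print("N = 50:  ", [round(p, 3) for p in small_pop])
print("N = 5000:", [round(p, 3) for p in large_pop])
```

Running it with different seeds gives different trajectories; the fluctuations tend to be much larger in the small population, which is the point of the "unless they are incredibly large" remark above.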


Examples of Deviation from Hardy-Weinberg Equilibrium

Example 1: AA 0.64, Aa 0.32, aa 0
Is this population in HW equilibrium? If not, how does it deviate? What could be the reason?
Here this appears to be Directional Selection favoring AA, or Negative Selection disfavoring aa

Example 2: AA 0.25, Aa 0.70, aa 0.05
Is this population in HW equilibrium? If not, how does it deviate? What could be the reason?
This appears to be a case of Heterozygote Advantage (or Overdominance)

Example 3: AA 0.10, Aa 0.10, aa 0.80
Is this population in HW equilibrium? If not, how does it deviate? What could be the reason?
Selection appears to be favoring aa
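To answer "how does it deviate?", it helps to subtract the Hardy-Weinberg expectation from each observed genotype frequency. This is a sketch only: the function name `hw_deviation` is illustrative, and the sign pattern describes the deviation without by itself identifying its cause.

```python
def hw_deviation(genotypes):
    """Observed minus expected frequency for each genotype.

    Frequencies are first rescaled to sum to 1, in case the reported
    values do not add up exactly.
    """
    total = sum(genotypes.values())
    observed = {g: f / total for g, f in genotypes.items()}
    p = observed["AA"] + observed["Aa"] / 2
    q = 1 - p
    expected = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}
    return {g: observed[g] - expected[g] for g in expected}

heterozygote_case = hw_deviation({"AA": 0.25, "Aa": 0.70, "aa": 0.05})
aa_case = hw_deviation({"AA": 0.10, "Aa": 0.10, "aa": 0.80})

print({g: round(d, 4) for g, d in heterozygote_case.items()})
print({g: round(d, 4) for g, d in aa_case.items()})
```

A positive value for Aa indicates an excess of heterozygotes, as in the heterozygote-advantage example; a negative value indicates a deficit of heterozygotes, as in the example dominated by aa.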


Summary

(1) A nonevolving population is in HW Equilibrium

(2) Evolution occurs when the requirements for HW Equilibrium are not met

(3) HW Equilibrium is violated when there is Genetic Drift, Migration, Mutations, Natural Selection, and Nonrandom Mating

Fig. 23-7: Perform the same calculations using percentages

Gametes (eggs and sperm): 80% C^R (p = 0.8), 20% C^W (q = 0.2)

Offspring genotypes:
64% (p^2) C^R C^R
16% (pq) + 16% (qp) = 32% C^R C^W
4% (q^2) C^W C^W

64% C^R C^R, 32% C^R C^W, and 4% C^W C^W

Gametes of this generation:
64% C^R + 16% C^R = 80% C^R = 0.8 = p
4% C^W + 16% C^W = 20% C^W = 0.2 = q

Genotypes in the next generation:
64% C^R C^R, 32% C^R C^W, and 4% C^W C^W plants


1. Nabila is a Saudi princess who is arranged to marry her first cousin. Many in her family have died of a rare blood disease, which sometimes skips generations, and thus appears to be recessive. Nabila thinks that she is a carrier of this disease. If her fiancé is also a carrier, what is the probability that her offspring will have (be afflicted with) the disease?
(A) 1/4
(B) 1/3
(C) 1/2
(D) 3/4
(E) zero

The following are numbers of pink and white flowers in a population.

                Pink    White
Generation 1:   901     302
Generation 2:   1204    403
Generation 3:   1510    504

2. Which of the following is most likely to be TRUE?
(A) The heterozygotes are probably pink
(B) The recessive allele here (probably white) is clearly deleterious
(C) Evolution is occurring, as allele frequencies are changing greatly over time
(D) Clearly there is a heterozygote advantage
(E) The frequencies above violate Hardy-Weinberg expectations

The following are numbers of purple and white peas in a population.

                (A1A1)   (A1A2)   (A2A2)
                Purple   Purple   White
Generation 1:   360      480      160
Generation 2:   100      200      200
Generation 3:   0        100      300

3. What are the genotype frequencies at each generation?
(A) Generation 1: 0.30, 0.50, 0.20; Generation 2: 0.20, 0.40, 0.40; Generation 3: 0, 0.333, 0.666
(B) Generation 1: 0.36, 0.48, 0.16; Generation 2: 0.10, 0.20, 0.20; Generation 3: 0, 0.10, 0.30
(C) Generation 1: 0.36, 0.48, 0.16; Generation 2: 0.20, 0.40, 0.40; Generation 3: 0, 0.25, 0.75
(D) Generation 1: 0.36, 0.48, 0.16; Generation 2: 0.36, 0.48, 0.16; Generation 3: 0.36, 0.48, 0.16

4. From the purple and white pea example above, what are the frequencies of alleles at each generation?
(A) Generation 1: Dominant allele (A1) = 0.6, Recessive allele (A2) = 0.4; Generation 2: Dominant allele = 0.4, Recessive allele = 0.6; Generation 3: Dominant allele = 0.125, Recessive allele = 0.875
(B) Generation 1: Dominant allele = 0.6, Recessive allele = 0.4; Generation 2: Dominant allele = 0.6, Recessive allele = 0.4; Generation 3: Dominant allele = 0.6, Recessive allele = 0.4
(C) Generation 1: Dominant allele = 0.6, Recessive allele = 0.4; Generation 2: Dominant allele = 0.5, Recessive allele = 0.5; Generation 3: Dominant allele = 0.25, Recessive allele = 0.75
(D) Generation 1: Dominant allele = 0.4, Recessive allele = 0.6; Generation 2: Dominant allele = 0.5, Recessive allele = 0.5; Generation 3: Dominant allele = 0.25, Recessive allele = 0.75

5. From the same purple and white pea example, which evolutionary mechanism might be operating across generations?
(A) Mutation
(B) Selection favoring A1
(C) Heterozygote advantage
(D) Selection favoring A2
(E) Inbreeding


Answers:
1. Parents: Aa x Aa; Offspring: AA (25%), Aa (50%), aa (25%). Answer = A
2. A
3. C
4. A
5. D
