9/20/16
Lectures 4-11: Mechanisms of Evolution (Microevolution)
Hardy Weinberg Equilibrium
• Gregor Mendel (1822-1884)
• G. H. Hardy (1877-1947)
• Wilhelm Weinberg (1862-1937)
Recall from Previous Lectures
• Hardy Weinberg Principle (Mendelian Inheritance)
• Genetic Drift
• Mutation
• Recombination
• Epigenetic Inheritance
• Natural Selection
These are mechanisms acting WITHIN populations, hence called “population genetics”, EXCEPT for epigenetic modifications, which act on individuals in a Lamarckian manner
Recall from Lecture on History of Evolutionary Thought: Darwin’s Observation
• Evolution acts through changes in allele frequency at each generation
• Leads to average change in characteristic of the population
• HOWEVER, Darwin did not understand how genetic variation was passed on from generation to generation
Gregor Mendel (1822-1884), “Father of Modern Genetics” http://www.biography.com/people/gregor-mendel-39282#synopsis
• Mendel presented a mechanism for how traits got passed on
• “Individuals pass alleles on to their offspring intact” (the idea of particulate (genes) inheritance)
Mendel’s Laws of Inheritance
• Law of Segregation
  – only one allele passes from each parent on to an offspring
• Law of Independent Assortment
  – different pairs of alleles are passed to offspring independently of each other
Using 29,000 pea plants, Mendel discovered the 3:1 ratio of phenotypes, due to dominant vs. recessive alleles
• In cross-pollinating plants with either yellow or green peas, Mendel found that the first generation (F1) always had yellow seeds (dominance). However, the following generation (F2) consistently had a 3:1 ratio of yellow to green.
• Mendel uncovered the underlying mechanism, that there are dominant and recessive alleles
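The 3:1 ratio can be reproduced by enumerating the equally likely allele pairings in a cross. The sketch below is only an illustration: the allele symbols Y (dominant, yellow) and y (recessive, green) and the function names are assumptions, not notation from the lecture.

```python
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Count offspring genotypes over all equally likely allele pairings."""
    offspring = Counter()
    for a, b in product(parent1, parent2):
        # Sort the pair so that "yY" and "Yy" count as the same heterozygote.
        offspring["".join(sorted((a, b)))] += 1
    return offspring

def phenotype_counts(genotypes, dominant="Y"):
    """Collapse genotype counts into phenotype counts (assumed symbols Y/y)."""
    counts = Counter()
    for genotype, n in genotypes.items():
        counts["yellow" if dominant in genotype else "green"] += n
    return counts

f1 = cross("YY", "yy")   # pure-breeding yellow x pure-breeding green
f2 = cross("Yy", "Yy")   # F1 x F1
print(dict(f1))                    # every F1 plant is a Yy heterozygote
print(dict(phenotype_counts(f2)))  # 3 yellow : 1 green
```

All F1 offspring carry one dominant allele and look yellow; in the F2, three of the four pairings contain at least one Y.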
Hardy-Weinberg Principle
Godfrey Hardy (1877-1947), Wilhelm Weinberg (1862-1937)
• Mathematical description of Mendelian inheritance
• Testing for Hardy-Weinberg equilibrium can be used to assess whether a population is evolving
The Hardy-Weinberg Principle
• A population that is not evolving shows allele and genotypic frequencies that are in Hardy Weinberg equilibrium
• If a population is not in Hardy-Weinberg equilibrium, it can be concluded that the population is evolving
Evolutionary Mechanisms (will put population out of HW Equilibrium):
• Genetic Drift
• Natural Selection
• Mutation
• Migration
*Epigenetic modifications change expression of alleles but not the frequency of alleles themselves, so they won’t affect the actual inheritance of alleles. However, if you count the phenotype frequencies, and not the genotype frequencies, you might see phenotypic frequencies out of HW Equilibrium due to epigenetic silencing of alleles (epigenetic modifications can change phenotype, not genotype).
Requirements of HW         Violation (Evolution)
Large population size      Genetic drift
Random Mating              Inbreeding & other
No Mutations               Mutations
No Natural Selection       Natural Selection
No Migration               Migration
An evolving population is one that violates Hardy-Weinberg Assumptions
• What is a “population?” A group of individuals within a species that is capable of interbreeding and producing fertile offspring (definition for sexual species)
[Fig. 23-5a: map showing the Porcupine herd range and the Fortymile herd range, with the Beaufort Sea]
In the absence of Evolution…
Patterns of inheritance should always be in “Hardy Weinberg Equilibrium”, following the transmission rules of Mendel
Hardy-Weinberg Equilibrium
• According to the Hardy-Weinberg principle, frequencies of alleles and genotypes in a population remain constant from generation to generation
• Also, the genotype frequencies you see in a population should be the Hardy-Weinberg expectations, given the allele frequencies
“Null Model”
• No Evolution: Null Model to test if no evolution is happening should simply be a population in Hardy-Weinberg Equilibrium
• No Selection: Null Model to test whether Natural Selection is occurring should have no selection, but should include Genetic Drift
  – This is because Genetic Drift is operating even when there is no Natural Selection
Example: Is this population in Hardy Weinberg Equilibrium?
      Generation 1   Generation 2   Generation 3
AA    0.25           0.20           0.10
Aa    0.50           0.60           0.80
aa    0.25           0.20           0.10
Hardy-Weinberg Theorem
In a non-evolving population, frequencies of alleles and genotypes remain constant over generations
You should be able to predict the genotype frequencies, given the allele frequencies
important concepts
• gene: A region of genome sequence (DNA or RNA) that is the unit of inheritance, the product of which contributes to phenotype
• locus: Location in a genome (used interchangeably with “gene,” if the location is at a gene… but, locus can be anywhere, so meaning is broader than gene)
• loci: Plural of locus
• allele: Variant forms of a gene (e.g. alleles for different eye colors, BRCA1 breast cancer allele, etc.)
• genotype: The combination of alleles at a locus (gene)
• phenotype: The expression of a trait, as a result of the genotype and regulation of genes (green eyes, brown hair, body size, finger length, cystic fibrosis, etc.)
• We are diploid (2 sets of chromosomes), so we have 2 alleles at a locus (any location in the genome)
• However, there can be many alleles at a locus in a population.
  – For example, you might have inherited a blue eye allele from your mom and a brown eye allele from your dad… you can’t have more alleles than that (only 2 sets of chromosomes, one from each parent)
  – BUT, there could be many alleles at this locus in the population: blue, green, grey, brown, etc.
So then can we predict the % of alleles and genotypes in the population at each generation?
[Figure: alleles in a population of diploid organisms (A1, A2, A3, A4) carried in eggs and sperm; random mating (sex) produces zygotes with genotypes such as A1A1, A1A3, A2A4, A3A1]
• By convention, if there are 2 alleles at a locus, p and q are used to represent their frequencies
[Fig. 23-6: alleles in the population, frequencies of alleles, gametes produced]
p = frequency of CR allele = 0.8
q = frequency of CW allele = 0.2
Each egg: 80% chance of CR, 20% chance of CW
Each sperm: 80% chance of CR, 20% chance of CW
Hardy-Weinberg proportions indicate the expected allele and genotype frequencies, given the starting frequencies
• The frequency of all alleles in a population will add up to 1
  – For example, p + q = 1
If p and q represent the relative frequencies of the only two possible alleles in a population at a particular locus, then for a diploid organism (2 sets of chromosomes),
(p + q)² = p² + 2pq + q² = 1
  – where p² and q² represent the frequencies of the homozygous genotypes and 2pq represents the frequency of the heterozygous genotype
What about for a triploid organism?
• (p + q)³ = p³ + 3p²q + 3pq² + q³ = 1
Potential offspring: ppp, ppq, pqp, qpp, qqp, pqq, qpq, qqq
How about tetraploid? You work it out.
Hardy Weinberg Theorem
ALLELES
Probability of A = p
Probability of a = q
p + q = 1
GENOTYPES
AA: p x p = p²
Aa: p x q + q x p = 2pq
aa: q x q = q²
p² + 2pq + q² = 1
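The diploid and triploid expansions are both binomial expansions of (p + q) raised to the ploidy. A minimal sketch, assuming two alleles and using the example value p = 0.8; the function name is hypothetical:

```python
from math import comb

def genotype_class_frequencies(p, ploidy):
    """Terms of (p + q)**ploidy, ordered from all-p to all-q genotypes."""
    q = 1.0 - p
    return [comb(ploidy, k) * p ** (ploidy - k) * q ** k
            for k in range(ploidy + 1)]

diploid = genotype_class_frequencies(0.8, 2)    # [p^2, 2pq, q^2]
triploid = genotype_class_frequencies(0.8, 3)   # [p^3, 3p^2q, 3pq^2, q^3]
print([round(f, 3) for f in diploid])
print([round(f, 3) for f in triploid])
```

Each list sums to 1, because every genotype class is accounted for. Passing a different ploidy gives the corresponding expansion.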
Hardy Weinberg Theorem
ALLELE Frequencies
Frequency of A = p = 0.8
Frequency of a = q = 0.2
p + q = 1
Expected GENOTYPE Frequencies
AA: p x p = p² = 0.8 x 0.8 = 0.64
Aa: p x q + q x p = 2pq = 2 x (0.8 x 0.2) = 0.32
aa: q x q = q² = 0.2 x 0.2 = 0.04
p² + 2pq + q² = 0.64 + 0.32 + 0.04 = 1
Expected Allele Frequencies at 2nd Generation
p = AA + Aa/2 = 0.64 + (0.32/2) = 0.8
q = aa + Aa/2 = 0.04 + (0.32/2) = 0.2
Allele frequencies remain the same at next generation
More General HW Equations
• One locus three alleles: (p + q + r)² = p² + q² + r² + 2pq + 2pr + 2qr
• One locus n # alleles: (p₁ + p₂ + p₃ + p₄ + … + pₙ)² = p₁² + p₂² + p₃² + p₄² + … + pₙ² + 2p₁p₂ + 2p₁p₃ + 2p₂p₃ + 2p₁p₄ + 2p₁p₅ + … + 2pₙ₋₁pₙ
• For a polyploid (more than two sets of chromosomes): (p + q)^c, where c = number of chromosome sets (ploidy)
• If multiple loci (genes) code for a trait, each locus follows the HW principle independently, and then the alleles at each locus interact to influence the trait
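A minimal sketch of the same bookkeeping for any number of alleles at one locus in a diploid: homozygotes get the squared allele frequency, heterozygotes get twice the product, and counting alleles back out of the genotypes returns the starting frequencies. The three-allele frequencies (0.5, 0.3, 0.2) are made-up illustration values, and the function names are hypothetical.

```python
from itertools import combinations_with_replacement

def hw_genotype_frequencies(allele_freqs):
    """Expected diploid genotype frequencies under random mating."""
    expected = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        if a == b:
            expected[(a, b)] = allele_freqs[a] ** 2                # homozygote
        else:
            expected[(a, b)] = 2 * allele_freqs[a] * allele_freqs[b]  # heterozygote
    return expected

def allele_frequencies(genotype_freqs):
    """Each genotype contributes half of its frequency to each allele it carries."""
    freqs = {}
    for (a, b), f in genotype_freqs.items():
        freqs[a] = freqs.get(a, 0.0) + f / 2
        freqs[b] = freqs.get(b, 0.0) + f / 2
    return freqs

two = hw_genotype_frequencies({"A": 0.8, "a": 0.2})
three = hw_genotype_frequencies({"A1": 0.5, "A2": 0.3, "A3": 0.2})  # assumed values
next_gen = allele_frequencies(two)
print({k: round(v, 2) for k, v in two.items()})
print({k: round(v, 2) for k, v in next_gen.items()})
```

For the two-allele case this reproduces 0.64, 0.32, 0.04 and returns p = 0.8, q = 0.2 for the next generation.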
Similar example, but with different starting allele frequencies
[Figure: labeled with p, q, p², 2pq, and q²]
Calculating Allele Frequencies from # of Individuals
• The frequency of an allele in a population can be calculated from # of individuals:
  – For diploid organisms, the total number of alleles at a locus is the total number of individuals x 2
  – The total number of dominant alleles at a locus is 2 alleles for each homozygous dominant individual plus 1 allele for each heterozygous individual; the same logic applies for recessive alleles
Calculating Allele and Genotype Frequencies from # of Individuals
                   AA    Aa    aa
# of individuals   120   60    35    (total = 215 individuals = 430 alleles)
#A = (2 x AA) + Aa = 240 + 60 = 300
#a = (2 x aa) + Aa = 70 + 60 = 130
Proportion A = 300/total = 300/430 = 0.70
Proportion a = 130/total = 130/430 = 0.30
A + a = 0.70 + 0.30 = 1
Proportion AA = 120/215 = 0.56
Proportion Aa = 60/215 = 0.28
Proportion aa = 35/215 = 0.16
AA + Aa + aa = 0.56 + 0.28 + 0.16 = 1
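The counting rule above can be written as a short function. This is a sketch using the counts from the example (120, 60, 35); the function name is hypothetical.

```python
def frequencies_from_counts(n_AA, n_Aa, n_aa):
    """Allele and genotype frequencies from diploid genotype counts."""
    n = n_AA + n_Aa + n_aa
    total_alleles = 2 * n                       # two alleles per individual
    p = (2 * n_AA + n_Aa) / total_alleles       # copies of A
    q = (2 * n_aa + n_Aa) / total_alleles       # copies of a
    genotype = {"AA": n_AA / n, "Aa": n_Aa / n, "aa": n_aa / n}
    return p, q, genotype

p, q, genotype = frequencies_from_counts(120, 60, 35)
print(round(p, 2), round(q, 2))
print({g: round(f, 2) for g, f in genotype.items()})
```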
Applying the Hardy-Weinberg Principle
• Example: estimate frequency of a disease allele in a population
• Phenylketonuria (PKU) is a metabolic disorder that results from homozygosity for a recessive allele
• Individuals that are homozygous for the deleterious recessive allele cannot break down phenylalanine, which results in a build-up → intellectual disability
• The occurrence of PKU is 1 per 10,000 births
• How many carriers of this disease are in the population?
  – Rare deleterious recessives often remain in a population because they are hidden in the heterozygous state (the “carriers”)
  – Natural selection can only act on the homozygous individuals where the phenotype is exposed (individuals who show symptoms of PKU)
So, let’s calculate HW frequencies
– We can assume HW equilibrium if:
  • There is no migration from a population with different allele frequency
  • Random mating
  • No genetic drift
  • Etc
• The occurrence of PKU is 1 per 10,000 births. This is the frequency of homozygous recessive individuals (q²), not of the disease allele itself:
  q² = 0.0001
  q = sqrt(q²) = sqrt(0.0001) = 0.01 (frequency of the disease allele)
• The frequency of normal alleles is:
  p = 1 – q = 1 – 0.01 = 0.99
• The frequency of carriers (heterozygotes) of the deleterious allele is:
  2pq = 2 x 0.99 x 0.01 = 0.0198
  or approximately 2% of the U.S. population
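The carrier calculation can be sketched in a few lines. It assumes Hardy-Weinberg equilibrium, so that the frequency of affected (homozygous recessive) births equals q²; the function name is hypothetical.

```python
from math import sqrt

def carrier_frequency(affected_frequency):
    """Return (q, p, 2pq) given the frequency of homozygous recessives, q^2."""
    q = sqrt(affected_frequency)   # recessive allele frequency
    p = 1.0 - q                    # normal allele frequency
    return q, p, 2 * p * q         # 2pq = heterozygous carriers

q, p, carriers = carrier_frequency(1 / 10_000)
print(round(q, 4), round(p, 4), round(carriers, 4))
```

Carriers (about 2%) outnumber affected individuals (0.01%) by roughly 200 to 1, which is why rare recessive alleles are mostly hidden from selection.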
Conditions for Hardy-Weinberg Equilibrium
• The Hardy-Weinberg theorem describes a hypothetical population
• The five conditions for nonevolving populations are rarely met in nature:
  – No mutations
  – Random mating
  – No natural selection
  – Extremely large population size
  – No gene flow
• So, in real populations, allele and genotype frequencies do change over time
DEVIATION from Hardy-Weinberg Equilibrium indicates that EVOLUTION is happening
Hardy-Weinberg across a Genome
• In natural populations, some loci might be out of HW equilibrium while other loci remain in Hardy-Weinberg equilibrium
• For example, some loci might be undergoing natural selection and go out of HW equilibrium, while the rest of the genome remains in HW equilibrium
Allele A1 Demo
How can you tell whether a population is out of HW Equilibrium?
• Perform HW calculations to see if it looks like the population is out of HW equilibrium
• Then apply statistical tests to see if the deviation is significantly different from what you would expect by random chance
Example: Does this population remain in Hardy Weinberg Equilibrium across Generations?
      Generation 1   Generation 2   Generation 3
AA    0.25           0.20           0.10
Aa    0.50           0.60           0.80
aa    0.25           0.20           0.10
How can you tell whether a population is out of HW Equilibrium?
1. When allele frequencies are changing across generations
2. When you cannot predict genotype frequencies from allele frequencies (means there is an excess or deficit of genotypes relative to what would be expected given the allele frequencies)
Example
• Genotype Count: AA 30, Aa 55, aa 15
• Calculate the χ² value:
Genotype   Observed   Expected   (O-E)²/E
AA         30         33         0.27
Aa         55         49         0.73
aa         15         18         0.50
Total      100        100        1.50
• Since χ² = 1.50 < 3.841 (from Chi-square table, alpha = 0.05), we conclude that the genotype frequencies in this population are not significantly different from what would be expected if the population is in Hardy-Weinberg equilibrium.
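A minimal sketch of the goodness-of-fit calculation for the counts above (30, 55, 15). It estimates p from the observed counts, builds the expected counts, and sums (O-E)²/E. Because it does not round the expected counts to whole numbers, the statistic comes out near 1.57 rather than the hand-rounded 1.50; the conclusion is the same. The function name is hypothetical.

```python
def hw_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic and expected counts under Hardy-Weinberg."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)        # allele frequency estimated from the data
    q = 1.0 - p
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    observed = [n_AA, n_Aa, n_aa]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected

CRITICAL_VALUE = 3.841   # chi-square table value for 1 degree of freedom, alpha = 0.05

chi2, expected = hw_chi_square(30, 55, 15)
print([round(e, 1) for e in expected], round(chi2, 2))
print("reject HW" if chi2 > CRITICAL_VALUE else "cannot reject HW")
```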
      Generation 1   Generation 2   Generation 3
AA    0.25           0.20           0.10
Aa    0.50           0.60           0.80
aa    0.25           0.20           0.10
■ In this case, allele frequencies (of A and a) did not change.
■ ***However, the population did go out of HW equilibrium because you can no longer predict genotypic frequencies from allele frequencies
■ For example, p = 0.5, so the expected frequency of AA is p² = 0.25, but in Generation 3 the observed frequency of AA is 0.10
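Both checks can be run on the three generations in the table: whether p changes across generations, and whether the observed genotype frequencies match the Hardy-Weinberg expectations for that p. This is a sketch; the function and variable names are hypothetical.

```python
# Observed (AA, Aa, aa) genotype frequencies for each generation in the table.
generations = {
    1: (0.25, 0.50, 0.25),
    2: (0.20, 0.60, 0.20),
    3: (0.10, 0.80, 0.10),
}

def summarize(f_AA, f_Aa, f_aa):
    """Return p and the HW-expected genotype frequencies for that p."""
    p = f_AA + f_Aa / 2
    q = f_aa + f_Aa / 2
    return p, (p * p, 2 * p * q, q * q)

for gen, observed in generations.items():
    p, expected = summarize(*observed)
    print(gen, round(p, 2), observed, tuple(round(e, 2) for e in expected))
```

In every generation p stays at 0.5 and the expectation stays at 0.25, 0.50, 0.25, but only Generation 1 matches it; Generations 2 and 3 show an excess of heterozygotes.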
Testing for Deviation from Hardy-Weinberg Expectations
• A χ² goodness-of-fit test can be used to determine if a population is significantly different from the expectations of Hardy-Weinberg equilibrium.
• If we have a series of genotype counts from a population, then we can compare these counts to the ones predicted by the Hardy-Weinberg model.
• χ² = Σ (O − E)² / E, where O = observed counts, E = expected counts, summed across genotypes
Testing for Deviation from Hardy-Weinberg Expectations
• We test our χ² value against the Chi-square distribution (the distribution of a sum of squared standard normal variables), which represents the theoretical distribution of sample values under HW equilibrium
  – Larger χ² values lie further out in the tail and are less likely to be obtained by chance
• And determine how likely it is to get our result simply by chance (e.g. due to sampling error); i.e., do our Observed values differ from our Expected values more than what we would expect by chance (= significantly different)?
Test for Deviation from HW equilibrium
• Genotype Count Generation 4: AA 65, Aa 31, aa 4
• Calculate the χ² value:
Genotype   Observed   Expected   (O-E)²/E
AA         65         64.8       0.00062
Aa         31         31.4       0.0051
aa         4          3.8        0.0105
Total      100        100        0.016
• Since χ² = 0.016 < 3.841 (from Chi-square table for critical values, alpha = 0.05), we conclude that the genotype frequencies in this population are not significantly different from what would be expected if the population were in Hardy-Weinberg equilibrium.
Test for Significance of Deviation from HW Equilibrium
• Calculate Chi-squared test statistic
• Figure out degrees of freedom
• Select significance level (alpha, the P-value cutoff, e.g. 0.05)
• Compare your Chi-squared value to the theoretical distribution (from the table), and accept or reject the null hypothesis.
• The chi-squared distribution is used because it is the sum of squared normal distributions
Degrees of Freedom is n – 1 = 2 alleles (p, q) – 1 = 1 (equivalently, 3 genotype classes – 1 – 1 allele frequency estimated from the data = 1)
  – If the test statistic > the critical value, the null hypothesis (H0 = there is no difference between the distributions) can be rejected with the selected level of confidence, and the alternative hypothesis (H1 = there is a difference between the distributions) can be accepted.
  – If the test statistic < the critical value, the null hypothesis cannot be rejected
Testing for significance
• The results come out not significantly different from HW equilibrium
• This does not necessarily mean that genetic drift is not happening, but that we cannot conclude that genetic drift is happening
• Either we do not have enough power (not enough data, small sample size), or genetic drift is not happening
• Sometimes it is difficult to test whether evolution is happening, even when it is happening... The signal needs to be sufficiently large to be sure that you can’t get the results by chance (like by sampling error)
Test for Deviation from HW equilibrium
• Genotype Count Generation 4 → increase sample size: AA 65000, Aa 31000, aa 4000
• Calculate the χ² value:
Genotype   Observed   Expected   (O-E)²/E
AA         65000      64800      0.617
Aa         31000      31400      5.10
aa         4000       3800       10.53
Total      100,000    100,000    16.24
• Since χ² = 16.24 > 3.841 (from Chi-square table for critical values, alpha = 0.05), we conclude that the genotype frequencies in this population ARE significantly different from what would be expected if the population were in Hardy-Weinberg equilibrium.
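The effect of sample size can be checked directly: with the same genotype proportions, multiplying every count by 1000 multiplies the χ² statistic by 1000. A sketch under that framing, with a hypothetical function name; because it uses unrounded expected counts, its values (about 0.016 and about 15.8) differ slightly from hand-rounded figures.

```python
def hw_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    observed = [n_AA, n_Aa, n_aa]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CRITICAL_VALUE = 3.841   # 1 degree of freedom, alpha = 0.05

small = hw_chi_square(65, 31, 4)              # 100 individuals
large = hw_chi_square(65000, 31000, 4000)     # same proportions, 100,000 individuals
print(round(small, 3), small > CRITICAL_VALUE)
print(round(large, 2), large > CRITICAL_VALUE)
```

The proportions are identical in both samples; only the larger sample has enough power to show that the small deviation is unlikely to be sampling error.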
• One generation of Random Mating could put a population back into Hardy Weinberg Equilibrium
What would Genetic Drift look like?
Examples of Deviation from Hardy-Weinberg Equilibrium
      Generation 1   Generation 2   Generation 3   Generation 4
AA    0.64           0.63           0.64           0.65
Aa    0.32           0.33           0.315          0.31
aa    0.04           0.04           0.045          0.04
Is this population in HW equilibrium? If not, how does it deviate? What could be the reason?
• This is a case of Genetic Drift, where allele frequencies are fluctuating randomly across generations
• Most populations are experiencing some level of genetic drift, unless they are incredibly large
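Random fluctuation of this kind can be illustrated with a small simulation. The sketch below uses a Wright-Fisher style resampling scheme, which is a standard textbook model and not something taken from these slides; the population size of 500 and the seed are arbitrary assumptions.

```python
import random

def simulate_drift(p0, population_size, generations, seed=0):
    """Track the frequency of allele A when each generation is a random sample."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    n_alleles = 2 * population_size      # diploid: two allele copies per individual
    trajectory = [p0]
    p = p0
    for _ in range(generations):
        # Each allele copy in the next generation is drawn at random
        # from the current gene pool, with probability p of being A.
        copies_of_A = sum(rng.random() < p for _ in range(n_alleles))
        p = copies_of_A / n_alleles
        trajectory.append(p)
    return trajectory

trajectory = simulate_drift(p0=0.8, population_size=500, generations=3)
print([round(p, 3) for p in trajectory])
```

With no selection, mutation, or migration in the model, p still wanders from generation to generation, and the wandering shrinks as `population_size` grows.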
Examples of Deviation from Hardy-Weinberg Equilibrium
AA 0.64    Aa 0.32    aa 0
Is this population in HW equilibrium? If not, how does it deviate? What could be the reason?
• Here this appears to be Directional Selection favoring AA, or Negative Selection disfavoring aa
Examples of Deviation from Hardy-Weinberg Equilibrium
AA 0.25    Aa 0.70    aa 0.05
Is this population in HW equilibrium? If not, how does it deviate? What could be the reason?
• This appears to be a case of Heterozygote Advantage (or Overdominance)
Examples of Deviation from Hardy-Weinberg Equilibrium
AA 0.10    Aa 0.10    aa 0.80
Is this population in HW equilibrium? If not, how does it deviate? What could be the reason?
• Selection appears to be favoring aa
Summary
(1) A nonevolving population is in HW Equilibrium
(2) Evolution occurs when the requirements for HW Equilibrium are not met
(3) HW Equilibrium is violated when there is Genetic Drift, Migration, Mutations, Natural Selection, and Nonrandom Mating
[Fig. 23-7: Perform the same calculations using percentages]
Gametes (eggs and sperm): 80% CR (p = 0.8), 20% CW (q = 0.2)
Offspring:
  64% (p²) CRCR
  16% (pq) CRCW
  16% (qp) CRCW
  4% (q²) CWCW
Genotypes: 64% CRCR, 32% CRCW, and 4% CWCW
Gametes of this generation:
  64% CR + 16% CR = 80% CR = 0.8 = p
  4% CW + 16% CW = 20% CW = 0.2 = q
Genotypes in the next generation: 64% CRCR, 32% CRCW, and 4% CWCW plants
1. Nabila is a Saudi Princess who is arranged to marry her first cousin. Many in her family have died of a rare blood disease, which sometimes skips generations, and thus appears to be recessive. Nabila thinks that she is a carrier of this disease. If her fiancé is also a carrier, what is the probability that her offspring will have (be afflicted with) the disease?
(A) 1/4
(B) 1/3
(C) 1/2
(D) 3/4
(E) zero
The following are numbers of pink and white flowers in a population.
                Pink    White
Generation 1:   901     302
Generation 2:   1204    403
Generation 3:   1510    504
2. Which of the following is most likely to be TRUE?
(A) The heterozygotes are probably pink
(B) The recessive allele here (probably white) is clearly deleterious
(C) Evolution is occurring, as allele frequencies are changing greatly over time
(D) Clearly there is a heterozygote advantage
(E) The frequencies above violate Hardy-Weinberg expectations
The following are numbers of purple and white peas in a population.
                (A1A1)   (A1A2)   (A2A2)
                Purple   Purple   White
Generation 1:   360      480      160
Generation 2:   100      200      200
Generation 3:   0        100      300
3. What are the genotype frequencies at each generation?
(A) Generation 1: 0.30, 0.50, 0.20
    Generation 2: 0.20, 0.40, 0.40
    Generation 3: 0, 0.333, 0.666
(B) Generation 1: 0.36, 0.48, 0.16
    Generation 2: 0.10, 0.20, 0.20
    Generation 3: 0, 0.10, 0.30
(C) Generation 1: 0.36, 0.48, 0.16
    Generation 2: 0.20, 0.40, 0.40
    Generation 3: 0, 0.25, 0.75
(D) Generation 1: 0.36, 0.48, 0.16
    Generation 2: 0.36, 0.48, 0.16
    Generation 3: 0.36, 0.48, 0.16
4. From the example on the previous slide, what are the frequencies of alleles at each generation?
(A) Generation 1: Dominant allele (A1) = 0.6, Recessive allele (A2) = 0.4
    Generation 2: Dominant allele = 0.4, Recessive allele = 0.6
    Generation 3: Dominant allele = 0.125, Recessive allele = 0.875
(B) Generation 1: Dominant allele = 0.6, Recessive allele = 0.4
    Generation 2: Dominant allele = 0.6, Recessive allele = 0.4
    Generation 3: Dominant allele = 0.6, Recessive allele = 0.4
(C) Generation 1: Dominant allele = 0.6, Recessive allele = 0.4
    Generation 2: Dominant allele = 0.5, Recessive allele = 0.5
    Generation 3: Dominant allele = 0.25, Recessive allele = 0.75
(D) Generation 1: Dominant allele = 0.4, Recessive allele = 0.6
    Generation 2: Dominant allele = 0.5, Recessive allele = 0.5
    Generation 3: Dominant allele = 0.25, Recessive allele = 0.75
5. From the example two slides ago, which evolutionary mechanism might be operating across generations?
(A) Mutation
(B) Selection favoring A1
(C) Heterozygote advantage
(D) Selection favoring A2
(E) Inbreeding
Answers:
1. A (Parents: Aa x Aa → Offspring: AA (25%), Aa (50%), aa (25%))
2. A
3. C
4. A
5. D