Computational Genetics Lecture 1

Author: Noah Fox

Background Readings: Chapters 2 & 3 of An Introduction to Genetics, Griffiths et al. 2000, Seventh Edition (CS/Fishbach/Other libraries). This class has been edited from several sources, primarily Terry Speed’s homepage at Stanford and the Technion course “Introduction to Genetics”. Changes made by Dan Geiger.

Course Information
Meetings:
• Lecture, by Dan Geiger: Thursdays 14:30–16:30, Taub 4.
• Tutorial, by Ma’ayan Fishelson: Thursdays 16:30–17:30.
Grade:
• 50% in five question sets. These question sets are obligatory. Each contains 4-6 theoretical problems. Submit in pairs within two weeks.
• 50% take-home exam. (A few students may be allowed to replace it with a seminar lecture.)
Information and handouts:
• http://webcourse.technion.ac.il/236608/
• A brochure with xeroxed material (if needed) at the Taub library.

Course Prerequisites
Computer Science and Probability Background
• Algorithms 1 (cs234247)
• Probability (any course)
• Algorithms in computational biology (or take in parallel).
Some Biology Background
• Formally: none, to allow CS students to take this course.
• Recommended: Introduction to Genetics (or in parallel).

Course Goals
• Learning about computational and mathematical methods for genetic analysis.
• We will focus on gene hunting: finding genes for simple human diseases.
• Methods covered in depth: linkage analysis (using pedigree data) and association analysis (using random samples).
• Another goal is to learn more about the use of Bayesian networks for genetic linkage analysis.

Human Genome
Most human cells contain 46 chromosomes:
• 2 sex chromosomes (X, Y): XY in males, XX in females.
• 22 pairs of chromosomes, named autosomes.

Genetic Information
• Gene: the basic unit of genetic information. Genes determine the inherited characters.
• Genome: the collection of genetic information.
• Chromosomes: storage units of genes.

Sexual Reproduction
[Figure: sexual reproduction cycle, with labels: meiosis, gametes (egg, sperm), zygote.]

Source: Alberts et al

The Double Helix


Central Dogma
Gene → (transcription) → mRNA → (translation) → Protein

Cells express different subsets of the genes in different tissues and under different conditions.

Chromosome Logical Structure
• Marker: genes, SNPs, tandem repeats.
• Locus: the location of a marker.
• Allele: one variant form of a marker.

Example: Locus 1 has possible alleles A1, A2; Locus 2 has possible alleles B1, B2, B3.

Alleles - the ABO locus example

| Phenotype | Genotype |
|---|---|
| A | A/A, A/O |
| B | B/B, B/O |
| AB | A/B |
| O | O/O |

O is recessive to A. A is dominant over O. A and B are codominant. Multiple alleles: A, B, O. Trait = Character = Phenotype
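The genotype-to-phenotype table for the ABO locus can be written as a small lookup. The following is a minimal sketch; the function name `abo_phenotype` is a hypothetical helper introduced here for illustration and is not part of the lecture. It applies the dominance relations between A, B and O, and treats A and B as codominant.

```python
def abo_phenotype(allele1: str, allele2: str) -> str:
    """Map an unordered ABO genotype (two alleles) to its phenotype."""
    alleles = {allele1, allele2}
    if not alleles <= {"A", "B", "O"}:
        raise ValueError("alleles must be 'A', 'B' or 'O'")
    if alleles == {"A", "B"}:
        return "AB"  # A and B are codominant
    if "A" in alleles:
        return "A"   # A/A or A/O: A is dominant over O
    if "B" in alleles:
        return "B"   # B/B or B/O: B is dominant over O
    return "O"       # only O/O shows the recessive phenotype


for genotype in [("A", "A"), ("A", "O"), ("B", "O"), ("A", "B"), ("O", "O")]:
    print("/".join(genotype), "->", abo_phenotype(*genotype))
```

A set is used for the genotype because the order of the two alleles does not matter.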

Concepts:
1. Recessive and dominant alleles. When both the recessive and the dominant allele are present in a cell, the phenotype determined by the dominant allele prevails.
2. AA and aa are homozygotes for the dominant and the recessive allele, respectively. Aa is a heterozygote.
3. Multiple alleles (A, B, O).

X-linked (sex linkage)
Genotype and phenotype:
• b is the dominant allele. Namely, (b,b) and (b,w) are Black.
• w is the recessive allele. Namely, only (w,w) is White.
This is an example of an X-linked (sex-linked) trait/character. For males, b alone is Black and w alone is White. There is no homologous gene on the Y chromosome.

Mendel’s Work
Modern genetics began with Mendel’s experiments on garden peas (although the ramifications of his work were not realized during his lifetime). He studied seven contrasting pairs of characters, including:
• The form of ripe seeds: round, wrinkled
• The color of the seed albumen: yellow, green
• The length of the stem: long, short

Mendel, Gregor. 1866. Experiments on Plant Hybridization. Transactions of the Brünn Natural History Society.

Mendel’s first law
Characters are controlled by pairs of genes which separate during the formation of the reproductive cells (meiosis).
[Figure: an Aa individual produces gametes A and a.]

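P: AA × aa
F1: Aa

F1 × F1: Aa × Aa
Gametes: A, a from each parent
F2:

|   | A | a |
|---|---|---|
| A | AA | Aa |
| a | Aa | aa |

Genotype: 1 AA : 2 Aa : 1 aa

Test cross: Aa × aa
Gametes: A, a from the Aa parent; a from the aa parent
Offspring: Aa, aa
Phenotype: 1 A : 1 a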
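The F1 × F1 cross and the test cross can be checked by enumerating the equally likely gamete pairs. This is a minimal sketch under Mendel's first law; the helper names `cross` and `dominant_fraction` are assumptions introduced here, and genotypes are written as two-letter strings with the uppercase letter for the dominant allele.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict:
    """Offspring genotype probabilities for a one-locus cross."""
    counts = Counter()
    for g1, g2 in product(parent1, parent2):
        # Sorting puts the uppercase allele first, so "aA" becomes "Aa".
        counts["".join(sorted(g1 + g2))] += 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def dominant_fraction(offspring: dict, dominant: str = "A") -> Fraction:
    """Probability that an offspring shows the dominant phenotype."""
    return sum(p for genotype, p in offspring.items() if dominant in genotype)


f2 = cross("Aa", "Aa")          # F1 x F1
test_cross = cross("Aa", "aa")  # F1 x recessive parent
print(f2, dominant_fraction(f2))
print(test_cross, dominant_fraction(test_cross))
```

Exact fractions are used so that the 1:2:1 and 1:1 ratios appear without rounding.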

Concepts:
1. Crossing F1 with itself: in the F2 generation, the ratio between offspring showing the dominant phenotype and those showing the recessive phenotype is 3:1.
2. Test cross: crossing F1 offspring with the parent that has the recessive phenotype. The ratio between offspring showing the dominant phenotype and those showing the recessive phenotype is 1:1.

Mendel's first law. Results of crosses in which parents differed for one character

| Parental phenotype | F1 | F2 | F2 ratio |
|---|---|---|---|
| 1. Round × wrinkled seeds | round | 5474 round; 1850 wrinkled | 2.96:1 |
| 2. Yellow × green seeds | yellow | 6022 yellow; 2001 green | 3.01:1 |
| 3. Purple × white petals | purple | 705 purple; 224 white | 3.15:1 |
| 4. Inflated × pinched pods | inflated | 882 inflated; 299 pinched | 2.95:1 |
| 5. Green × yellow pods | green | 428 green; 152 yellow | 2.82:1 |
| 6. Axial × terminal flowers | axial | 651 axial; 207 terminal | 3.14:1 |
| 7. Long × short stems | long | 787 long; 277 short | 2.84:1 |

Conclusion, first law: The two members of a gene pair segregate from each other into the gametes.
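The F2 ratios in the table follow directly from the counts. The short sketch below recomputes them; the variable names are illustrative, and the pooled ratio over all seven crosses is an extra summary that does not appear on the slide.

```python
# (dominant count, recessive count) in F2 for each of the seven crosses
f2_counts = {
    "round vs wrinkled seeds": (5474, 1850),
    "yellow vs green seeds": (6022, 2001),
    "purple vs white petals": (705, 224),
    "inflated vs pinched pods": (882, 299),
    "green vs yellow pods": (428, 152),
    "axial vs terminal flowers": (651, 207),
    "long vs short stems": (787, 277),
}

# Ratio of dominant to recessive offspring, rounded as in the table
ratios = {name: round(dom / rec, 2) for name, (dom, rec) in f2_counts.items()}

# Pooled over all crosses (not on the slide)
total_dominant = sum(dom for dom, _ in f2_counts.values())
total_recessive = sum(rec for _, rec in f2_counts.values())
pooled_ratio = round(total_dominant / total_recessive, 2)

for name, ratio in ratios.items():
    print(f"{name}: {ratio}:1")
print(f"pooled: {pooled_ratio}:1")
```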

Example of a pedigree with a recessive mutation (a marriage between cousins).

Polydactyly – A dominant mutation


Brachydactyly – A dominant mutation


Maximum Likelihood Principle
• What is the probability of the data for this pedigree, assuming a recessive mutation?
• What is the probability of the data for this pedigree, assuming a dominant mutation?

Maximum likelihood principle: choose the model that maximizes the probability of the data.

One locus: founder probabilities
Founders are individuals whose parents are not in the pedigree. They may or may not be typed (namely, have their genotype measured). Either way, we need to assign probabilities to their actual or possible genotypes. This is usually done by assuming Hardy-Weinberg equilibrium (H-W). If the frequency of D is .01, then H-W says, for a founder 1 with genotype Dd:

pr(Dd) = 2 × .01 × .99

Genotypes of founder couples are (usually) treated as independent. For a founder couple with pop 1 (Dd) and mom 2 (dd):

pr(pop Dd, mom dd) = (2 × .01 × .99) × (.99)^2
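The two founder probabilities above can be reproduced with a few lines of code. This is a minimal sketch assuming Hardy-Weinberg equilibrium at a two-allele locus; the function name `hardy_weinberg` is introduced here for illustration only.

```python
def hardy_weinberg(freq_D: float) -> dict:
    """Genotype probabilities at a two-allele locus under H-W equilibrium."""
    p, q = freq_D, 1.0 - freq_D
    return {"DD": p * p, "Dd": 2 * p * q, "dd": q * q}


founder = hardy_weinberg(0.01)

# A single Dd founder: 2 x .01 x .99
pr_Dd = founder["Dd"]

# A founder couple is treated as independent, so the probabilities multiply.
pr_couple = founder["Dd"] * founder["dd"]

print(pr_Dd, pr_couple)
```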

One locus: transmission probabilities
Children get their genes from their parents’ genes, independently, according to Mendel’s laws; also independently for different children. For a pedigree with pop 1 (Dd), mom 2 (Dd) and kid 3 (dd):

pr(kid 3 dd | pop 1 Dd & mom 2 Dd) = 1/2 × 1/2

One locus: transmission probabilities - II
For a pedigree with parents 1 (Dd) and 2 (Dd) and children 3 (dd), 4 (Dd) and 5 (DD):

pr(3 dd & 4 Dd & 5 DD | 1 Dd & 2 Dd) = (1/2 × 1/2) × (2 × 1/2 × 1/2) × (1/2 × 1/2).

The factor 2 comes from summing over the two mutually exclusive and equiprobable ways 4 can get a D and a d.
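The transmission probabilities, including the factor 2 for the heterozygous child, can be obtained by counting the four equally likely ways of drawing one allele from each parent. This is a minimal sketch; the function name `transmission` and the two-letter genotype strings are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def transmission(child: str, father: str, mother: str) -> Fraction:
    """pr(child genotype | parental genotypes) under Mendel's first law."""
    target = sorted(child)
    # Each parent passes one of its two alleles with probability 1/2,
    # so the four (paternal, maternal) allele pairs are equally likely.
    ways = sum(1 for a, b in product(father, mother) if sorted(a + b) == target)
    return Fraction(ways, len(father) * len(mother))


# Children are independent given the parents, so the probabilities multiply.
pr_children = (
    transmission("dd", "Dd", "Dd")
    * transmission("Dd", "Dd", "Dd")
    * transmission("DD", "Dd", "Dd")
)
print(pr_children)
```

For the Dd child, two of the four allele pairs match (D from the father and d from the mother, or the reverse), which is where the factor 2 comes from.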

One locus: penetrance probabilities
Pedigree analyses usually suppose that, given the genotype at all loci, and in some cases age and sex, the chance of having a particular phenotype depends only on the genotype at one locus, and is independent of all other factors: genotypes at other loci, environment, genotypes and phenotypes of relatives, etc.

Complete penetrance: pr(affected | DD) = 1
Incomplete penetrance: pr(affected | DD) = .8

One locus: penetrance - II
Age- and sex-dependent penetrance (liability classes). For a 45-year-old male with genotype DD:

pr(affected | DD, male, 45 y.o.) = .6

Incomplete penetrance:
An example of a dominant mutation in which the mutant phenotype is not always expressed.
[Figure: pedigree in which a healthy woman transmits the dominant mutation to her daughter.]

One locus: putting it all together
Pedigree: parents 1 (Dd) and 2 (Dd); children 3 (dd), 4 (Dd) and 5 (DD).

Assume penetrances pr(affected | dd) = .1, pr(affected | Dd) = .3 and pr(affected | DD) = .8, and that allele D has frequency .01. The probability of the data for this pedigree, assuming penetrances α1 = 0.1 (for dd) and α2 = 0.3 (for Dd), is the product:

(2 × .01 × .99 × .7) × (2 × .01 × .99 × .3) × (1/2 × 1/2 × .9) × (2 × 1/2 × 1/2 × .7) × (1/2 × 1/2 × .8)

The factors correspond, in order, to an unaffected Dd founder, an affected Dd founder, the unaffected dd child 3, the unaffected Dd child 4 and the affected DD child 5. This is a function of the penetrances. By the maximum likelihood principle, the values of α1 and α2 that maximize this probability are the ML estimates.
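The product for this pedigree can be written as a function of the two penetrances and then maximized numerically. This is only a sketch under stated assumptions: the affection statuses (one Dd founder unaffected, the other affected, children 3 and 4 unaffected, child 5 affected) are read off the factors .7, .3, .9, .7 and .8 in the product, and the function name, parameter names and the coarse grid search are choices made here for illustration, not the lecture's method.

```python
def pedigree_likelihood(alpha1, alpha2, alpha_DD=0.8, freq_D=0.01):
    """Probability of the pedigree data as a function of the penetrances.

    alpha1 = pr(affected | dd), alpha2 = pr(affected | Dd).
    """
    pr_Dd = 2 * freq_D * (1 - freq_D)            # H-W founder probability
    founder_unaffected = pr_Dd * (1 - alpha2)    # Dd founder, unaffected
    founder_affected = pr_Dd * alpha2            # Dd founder, affected
    child3 = 0.5 * 0.5 * (1 - alpha1)            # dd child, unaffected
    child4 = 2 * 0.5 * 0.5 * (1 - alpha2)        # Dd child, unaffected
    child5 = 0.5 * 0.5 * alpha_DD                # DD child, affected
    return founder_unaffected * founder_affected * child3 * child4 * child5


# Value at the penetrances used on the slide
slide_value = pedigree_likelihood(0.1, 0.3)

# Coarse grid search for the maximizing (alpha1, alpha2)
grid = [i / 100 for i in range(101)]
best_alpha1, best_alpha2 = max(
    ((a1, a2) for a1 in grid for a2 in grid),
    key=lambda pair: pedigree_likelihood(*pair),
)
print(slide_value, best_alpha1, best_alpha2)
```

With a single small pedigree the maximizing values are crude: the likelihood is proportional to (1 - α1) × α2 × (1 - α2)^2, so the search pushes α1 to the edge of the grid. In practice such estimates would be based on many pedigrees.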

Mendel’s second law
When two or more pairs of genes segregate simultaneously, they do so independently. For an individual with genotype Aa; Bb, the gamete probabilities are:

AB: P_AB = P_A × P_B
Ab: P_Ab = P_A × P_b
aB: P_aB = P_a × P_B
ab: P_ab = P_a × P_b

Mendel's second law. A dihybrid cross for color and shape of pea seeds

P: wrinkled and yellow (rrYY) × round and green (RRyy)
F1: round yellow (Rr Yy) × round yellow (Rr Yy)
F2:

| Phenotype | Count |
|---|---|
| round yellow | 315 |
| round green | 108 |
| wrinkled yellow | 101 |
| wrinkled green | 32 |
| Total | 556 |

a. Check the segregation pattern for each gene in F2:
416 yellow : 140 green (2.97:1)
423 round : 133 wrinkled (3.18:1)

Conclusion: both traits behave as single genes, each carrying two different alleles.

Question: Is there independent assortment of alleles of the different genes? If so:
• Probability to get yellow is 3/4; probability to get round is 3/4; probability to get yellow round is 3/4 × 3/4, namely 9/16.
• Probability to get yellow is 3/4; probability to get wrinkled is 1/4; probability to get yellow wrinkled is 3/4 × 1/4, namely 3/16.
• Probability to get green is 1/4; probability to get round is 3/4; probability to get green round is 1/4 × 3/4, namely 3/16.
• Probability to get green is 1/4; probability to get wrinkled is 1/4; probability to get green wrinkled is 1/4 × 1/4, namely 1/16.
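The 9:3:3:1 probabilities can also be derived by enumerating the 16 equally likely gamete pairs of the Rr Yy × Rr Yy cross. This is a minimal sketch assuming independent assortment, with round (R) and yellow (Y) dominant as in the F1 phenotype; the names `GAMETES` and `phenotype` are introduced here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Under independent assortment an Rr Yy parent produces these four
# gametes with probability 1/4 each.
GAMETES = ["RY", "Ry", "rY", "ry"]


def phenotype(gamete1: str, gamete2: str) -> str:
    """Seed phenotype of the offspring formed by two gametes."""
    shape = "round" if "R" in gamete1[0] + gamete2[0] else "wrinkled"
    color = "yellow" if "Y" in gamete1[1] + gamete2[1] else "green"
    return f"{color} {shape}"


counts = Counter(phenotype(g1, g2) for g1, g2 in product(GAMETES, repeat=2))
f2_probs = {ph: Fraction(n, 16) for ph, n in counts.items()}

for ph, prob in f2_probs.items():
    print(ph, prob)
```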

A standard presentation in terms of counts

| Phenotype | Expected ratio | Expected count | Observed count |
|---|---|---|---|
| yellow round | 9 | 312.75 | 315 |
| yellow wrinkled | 3 | 104.25 | 101 |
| green round | 3 | 104.25 | 108 |
| green wrinkled | 1 | 34.75 | 32 |
| Total | 16 | 556 | 556 |

Conclusion, second law: Different gene pairs assort independently in gamete formation.
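One common way to judge how close the observed counts are to the expected ones is Pearson's chi-square statistic. The slide itself only compares the counts by eye, so the test below is an addition, not part of the lecture. The sketch recomputes the expected counts from the 9:3:3:1 ratio and compares the statistic with 7.815, the standard 5% critical value for 3 degrees of freedom.

```python
observed = {
    "yellow round": 315,
    "yellow wrinkled": 101,
    "green round": 108,
    "green wrinkled": 32,
}
ratio = {
    "yellow round": 9,
    "yellow wrinkled": 3,
    "green round": 3,
    "green wrinkled": 1,
}

total = sum(observed.values())
# Expected count = total x (share of the class in the 9:3:3:1 ratio)
expected = {ph: total * r / 16 for ph, r in ratio.items()}

# Pearson's statistic: sum of (observed - expected)^2 / expected
chi_square = sum(
    (observed[ph] - expected[ph]) ** 2 / expected[ph] for ph in observed
)

CRITICAL_5PCT_DF3 = 7.815  # chi-square critical value, 3 degrees of freedom
consistent_with_9331 = chi_square < CRITICAL_5PCT_DF3
print(expected, round(chi_square, 3), consistent_with_9331)
```

A statistic far below the critical value means the data give no reason to reject independent assortment.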

“Exceptions” to Mendel’s Second Law
Morgan’s fruit fly data (1909): 2,839 flies
Eye color: A = red, a = purple
Wing length: B = normal, b = vestigial

P: AABB × aabb
Test cross: AaBb × aabb

| Offspring genotype | Expected | Observed |
|---|---|---|
| AaBb | 710 | 1,339 |
| Aabb | 710 | 151 |
| aaBb | 710 | 154 |
| aabb | 710 | 1,195 |

The pair AB sticks together more than expected from Mendel’s law.

Morgan’s explanation
[Figure: chromosome diagrams for the P, F1 and F2 generations, with the alleles A/a and B/b drawn on the same chromosome pair; the recombinant F2 chromosomes are marked “Crossover has taken place”.]

Parental types: AaBb, aabb
Recombinants: Aabb, aaBb

The proportion of recombinants between the two genes (or characters) is called the recombination fraction between these two genes. It is usually denoted by r or θ. For Morgan’s traits:

r = (151 + 154)/2839 = 0.107

If r < 1/2, the two genes are said to be linked. If r = 1/2, there is independent segregation (Mendel’s second law).
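The recombination fraction for Morgan's data is the share of recombinant offspring among all offspring. A minimal sketch with illustrative variable names:

```python
# Observed offspring counts from Morgan's test cross AaBb x aabb
observed = {"AaBb": 1339, "Aabb": 151, "aaBb": 154, "aabb": 1195}
recombinant_types = ("Aabb", "aaBb")

total = sum(observed.values())
recombinants = sum(observed[g] for g in recombinant_types)

# Recombination fraction: proportion of recombinants among all offspring
r = recombinants / total

# r < 1/2 indicates linkage; r = 1/2 is independent segregation
linked = r < 0.5
print(total, recombinants, round(r, 3), linked)
```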

Recombination Phenomenon (Happens during Meiosis)
[Figure: recombination during meiosis in a male or female produces a recombinant haplotype in the gametes (egg or sperm).]

Paired chromosomes showing a chiasma.
The chiasma is the cytological manifestation of crossing over.

Example: ABO, AK1 on Chromosome 9
[Figure: pedigree of individuals 1-5 typed for ABO blood group (A or O) and AK1 genotype (A1/A1, A1/A2 or A2/A2), annotated “Phase inferred” and “Recombinant”.]

Recombination fraction is 12/100 in males and 20/100 in females. One centimorgan means one recombination every 100 meioses. One centimorgan corresponds to approximately 1M nucleotides (with large variance), depending on location and sex.

Standard symbols used in pedigrees