Markers. Genomic Selection. Markers and selection. Progeny testing and MAS. Molecular markers. Genetic markers

Author: Nigel Goodwin
Genomic Selection
Alessandro Bagnato

Markers
•  Phenotypic markers
–  Coat color
–  Blood type
–  Polledness
•  Genetic markers / molecular markers
–  DNA
•  Can be visualised thanks to technology
•  Closely tied to the available technology

Marker and selection
•  Microsatellite markers
–  Marker Assisted Selection (MAS)
•  New technologies:
–  Sequencers
–  Genotyping techniques
•  Cow genome sequenced
–  SNP markers
–  Genomic selection
•  Cattle (dairy): outbred populations
•  Pig, poultry: inbred populations

Molecular markers
•  Molecular markers
–  Causal mutations
•  Can be studied directly through the effect their mutations have on the phenotypic expression of the trait
–  Example: the caseins in cattle (αs1, αs2, β and κ)
–  "Anonymous" markers of unknown function
•  Non-causal genomic mutations that nevertheless behave as Mendelian loci

Progeny testing and MAS
•  1967: Use of markers for genetic improvement (MAS) suggested
•  1980: Human Genome Project
•  1985: Thermal cycler developed
•  1985: QTL identified in dairy cattle
•  1990: Granddaughter design proposed
•  1990s: MAS applied, Bell locus
•  2009: Genomic selection in use

Genetic markers
–  Microsatellites, SNP, CNV
–  Microsatellites (Variable Number of Tandem Repeats - VNTR)
•  Short runs of repeated nucleotides along the DNA
•  Repeats of AC, AT, CAC and GATA are frequent
•  ACACACACACACACAC
•  ATATATATATATAT
•  CACCACCACCACCAC
•  Thousands of microsatellites in the genome
•  Mendelian loci in every respect
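A microsatellite allele is characterised by how many times its motif is repeated back to back. The sketch below is only an illustration of that idea, not a genotyping method from the slides: the function name `longest_tandem_run` and the flanking sequences are hypothetical, and real microsatellite calling works on fragment lengths or sequencing reads rather than clean strings.

```python
def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    step = len(motif)
    best = 0
    for start in range(len(sequence) - step + 1):
        run = 0
        pos = start
        # Count consecutive copies of the motif starting at this position.
        while sequence[pos:pos + step] == motif:
            run += 1
            pos += step
        best = max(best, run)
    return best


# Two hypothetical alleles of one locus that differ only in repeat number.
allele_1 = "GGT" + "CA" * 14 + "TTG"
allele_2 = "GGT" + "CA" * 16 + "TTG"
repeats = (longest_tandem_run(allele_1, "CA"), longest_tandem_run(allele_2, "CA"))
# Different repeat counts on the two chromosomes mean the animal is heterozygous.
print(repeats, "heterozygous" if repeats[0] != repeats[1] else "homozygous")
```

Because the repeat number is inherited like any other allele, counting repeats on both chromosomes is enough to tell a heterozygous from a homozygous individual at that locus.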

Genetic markers: DNA microsatellites
[Figure: microsatellite alleles in a heterozygous and in a homozygous individual. The two alleles shown differ in the number of CA/GT repeats.]
------CACACACACACACACACACACACACACA-----------GTGTGTGTGTGTGTGTGTGTGTGTGTGT-----
------CACACACACACACACACACACACACACACACA-----------GTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGT-----

Technology and Information
•  2003 human genome delivered
–  After many years (US$100's M)
•  2006 cattle genome delivered
–  Two years (US$ 50M)
•  2008/2010 reality
–  777,000+ SNP panel for cattle, ~US$ 0.0003/SNP
–  54,000+ SNP panels for pigs, sheep and chickens, ~US$ 0.003/SNP
•  2011 reality
–  1 cow sequenced for less than 10,000 US$
•  2012…. Whole genome re-sequencing genotyping
–  1 cow sequenced for less than 1,000 US$
•  2015 …. ????

•  Genomic technology
–  SNP genotyping platforms
–  Second / third generation sequencers
–  Production of genomic information at a very low cost per individual
•  Phenotype recording
–  Automation in phenotypic recording
–  E.g. NIR technology for milk quality traits
•  Trait ontology
•  Reproductive technologies
–  Embryo Transfer
–  Semen Sexing
•  Association of genomic information to phenotypic information
–  Genomic selection
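The per-SNP prices quoted for the 2008/2010 panels translate into a cost per genotyped animal by simple multiplication. The snippet below is a back-of-the-envelope sketch using only the panel sizes and unit prices from the slide; the function name is hypothetical, and since the slide gives the figures as "777,000+", "54,000+" and "~", the results are rough figures rather than quoted prices.

```python
def panel_cost_per_animal(n_snps: int, usd_per_snp: float) -> float:
    """Approximate genotyping cost for one animal: SNPs on the panel x price per SNP."""
    return n_snps * usd_per_snp


cattle_777k = panel_cost_per_animal(777_000, 0.0003)   # cattle panel
other_54k = panel_cost_per_animal(54_000, 0.003)       # pig / sheep / chicken panels

print(f"777K cattle panel: about US$ {cattle_777k:.0f} per animal")
print(f"54K panels:        about US$ {other_54k:.0f} per animal")
```

On these figures both panels land in the low hundreds of dollars per animal, even though the denser panel carries more than ten times as many SNPs.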

The Principle of 'Genomic Selection'

Training: measured traits + thousands of markers

Copyright © 2001 by the Genetics Society of America

Prediction of Total Genetic Value Using Genome-Wide Dense Marker Maps

T. H. E. Meuwissen,* B. J. Hayes† and M. E. Goddard†,‡
*Research Institute of Animal Science and Health, 8200 AB Lelystad, The Netherlands, †Victorian Institute of Animal Science, Attwood 3049, Victoria, Australia and ‡Institute of Land and Food Resources, University of Melbourne, Parkville 3052, Victoria, Australia

Manuscript received August 17, 2000
Accepted for publication January 17, 2001

ABSTRACT
Recent advances in molecular genetic techniques will make dense marker maps available and genotyping many individuals for these markers feasible. Here we attempted to estimate the effects of ~50,000 marker haplotypes simultaneously from a limited number of phenotypic records. A genome of 1000 cM was simulated with a marker spacing of 1 cM. The markers surrounding every 1-cM region were combined into marker haplotypes. Due to finite population size (Ne = 100), the marker haplotypes were in linkage disequilibrium with the QTL located between the markers. Using least squares, all haplotype effects could not be estimated simultaneously. When only the biggest effects were included, they were overestimated and the accuracy of predicting genetic values of the offspring of the recorded animals was only 0.32. Best linear unbiased prediction of haplotype effects assumed equal variances associated to each 1-cM chromosomal segment, which yielded an accuracy of 0.73, although this assumption was far from true. Bayesian methods that assumed a prior distribution of the variance associated with each chromosome segment increased this accuracy to 0.85, even when the prior was not correct. It was concluded that selection on genetic values predicted from markers could substantially increase the rate of genetic gain in animals and plants, especially if combined with reproductive techniques to shorten the generation interval.

Selection for economically important quantitative traits in animals and plants is traditionally based on phenotypic records of the individual and its relatives. Estimated breeding values, based on this phenotypic data, are commonly calculated by best linear unbiased prediction (BLUP; Henderson 1984). One justification for molecular genetics research on livestock and crop species is the expectation that information at the DNA level will lead to faster genetic gain than that achieved based on phenotypic data only. The availability of a sparse map of genetic markers has resulted in the detection of some quantitative trait loci (QTL; Georges et al. 1995). The inclusion of marker information into BLUP breeding values was demonstrated by Fernando and Grossman (1989) and is predicted to yield 8–38% extra genetic gain (Meuwissen and Goddard 1996). However, the usefulness of information from a sparse marker map in outbreeding species is limited because the linkage phase between a marker and QTL must be established for every family in which the marker is to be used for selection.

The total number of single nucleotide polymorphisms (SNP) is estimated at many millions (Halushka et al. 1999), and the advent of DNA chip technology may make genotyping of many animals for many of these markers feasible (and perhaps even cost effective). However, the precision of mapping QTL by traditional linkage analysis is little improved by the use of a very dense marker map (Darvasi et al. 1993). Therefore, a different approach is needed to efficiently use all this marker information. With a dense marker map some markers will be very close to the QTL and probably in linkage disequilibrium with it (e.g., Hastbacka et al. 1992). Therefore, some marker alleles will be correlated with positive effects on the quantitative trait across all families and can be used for selection without the need to establish linkage phase in each family. Close markers can be combined into a haplotype. Chromosome segments that contain the same rare marker haplotypes are likely to be identical by descent (IBD) and hence carry the same QTL allele. Our approach is to estimate the effect on the quantitative trait of small chromosome segments defined by the haplotypes of marker alleles that they carry.

Quantitative traits are usually affected by many genes and consequently the benefit from marker-assisted selection is limited by the proportion of the genetic variance explained by the QTL. It would be desirable to utilize all QTL affecting the trait in marker-assisted selection. However, a dense marker map defines a very large number of chromosome segments and so there will be many effects to be estimated, probably more than there are phenotypic data points from which to estimate them. The problem is essentially the same if we assume that

Application: thousands of markers + prediction equations => GEBV
(The prediction equations come from the training step.)

Corresponding author: Theo Meuwissen, Department of Animal Breeding and Genetics, DLO-Institute for Animal Science and Health, Box 65, 8200 AB Lelystad, The Netherlands. E-mail: [email protected]

Genetics 157: 1819–1829 (April 2001)

Why is the EBV (DYD) the phenotype?
P = µ + A + D + I + PE + TE
G = A + D + I (genetic effects), E = PE + TE (environmental effects)

P = phenotype
µ = factor common to all individuals (e.g. management)
A = additive genetic effects
D = dominance genetic effects
I = interaction genetic effects
PE = permanent environmental effects
TE = temporary environmental effects

Phen / Gen Variability

Animal 1: EBV => Milk Kg = 2340, Protein % = +0.12, SCC = 95
PAT. M1 M2 M3 M4 Q1 M5 M6 M7 Q2 M8 .... q/Qn … m/Mn
MAT. m1 m2 m3 m4 Q1 m5 m6 m7 Q2 m8 .... q/Qn … m/Mn

Animal 2: EBV => Milk Kg = 1870, Protein % = +0.23, SCC = 103
PAT. M1 M2 M3 m4 q1 M5 M6 M7 Q2 M8 … q/Qn … m/Mn
MAT. m1 M2 M3 m4 q1 m5 m6 m7 q2 m8 ... q/Qn … m/Mn

...

Animal n: EBV => Milk Kg = 2140, Protein % = -0.03, SCC = 101
PAT. m1 M2 M3 m4 q1 M5 M6 M7 q2 M8 … q/Qn … m/Mn
MAT. m1 m2 M3 M4 Q1 m5 M6 M7 q2 m8 ... q/Qn … m/Mn

Management + Chance + Genetics

Most markers will have no effect
[Figure: per-marker effects along the genome for two example traits. Most entries are 0.0; a few are non-zero (+0.2, +0.03, +0.5, -0.07, +0.15 in one row; +0.2, -0.3, +0.35, -0.05, -0.01, -0.1 in the other).]
Protein Content (%) = +0.12
Longevity (years) = +0.34
And so on … = ?????

E.g. Marker 10,720

Genotype            N. individuals    Phenotype
M10,720 M10,720     38                2019
M10,720 m10,720     545               1780
m10,720 m10,720     319               1689

Assume all errors cancel each other out

[Figure: phenotype means and breeding values (BV) plotted against the number of M alleles in the genotype (0 = mm, 1 = Mm, 2 = MM); y-axis from 1600 to 2100; annotations: Mean, BV, Dom Dev, Alpha = 141.7]
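The "Alpha" on the marker 10,720 plot is an allele substitution effect. The slide does not say how its value of 141.7 was obtained, so the sketch below is an assumption: it uses the textbook definition alpha = a + d(q - p), with the three genotype means and counts from the table, and it treats the third table row as the mm homozygote. The function name and dictionary layout are hypothetical.

```python
def allele_substitution_effect(counts, means):
    """Return (p, a, d, alpha) for one biallelic marker.

    p     : frequency of allele M, counted from the genotype numbers
    a     : additive value, half the difference between the two homozygotes
    d     : dominance deviation of the heterozygote from the homozygote midpoint
    alpha : a + d * (q - p), the average effect of swapping m for M
    """
    n_animals = sum(counts.values())
    p = (2 * counts["MM"] + counts["Mm"]) / (2 * n_animals)
    q = 1.0 - p
    midpoint = (means["MM"] + means["mm"]) / 2
    a = (means["MM"] - means["mm"]) / 2
    d = means["Mm"] - midpoint
    alpha = a + d * (q - p)
    return p, a, d, alpha


# Values from the marker 10,720 table (third row assumed to be mm).
counts = {"MM": 38, "Mm": 545, "mm": 319}
means = {"MM": 2019, "Mm": 1780, "mm": 1689}

p, a, d, alpha = allele_substitution_effect(counts, means)
print(f"p(M) = {p:.3f}, a = {a:.1f}, d = {d:.1f}, alpha = {alpha:.1f}")
```

Under these assumptions alpha comes out at roughly 141.9, close to but not identical to the 141.7 printed on the slide; the small gap may come from rounding or from a different way of estimating the allele frequency.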

Direct Genomic EBV
•  Training (single marker analysis)

$y = \mu \mathbf{1}_n + X_i g_i + e$

•  Test population

$\mathrm{DGEBV}_j = \sum_{i=1}^{n} x_{ij} \hat{g}_i$

•  Where
–  $x_{ij}$ is the genotype of individual j at marker i
–  $\hat{g}_i$ is the estimated effect of marker i
–  the sum runs over the n markers

Accuracy of GEBV
•  Factors influencing the accuracy of genomic selection, i.e. r(DGEBV,TBV)
–  Linkage disequilibrium between QTL and markers
•  Density of markers => whole genome
–  Method of estimation
•  Single marker / Haplotypes / IBD
•  BLUP / Bayes etc.
–  Number of records to estimate prediction equations, i.e. n. of individuals in the training population

'QTL Mapping, MAS, and Genomic Selection' Course Notes
Taught by Dr. Ben Hayes, Animal Genetics and Genomics group of the Department of Primary Industries Research Victoria (Attwood - Melbourne, Australia)
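The two steps of the Direct Genomic EBV slide (estimate one effect per marker in the training animals, then sum genotype times effect in the test animals) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the course's software: genotypes are coded as allele counts 0/1/2, each marker is fitted on its own by ordinary least squares, and all function names and data are hypothetical.

```python
def single_marker_effect(genotypes, phenotypes):
    """Least-squares slope of phenotype on allele count for one marker (y = mu + x*g + e)."""
    n = len(genotypes)
    mean_x = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    if sxx == 0:
        return 0.0  # monomorphic marker: no information, effect set to zero
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotypes, phenotypes))
    return sxy / sxx


def train(marker_columns, phenotypes):
    """Training step: one estimated effect per marker, each fitted separately."""
    return [single_marker_effect(column, phenotypes) for column in marker_columns]


def dgebv(animal_genotypes, effects):
    """Application step: sum over markers of genotype x estimated effect."""
    return sum(x * g for x, g in zip(animal_genotypes, effects))


# Toy training set: 4 animals, 2 markers (one list per marker), made-up phenotypes.
training_markers = [[0, 1, 2, 1],
                    [1, 0, 1, 2]]
training_phenotypes = [10.0, 12.0, 14.0, 12.0]

effects = train(training_markers, training_phenotypes)
candidate = [2, 1]  # genotypes of a young animal with no phenotype yet
print(effects, dgebv(candidate, effects))
```

In this toy data only the first marker tracks the phenotype, so it receives all of the estimated effect and the second marker gets zero. With real data, fitting markers one at a time lets markers in LD with each other share or double-count effects, which is why the estimation method appears on the slide as a factor affecting accuracy.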

Accuracy of genomic selection
•  Factors affecting accuracy of genomic selection r(GEBV,TBV)
–  Linkage disequilibrium between QTL and markers = density of markers
•  Haplotypes or single markers must be in sufficient LD with the QTL such that the haplotype or single markers will predict the effects of the QTL across the population.
•  Calus et al. (2007) used simulation to assess the effect of LD between QTL and markers on the accuracy of genomic selection

Accuracy of genomic selection
•  Effect of LD on accuracy of selection
[Figure: accuracy of GEBV (y-axis, 0.5 to 1) against average r² between adjacent marker loci (x-axis, 0.075 to 0.215); series label: HAP_IBS]
Courtesy of Ben Hayes, 2008

Accuracy of genomic selection
•  Factors affecting accuracy of genomic selection r(GEBV,TBV)
–  Linkage disequilibrium between QTL and markers = density of markers
–  In dairy cattle populations, an average r² of 0.2 between adjacent markers is only achieved when markers are spaced every 100 kb.
[Figure: average r² value (y-axis, 0 to 0.8) against distance in kb (x-axis, 0 to 1500) for Australian Holstein, Norwegian Red, Australian Angus, New Zealand Jersey and Dutch Holsteins]
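The r² on these plots measures linkage disequilibrium between two loci. The slides do not define it, so the sketch below uses the standard definition as an assumption: D = p(AB) - p(A)p(B) and r² = D² / (p(A)(1 - p(A)) p(B)(1 - p(B))), computed from haplotype counts. The function name and the counts are hypothetical.

```python
def ld_r_squared(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """r^2 between two biallelic loci, from counts of the four haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total      # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total      # frequency of allele B at locus 2
    d = p_AB - p_A * p_B             # disequilibrium coefficient D
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("r^2 is undefined when either locus is monomorphic")
    return d * d / denominator


# Made-up haplotype counts: alleles A and B tend to travel together.
linked = ld_r_squared(40, 10, 10, 40)
# Haplotypes in equal proportions: the two loci are independent.
unlinked = ld_r_squared(25, 25, 25, 25)
print(f"r2 (associated loci) = {linked:.2f}, r2 (independent loci) = {unlinked:.2f}")
```

An r² near 1 means a marker allele predicts the allele at the neighbouring locus almost perfectly, while values near 0 mean it carries no information about it; the curves on the slide show how quickly this association decays with physical distance in each breed.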

Accuracy of genomic selection
•  Factors affecting accuracy of genomic selection r(GEBV,TBV)
–  Linkage disequilibrium between QTL and markers = density of markers
–  In dairy cattle populations, an average r² of 0.2 between adjacent markers is only achieved when markers are spaced every 100 kb.
–  Bovine genome is approximately 3,000,000 kb
–  Implies that on the order of 30,000 markers are required for genomic selection to achieve accuracies of 0.8!
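The 30,000-marker figure follows directly from dividing the genome length by the marker spacing needed for an average r² of 0.2. The snippet below reproduces that arithmetic with the slide's own numbers; the function names are hypothetical, and evenly spaced markers are a simplifying assumption.

```python
import math


def markers_needed(genome_length_kb: float, spacing_kb: float) -> int:
    """Number of evenly spaced markers needed to cover the genome at a given spacing."""
    return math.ceil(genome_length_kb / spacing_kb)


def average_spacing_kb(genome_length_kb: float, n_markers: int) -> float:
    """Average distance between adjacent markers for a panel of a given size."""
    return genome_length_kb / n_markers


BOVINE_GENOME_KB = 3_000_000   # approximate genome size quoted on the slide

required = markers_needed(BOVINE_GENOME_KB, spacing_kb=100)
spacing_777k = average_spacing_kb(BOVINE_GENOME_KB, 777_000)
print(f"markers needed at 100 kb spacing: {required}")
print(f"average spacing of a 777,000-SNP panel: {spacing_777k:.2f} kb")
```

By the same arithmetic, the 777,000-SNP cattle panel mentioned earlier in the deck would put markers about 4 kb apart on average, well inside the 100 kb spacing.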

Accuracy of genomic selection
•  Comparing the accuracy of genomic selection with
–  IBD approach
–  haplotypes
–  single markers
–  Calus et al. (2007) used simulated data
[Figure: accuracy of GEBV (y-axis, 0.5 to 1) against average r² between adjacent marker loci (x-axis, 0.075 to 0.215) for SNP1, HAP_IBS and HAP_IBD. Courtesy of Ben Hayes, 2008]

Accuracy of genomic selection
•  Number of records used to estimate chromosome segment effects
–  Chromosome segment effects g_i estimated in a reference population
–  How big does this reference population need to be?
–  Meuwissen et al. (2001) evaluated accuracy using LS, BLUP and BayesB with 500, 1000 or 2200 records in the reference population

No. of phenotypic records                  500      1000     2200
Least squares                              0.124    0.204    0.318
Best linear unbiased prediction (BLUP)     0.579    0.659    0.732
BayesB                                     0.708    0.787    0.848

[Figure: number of phenotypic records necessary to achieve a given accuracy (y-axis, 0 to 21,000) against heritability (x-axis, 0 to 1), with curves for accuracy of GEBV 0.7 and 0.5. Courtesy of Ben Hayes, 2008]

Technology and information

Genomic selection
•  An IBD approach
•  Factors affecting accuracy of genomic selection
•  How often to re-estimate the chromosome segment effects?
•  Non-additive effects
•  Genomic selection with low marker density
•  Genomic selection across breeds
•  Cost effective genomic selection
•  Optimal breeding program design with genomic selection

March 10-14, 2008, The Animal Breeding and Genomics Centre (Animal Sciences Group - Wageningen University and Research Centre), Lelystad, The Netherlands

The SNP Genotyping technology

Why Genomic Selection?
•  Where trait measurements are:
–  Expensive
–  Difficult
–  Only measured in one sex
–  Only possible late in life
–  Only possible on relatives of the animals available for selection
•  For traits such as:
–  Product composition
–  Fertility
–  Longevity
–  Meat quality
–  Disease resistance
•  Low heritability traits
–  i.e. low signal:noise ratio
•  Where inheritance is complex
•  Where incidence is sporadic

Why Genomic Selection?
•  Higher reliability of proofs (EBV)
–  Possibility to use a group of unproven bulls
–  Better choice of bull dams
•  Shorter generation interval
–  Use of young bulls
•  Lower progeny testing costs
–  Smaller groups of young bulls sent to progeny testing
–  Reduction in generation interval

Key Questions about GS
•  Training
–  How many animals?
–  How many markers?
–  How many generations?
•  Application
–  How accurate?
–  How stable over time?
–  How robust across populations and environments?
•  E.g. Genotype x Environment interaction

From Muir et al. 2010 Interbull Bulletin 41

How can we improve genomic selection???

Italian Holstein (Razza Frisona Italiana): selection scheme
[Diagram labels: 1 million registered cows; 500,000 not registered; 23% Pop. Reg; top 2% bull dams; ANAFI Genetic Centre; top 1% Italian bulls and top 1% international bulls; 400 (T) 100 (G) young bulls; 500 (T) 2000 (G) young bulls; progeny test (T); top 5% (T+G)]

Some markers in livestock
•  Caseins
–  The casein genes were isolated and fully sequenced about 15 years ago; new variants discovered
•  β-lactoglobulin
–  At least 6 genetic variants
•  Effect on the casein content of milk and on processing into cheese

Some markers in livestock
•  DUMPS (Deficiency of Uridine Monophosphate Synthase)
–  Holstein only
•  Healthy carriers produce more milk
•  In homozygotes, increased mortality in the first two months of gestation
•  WEAVER (progressive degenerative myeloencephalopathy)
–  Brown Swiss (Bruna) breed only
•  Healthy carriers produce more milk (700 kg per year)
•  Gene not yet isolated, but a marker has been identified
•  BLAD (Bovine Leukocyte Adhesion Deficiency)
–  Mutation present in > 2% of Holstein animals

Some markers in livestock
•  PSS-MH (Porcine Stress Syndrome, Malignant Hyperthermia)
–  Compromises the use of the meat by the processing industry
–  Gene identified a few years ago
•  MH (muscular hypertrophy)
•  Booroola
–  Australian Merino breed
–  FF homozygotes: twice the number of offspring born

Diagnosis with markers
•  E.g. "Complex Vertebral Malformation"
•  Recessive disorder identified recently
•  Carlin-M Ivanhoe Bell is a carrier and the sire of many bulls used for breeding
•  Damage to the selection programme
