Development of tools and strategies towards marker assisted selection and gene cloning. Bart Brugmans

Promotor: Prof. dr. R.G.F. Visser, Professor of Plant Breeding
Co-promotor: Dr. ir. H.J. van Eck, Assistant Professor, Laboratory of Plant Breeding
Doctoral committee: Dr. J. Peleman (Keygene, Wageningen); Prof. dr. ir. P. Stam (Wageningen Universiteit); Prof. dr. A.G.M. Gerats (Radboud Universiteit Nijmegen); Prof. dr. M. Groenen (Wageningen Universiteit)
This research was conducted within the graduate school 'Experimental Plant Sciences'.

Thesis submitted in fulfilment of the requirements for the degree of doctor, by the authority of the Rector Magnificus of Wageningen Universiteit, Prof. dr. M.J. Kropff, to be defended in public on Friday 23 September 2005 at four o'clock in the afternoon in the Aula.

CIP-DATA Koninklijke Bibliotheek, Den Haag
Development of tools and strategies towards marker assisted selection and gene cloning.
Brugmans B.
Thesis Wageningen University, The Netherlands
With references - with summaries in Dutch and English
ISBN 90-8504-227-5

Contents

Chapter 1 General introduction
Chapter 2 Exploitation of a marker dense linkage map of potato for positional cloning of a wart disease resistance gene
Chapter 3 A novel method for the construction of genome wide transcriptome maps
Chapter 4 Genetic mapping and expression analyses of resistance gene loci in potato using NBS-profiling
Chapter 5 A new and versatile method for the successful conversion of AFLP markers into simple single locus markers
Chapter 6 Summary and concluding remarks
Samenvatting (summary in Dutch)
Nawoord (afterword)
Curriculum vitae
List of Publications

Chapter 1 General introduction


History of the genetics of linkage mapping

Before markers could be used in breeding, there had to be a general understanding of genes and of their inheritance from parents to progeny. This understanding of genetics started with the definition of a gene by Mendel more than a century ago. Summarized in his two laws, a gene was recognized as a “particulate factor” that passes unchanged from parent to progeny (Mendel, 1875). As with genes, one copy of each chromosome is inherited by the progeny from each parent. This equivalence led to the discovery that chromosomes in fact carry the genes (Bateson, 1905). The next step was the demonstration that each chromosome consists of a linear array of genes (Sturtevant, 1913). Mendel’s laws predicted that genes located on different chromosomes segregate independently, whereas genes on the same chromosome show linked inheritance.

Genetics has provided plant breeding with a scientific basis and allowed the development of rational selection methods to complement the intuitive selection through the breeder’s eye. Additionally, genetics gave a better understanding of the relation between the phenotype and the underlying genotype. For a qualitative character, the relation between the phenotype and the genotype of the parents is easily recognized from the simple numerical ratios observed in the segregating progeny, as pointed out by Mendel. For most agronomically important traits, however, the genetic variation shows a quantitative pattern, and qualitative inheritance may even be considered exceptional. Quantitative traits cannot be described in discrete phenotypic classes, but are described through the trait values of single individuals, which are regarded as random draws from a continuous distribution.

To detect the gene or genes involved in a trait, to study the influence of a gene in another genetic background and finally to clone and sequence the gene, molecular markers are an important tool: they detect the presence or absence of the gene independently of the phenotype or the environment.

Molecular markers

Iso-enzymes can be considered the first example of molecular markers (Markert and Moller, 1959). These markers have a neutral phenotype, which implies that they can be used to predict phenotypic variation without contributing to the phenotype themselves. The major drawback of iso-enzymes was the limited number of polymorphic loci that could be detected, which encouraged the search for a marker type that would supply breeders and researchers with more markers. The discovery of restriction enzymes (Smith & Wilcox, 1970) formed the basis of a new class of markers, which are based upon the


presence of DNA sequence variation. This DNA sequence variation is monitored as changes in the length of DNA fragments produced by restriction endonucleases. The method has, therefore, been termed ‘Restriction Fragment Length Polymorphisms’ (RFLP) (Grodzicker et al. 1974; Botstein et al. 1980). The requirement for relatively large amounts of pure DNA, the lack of polymorphism revealed in some species and the time consuming and laborious nature of RFLP analyses have prompted the search for more efficient marker systems.

After the development of the polymerase chain reaction (PCR) (Mullis & Faloona, 1987) a wide variety of PCR based techniques were developed, such as simple sequence repeat polymorphism (SSR) (Tautz, 1989), random amplified polymorphic DNA (RAPD) (Welsh and McClelland 1990; Williams et al. 1990) and Amplified Fragment Length Polymorphism (AFLP) (Vos et al. 1995). All these techniques rely on the exponential amplification of a stretch of DNA using primers, exploiting the advantage of PCR that only a small amount of DNA is needed per reaction. Neither RAPD nor AFLP requires prior sequence information, which makes them the most suitable techniques for generating markers in a species for which little or no sequence information is available. The RAPD method is based on the amplification of fragments using short arbitrary primers, which makes the technique easy to perform. Its drawback, however, is the doubtful reproducibility between laboratories and PCR machines. This is in contrast with the AFLP® technology, which is very robust and highly reproducible (Jones et al. 1997).

The AFLP assay combines the principles of the RFLP and the RAPD techniques. As in RFLP, fragment size is determined by the positions of restriction sites, and as in RAPD, a subset of amplicons is selected by specific primers. For AFLP the DNA is cut using restriction enzymes and adapter oligonucleotides are ligated to the overhanging ends. These adapters serve as a unique and uniform annealing site for PCR primers, allowing stringent annealing temperatures during the PCR cycle. The complexity of the mixture of fragments is reduced by selective PCR amplification, for which selective nucleotides are added to the primers. Because many different combinations of enzymes and selective nucleotides can be used, AFLP provides a virtually unlimited number of fragments.

The robustness conferred by restriction digestion and adapter ligation (the anchor) and the applicability to a wide variety of species have made AFLP one of the most common techniques for marker detection and have resulted in a variety of anchor related techniques. One example is TE-AFLP, which uses three restriction enzymes instead of two to reduce the complexity, but still uses two primers with selective nucleotides like AFLP (van der Wurff et al. 2000). A second application is the generation of fingerprints, using anchor PCR without selective nucleotides, from DNA fragments of approximately 100 kb (Vos et al. 1995) isolated from bacterial artificial chromosome (BAC) libraries. The identification of conserved DNA sequence domains made it possible to develop domain directed fingerprinting techniques, in which a primer based on a specific sequence domain is used, replacing


one of the primers with selective nucleotides to amplify a subset of the genome. Examples are SAMPL (Witsenboer et al. 1997), which combines primers based on SSR motifs with AFLP; SSAP (Waugh et al. 1997), which uses a primer based upon the BARE-1 motif; and NBS-profiling (Hayes and Saghai Maroof, 2000; van der Linden et al. 2004), which uses primers based upon the conserved domains of the NBS motif in combination with AFLP.

The cloning and sequencing of a gene

To study the molecular structure of a gene in more detail, the gene has to be cloned. This is done with the aid of a large insert library, most commonly a bacterial artificial chromosome (BAC) library (Shizuya et al. 1992). Such a library contains fragments of the genome of interest with a length between 80 and 300 kb, which can be multiplied efficiently to provide material for research. The complete BAC library contains several copies of the genome, from which the BAC clone containing the gene of interest has to be isolated. To locate this BAC, the genetically most closely linked markers are used to perform chromosome landing (Tanksley et al. 1995). The BACs identified by these flanking markers are used as a starting point for BAC walking, in which the BAC ends are converted into new markers that are used to detect new BACs. Finally an overlapping set of BAC clones (a contig) enclosing the gene of interest is assembled. Subsequently, candidate genes can be isolated from these BAC clones. Evidence that the target gene has been cloned is obtained when transformation of the candidate gene into a recessive genotype results in complementation of the phenotype. This so called map based cloning procedure has proven successful for the isolation of resistance genes in several plant species such as tomato (Martin et al. 1993; Dixon et al. 1996), Arabidopsis (Bent et al. 1994; Mindrinos et al. 1994; Grant et al. 1995), sugar beet (Cai et al. 1997) and barley (Büschges et al. 1997).

The success rate of the construction of a contig enclosing the gene, and the amount of work it requires, depend mainly on the physical distance between the flanking markers and the gene. To ensure a minimal genetic distance, a large mapping population is screened to detect recombinants between the markers and the trait locus. When the genetic distance is less than 1 cM the markers can be useful for BAC landing. If the physical distance is too large, in spite of a small genetic distance, an additional round of marker saturation has to be performed until flanking markers for BAC landing have been found.

Current limitations in marker assisted breeding and map based cloning

At the start of the thesis research, a number of technical issues were perceived that determine the efficiency of marker assisted breeding (MAB) and map based gene cloning. The first technical issue is related to the complexity of the multi locus fingerprinting technique AFLP. In many applications such as MAB and gene cloning, simple single locus


marker techniques are preferred over the more laborious AFLP technique. Because AFLP is very suitable for locating markers linked to a trait of interest, it would be very convenient if AFLP markers could be converted into simple PCR markers.

The second issue is related to the level of marker saturation. A region with less than one marker per centimorgan in a recombination hotspot can be ‘more saturated’, i.e. have markers physically closer to a target locus, than a region with 20 markers per centimorgan in a recombination coldspot. At this moment the only method to evaluate the level of marker saturation is to combine a genetic map with a physical map. In standard map based cloning projects, large numbers of primer combinations are used to detect markers that are linked closely enough to the gene to start BAC landing. This procedure of marker saturation, which is performed for each trait separately, is laborious and time consuming. With the same effort it might be possible to construct a genetic map with genome wide marker saturation that is suitable for map based cloning and could lead to the cloning of several genes.

The third issue is related to marker development targeted at coding regions of the genome. Targeting can be understood in two ways. It may imply a technique that is specifically oriented towards coding regions of the genome, but behaves in a random fashion within coding sequences. Alternatively, targeting may imply a technique that is oriented towards specific motifs (or boxes) inside genes, or motifs shared by a specific gene family. With anchor related fingerprinting techniques, several template/fingerprint combinations can be used to generate markers; these are shown in the following table.

DNA fingerprinting methods, by primer type and template

Primers | Template from genomic DNA | Template from transcriptome (cDNA)
General primers | Normal AFLP (Vos et al., 1995), with commonly used primers with random selective nucleotides | cDNA-AFLP (Bachem et al., 1996), with commonly used primers and fewer selective nucleotides
Specific primers | NBS profiling (van der Linden et al. 2004) | NBS transcriptome profiling (new)

Until now, only normal AFLP with genomic DNA as template has been used for the generation of genetic markers; with this combination, however, no specific gene targeting can be performed. The other fingerprinting techniques, which do target coding regions or certain gene specific motifs, have only been used to generate gene derived fragments from different lines or from different developmental stages of the same plant. If these techniques can generate polymorphic markers in a mapping population, the markers can be used as genetic markers for mapping and can potentially be linked to a gene of interest. Subsequently these markers can be used to perform gene-landing.


Outline of this thesis

This thesis describes research aimed at alleviating the perceived limitations of the standard protocol, which encompasses mapping a trait, followed by marker saturation, genetic resolution, and finally BAC landing and walking to span the physical distance between the markers. In Chapter 2 proof of concept is presented that the level of marker saturation offered by the current version of the ultra dense map of potato is sufficient for map based cloning efforts. This is demonstrated by the assembly of a contig enclosing the Sen1-4 gene involved in wart disease resistance.

Current linkage maps generated with AFLP show the unpleasant feature of centromeric clustering of genomic markers due to centromeric repression of recombination. Methods to target markers to the euchromatic parts of the genome and to avoid the centromeric heterochromatin depend on e.g. methylation. AFLP markers based on the methylation sensitive enzyme PstI show a more uniform coverage of the linkage map (Van Os et al., manuscript in preparation). Another method to target the gene rich parts of the genome is the use of cDNA as template for marker development. In the past RFLP probes were derived from PstI digested libraries or cDNA clones. In the era of PCR based marker techniques it is evident that cDNA-AFLP might be a useful method, not only for transcript profiling but also for marker development. Furthermore, this may help to circumvent the time consuming BAC walking when gene landing instead of BAC landing is an option. To enable gene landing, a marker has to be derived from the gene itself. When cDNA-AFLP patterns are generated from a series of offspring genotypes of a mapping population, the polymorphisms in these cDNA-AFLP fingerprints should serve as genetic markers for specific chromosomal positions.
In Chapter 3 markers are generated in a mapping population using the cDNA-AFLP technique to evaluate and test the applicability of these markers. AFLP fingerprinting, both with genomic DNA as well as cDNA as the source material for template production, is a random technique. It does not allow specific targeting of gene rich regions in general or certain DNA sequences in particular. However, at this moment novel combinations of selective AFLP primers and specific primers allow us to generate fingerprints with amplification products that represent sequence tags of specific classes of genes. One of these AFLP-derived gene-family-specific fingerprinting techniques is called NBS profiling (van der Linden et al. 2004). This enables the specific amplification of parts of resistance genes or resistance gene analogs. In Chapter 4 NBS profiling is performed on templates prepared from DNA isolated from a mapping population to evaluate the potential to genetically map resistance gene clusters. In addition, cDNA can be used to evaluate the expression of R-genes between tissues in combination with NBS-profiling.


AFLP and anchor related marker techniques are rather laborious and expensive, because several enzymes and amplification steps are needed before a product is ready for separation. Separation has to be performed on a polyacrylamide gel based system to obtain the resolution needed to resolve the fragments accurately. This time consuming and expensive protocol makes anchor related techniques unsuitable for recombinant screenings or marker assisted selection (MAS). Moreover, the required equipment is often unavailable to breeding companies. Therefore, a protocol for the efficient conversion of AFLP markers into simple PCR markers was developed (Chapter 5). Finally, in Chapter 6 the potential of the different methods to improve the efficiency of gene mapping and gene cloning is discussed.


References

Bachem C, van der Hoeven R, de Bruijn M, Vreugdenhil D, Zabeau M, Visser R (1996) Visualization of differential gene expression using a novel method of RNA fingerprinting based on AFLP: Analysis of gene expression during potato tuber development. Plant Journal 9:745-753
Bateson W (1905) In a letter to A. Sedgwick, from W. Bateson (1928) Essays and addresses, edited by B. Bateson, Cambridge University Press, Cambridge
Botstein D, White RL, Skolnick M, Davis RW (1980) Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am. J. Hum. Genet. 32:314-331
Büschges R, Hollricher K, Panstruga R, Simons G, Wolters M, Frijters A, van Daelen R, van der Lee T, Diergaarde P, Groenendijk J, Töpsch S, Vos P, Salamini F, Schulze-Lefert P (1997) The barley Mlo gene: a novel control element of plant pathogen resistance. Cell 88:695-705
Cai DG, Kleine M, Kifle S, Harloff HJ, Sandal NN, Marcker KA, Klein-Lankhorst R, Salentijn EMJ, Lange W, Stiekema WL, Wyss U, Grundler FMW, Jung C (1997) Positional cloning of a gene for nematode resistance in sugar beet. Science 275:832-834
Creighton H, McClintock B (1931) A correlation of cytological and genetical crossing-over in Zea mays. PNAS 17:492-497
Dixon MS, Jones DA, Keddie JS, Thomas CM, Harrison K, Jones JDG (1996) The tomato Cf-2 disease resistance locus comprises two functional genes encoding leucine-rich repeat proteins. Cell 84:451-459
Grant MR, Godiard L, Straube E, Ashfield T, Lewald J, Sattler A, Innes RW, Dangl JL (1995) Structure of the Arabidopsis RPM1 gene enabling dual specificity disease resistance. Science 269:843-846
Grodzicker T, Williams J, Sharp P, Sambrook J (1974) Physical mapping of temperature-sensitive mutations of adenoviruses. Cold Spring Harbor Symp. Quant. Biol. 39:439-446
Hayes AJ, Saghai Maroof MA (2000) Targeted resistance gene mapping in soybean using modified AFLPs. Theor. Appl. Genet. 100:1279-1283
Jones C, Edwards K, Castaglione S, Winfield M, Sala F, van der Wiel C, Bettini P, Buiatti M, Maestri E, Malcevschi A, Marmiroli N, Aert R, Volckaert G, Rueda J, Linacero R, Vazquez A, Karp A (1997) Reproducibility testing of RAPD, AFLP and SSR markers in plants by a network of European laboratories. Mol. Breed. 3:381-390
Markert CL, Moller F (1959) Multiple forms of enzymes: Tissue, ontogenetic, and species-specific patterns. Proc. Nat. Acad. Sci. USA 45:753-763
Martin GB, Brommonschenkel SH, Chunwongse J, Frary A, Ganal MW, Spivey R, Wu T, Earle ED, Tanksley SD (1993) Map-based cloning of a protein kinase gene conferring disease resistance in tomato. Science 262:1432-1436
Mendel JG (1875) Versuche über Pflanzenhybriden. Verh. Naturf. Ver. Brünn 4: (abh.)
Mindrinos M, Katagiri F, Yu GL, Ausubel FM (1994) The Arabidopsis thaliana disease resistance gene RPS2 encodes a protein containing a nucleotide-binding site and leucine-rich repeats. Cell 78:1089-1099
Mullis KB, Faloona FA (1987) Specific synthesis of DNA in vitro via a polymerase-catalyzed chain reaction. Methods Enzymol. 155:335-350
Shizuya H, Birren B, Kim UJ et al. (1992) Cloning and stable maintenance of 300-kilobase-pair fragments of human DNA in Escherichia coli using an F-factor-based vector. Proc. Nat. Acad. Sci. USA 89:8794-8797
Smith HO, Wilcox KW (1970) A restriction enzyme from Hemophilus influenzae. 1. Purification and general properties. J. Mol. Biol. 51:379-392
Sturtevant AH (1913) The linear arrangement of six sex-linked factors in Drosophila, as shown by their mode of association. J. Exp. Zool. 14:43-59
Tanksley SD, Ganal MW, Martin GB (1995) Chromosome landing: a paradigm for map-based gene cloning in plants with large genomes. Trends Genet. 11:63-68
Tautz D (1989) Hypervariability of simple sequences as a general source for polymorphic DNA markers. Nucleic Acids Res. 17:6463-6471
van der Linden G, Wouters D, Mihalka V, Kochieva E, Smulders M, Vosman B (2004) Efficient targeting of plant disease resistance loci using NBS profiling. Theor. Appl. Genet. 109:384-393
van der Wurff AWG, Chan YL, van Straalen NM, Schouten J (2000) TE-AFLP: combining rapidity and robustness in DNA fingerprinting. Nucleic Acids Res. 28:e105
Vos P, Hogers R, Bleeker M, Reijans M, van der Lee T, Hornes M, Frijters A, Pot J, Peleman J, Kuiper M, Zabeau M (1995) AFLP: a new technique for DNA fingerprinting. Nucleic Acids Res. 23:4407-4414
Waugh R, McLean K, Flavell AJ, Pearce SR, Kumar A, Thomas BBT, Powell W (1997) Genetic distribution of Bare-1-like retrotransposable elements in the barley genome revealed by sequence-specific amplification polymorphisms (SSAP). Mol. Gen. Genet. 253:687-694
Welsh J, McClelland M (1990) Fingerprinting genomes using PCR with arbitrary primers. Nucleic Acids Res. 18:7213-7218
Williams JGK, Kubelik AR, Livak KJ, Rafalski JA, Tingey SV (1990) DNA polymorphisms amplified by arbitrary primers are useful genetic markers. Nucleic Acids Res. 18:6531-6535
Witsenboer H, Vogel J, Michelmore RW (1997) Identification, genetic localisation and allelic diversity of amplified microsatellite polymorphic loci in lettuce and wild relatives (Lactuca spp.). Genome 40:923-936


Chapter 2
Exploitation of a marker dense linkage map of potato for positional cloning of a wart disease resistance gene

Bart Brugmans, Ronald G.B. Hutten, A. Nico O. Rookmaker, Richard G.F. Visser, Herman J. van Eck

Accepted by Theoretical and Applied Genetics


Abstract

A marker saturated linkage map of potato was used to genetically map a locus involved in resistance against wart disease (Synchytrium endobioticum race 1). The locus mapped to the long arm of chromosome 4 and is named Sen1-4 to distinguish it from the Sen1 locus on chromosome 11. The AFLP markers from the Sen1-4 interval enabled the isolation of BAC clones from an 11 genome equivalent BAC library. This was achieved by fingerprinting BAC pools with the AFLP primer pairs that correspond to the genetic marker loci. Fingerprints of individual BAC clones were generated with non-selective AFLP primers to analyse the overlap between BAC clones using FPC. This resulted in a complete contig and a minimal tiling path of 14 BAC clones enclosing the Sen1-4 locus. The BAC contig has a genetic length of ~6 cM and a physical length of ~1 Mb. Our results demonstrate that map based cloning of Sen1-4 can be pursued on the basis of a strategy of marker saturation alone. Genetic resolution achieved by screening large numbers of offspring for recombination events may not be required. With the construction of the BAC contig, a physical map with the positions of the markers is obtained in the same step. This provides proof of concept for the utility of the marker saturation offered by the ultra dense AFLP map of potato for gene cloning.

Introduction

Map-based cloning has proven to be a successful method for the isolation of e.g. resistance genes in several plant species such as tomato (Martin et al., 1993; Dixon et al., 1996), Arabidopsis (Bent et al., 1994; Mindrinos et al., 1994; Grant et al., 1995), sugar beet (Cai et al., 1997) and barley (Büschges et al., 1997). Most map-based cloning projects start with a Bulked Segregant Analysis (BSA) (Michelmore et al., 1991) to find bulk specific markers. Subsequently, these markers are mapped in a segregating population to determine their order and distance relative to the trait of interest. Using a graphical display of the marker scores in the offspring genotypes, new pools can be composed of plants with recombination events in the proximity of the target locus for a second cycle of BSA. This second BSA narrows the window to be saturated, and thus improves the chance of identifying markers that are physically sufficiently close to the gene of interest to allow chromosome landing (Tanksley et al., 1995), i.e. to find the BAC clone containing the gene of interest. A level of marker saturation equivalent to a physical spacing of markers that is smaller than the average insert size of the genomic library to be used (e.g. a BAC or PAC library) offers an alternative to the screening of


offspring for recombinants. In fact, high resolution mapping only yields genetic distances (cM) and not physical distances (kb). Moreover, recombination does not appear to be a random event: chromosomal intervals show an alternation of hotspots and coldspots for recombination. Therefore, the assumption of a random distribution of AFLP markers (in fact the distribution of EcoRI sites) is safer than the assumption of a random distribution of recombination events.

The targeted saturation with markers for BAC landing and/or contig construction is commonly performed for each map based cloning project separately. This is laborious and time consuming in view of the required efforts and resources. The combination of BSA and AFLP fingerprinting (Vos et al., 1995) is an efficient approach to scan thousands of loci. Nevertheless, a BSA usually requires more than 1000 AFLP primer combinations (Büschges et al., 1997; Harkins et al., 1998). Subsequently, a sufficiently large number of bulk specific, putatively linked markers has to be tested in a mapping population. An alternative to this approach could be a once and for all, genome-wide saturation with marker loci that are ready to be used for map-based cloning. If the locus specific AFLP technique (Rouppe van der Voort et al., 1997) is used to generate marker loci, the same selective AFLP primers can be employed to recognise the AFLP amplification products within fingerprints of (pools of) BAC clones, without the need to store probes or locus specific PCR primer information. To construct a genome-wide saturated marker map of the potato genome, and to avoid many separate locus specific marker saturation efforts with BSA, an EU-project was started (FAIR5-PL97-3565; Isidore et al., 2003; http://potatodbase.dpw.wau.nl/UHDdata.html), aiming to reach the required level of marker saturation. It was assumed that a single linkage map with >10,000 AFLP marker loci should offer such a level of saturation of the 850 Mb potato genome that BAC landing becomes feasible with markers within a BAC length distance of the target gene. For this purpose, a diploid mapping population of potato was established (Rouppe van der Voort et al., 1997) in which several interesting traits and genes segregate, including a locus involved in resistance against Synchytrium endobioticum. This population with >10,000 marker loci should be an ideal starting point for map-based cloning of this resistance gene.

The fungus Synchytrium endobioticum, the causal agent of potato wart disease, is an obligate soil-borne pathogen that produces persistent resting spores, which can survive for many years (Hampson 1996). Salaman and Lesley (1923) already found that immunity against S. endobioticum was dominant and suggested that two dominant genes could induce immunity independently. The research of Lunden and Jørstad (1934) and Maris (1973) supports the model of dominant genes that are inherited in a Mendelian manner, but these authors also reported influences of minor genes that modify the activity of the resistance


genes. Hehl et al. (1999) reported the genetic mapping of a single dominant resistance gene against S. endobioticum race 1 (Sen1) on chromosome XI, using DNA sequence homology between the Sen1 locus and the N gene for resistance to TMV. In this paper we report the mapping of Sen1-4, a gene on chromosome IV that confers resistance to race 1 of S. endobioticum. The position of Sen1-4 on the ultra dense map of potato indicates a number of co-segregating AFLP markers, which have been used to identify the corresponding BACs from a BAC library. Based on the resulting contig containing the Sen1-4 locus, the general usefulness of the ultra dense potato map for map-based cloning is discussed.

Materials and methods

Plant material

The F1 mapping population descended from a cross between the diploid parents SH83-92-488 and RH89-039-16 (Rouppe van der Voort et al., 1997; Isidore et al., 2003). The 120 genotypes of this mapping population are maintained in a screen house by annual propagation of seed tubers. This population segregated for resistance to S. endobioticum race 1. The maternal clone SH83-92-488 is the donor of the resistance; the paternal clone is susceptible.

Wart disease test

Genotypes from this mapping population were evaluated for resistance to S. endobioticum using the Spieckermann test (Spieckermann and Kotthoff, 1924). This test was performed within the quarantine facilities of the potato breeding company Averis, Valthermond. Briefly, the Spieckermann bioassay involves inoculation of excised eye pieces from tubers with winter-spores of S. endobioticum, followed by an incubation period of several weeks, after which wart disease symptoms are scored. The test was performed twice using eight eye pieces per genotype, resulting in 16 tested eye pieces. A potato clone was scored as susceptible if at least one sample showed clear disease symptoms.

Genetic mapping of the resistance gene against S. endobioticum race 1

Diploid potato is an obligate out-breeder, and therefore genetic maps are constructed using marker loci that segregate in the F1 progeny. The marker loci used in this study were obtained in the framework of an EU-project aiming at the construction of an ultra dense map of potato (Isidore et al., 2003; http://potatodbase.dpw.wau.nl/UHDdata.html). The maternal map consists of 4187 segregating marker loci (Aa×aa) that are heterozygous in the pistillate parent SH83-92-488. The paternal map has 3431 marker loci (aa×Aa) segregating from the staminate


parent RH89-039-16, and 2765 bridge markers (Aa×Aa) allow the maternal and paternal maps to be connected. Co-segregating marker loci are grouped in so-called bins, where each bin differs from its neighbouring bin(s) by one recombination event in the mapping population (Isidore et al., 2003). Within a bin there is no information about the order of the markers, but the linkage phase is known. The 4187 markers of the maternal linkage map of SH83-92-488 reside in 989 bins divided over 12 linkage groups. The segregation for resistance/susceptibility to wart disease was added as a marker locus to the maternal genetic map data, after which the location of Sen1-4 relative to the 989 bins was determined using JoinMap 2.0 software (Stam, 1993).

BAC library and plasmid isolation

A HindIII BAC library of the resistant parent (SH83-92-488) was available (Rouppe van der Voort et al., 1999). This ~11 genome equivalent BAC library comprises almost 98,000 BAC clones with an average insert size of 100 kb. The BAC library was stored in 255 plates containing 384 BAC clones per plate. For the identification of the BACs enclosing the Sen1-4 locus, DNA was isolated from plate pools, from row/column pools of single plates and from single BAC clones (see pooling strategy below). To prepare plate pool DNA, the BAC clones of single BAC library plates were stamped on a plate with solid LB medium, using a 384-pin replicator. The bacterial colonies were grown overnight and washed off using liquid LB medium. Plasmid DNA was isolated from these pooled cells using the alkaline lysis protocol (Heilig et al., 1997). The plasmid DNA isolation of row/column pools and single BAC clones was performed according to the protocol described by Klein et al. (1998). The row/column pools were constructed before plasmid isolation, but after the clones had been grown in liquid LB medium (Sambrook et al., 1989).
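The library and map figures quoted so far can be checked with a short calculation. The sketch below is illustrative only and is not part of the original methods. It takes the plate count, clones per plate, average insert size, genome size and marker number from the text, and assumes that clones and markers are spread uniformly over the genome. The probability that a given locus is present in at least one clone is the standard library coverage estimate, which the chapter itself does not use. The function names are hypothetical.

```python
def bac_library_stats(n_plates, clones_per_plate, insert_kb, genome_mb):
    """Return clone count, genome equivalents and locus coverage probability."""
    n_clones = n_plates * clones_per_plate
    genome_kb = genome_mb * 1000
    genome_equivalents = n_clones * insert_kb / genome_kb
    # Probability that one given locus lies in at least one clone,
    # assuming clones are sampled uniformly at random from the genome.
    p_locus_covered = 1 - (1 - insert_kb / genome_kb) ** n_clones
    return n_clones, genome_equivalents, p_locus_covered


def mean_marker_spacing_kb(n_markers, genome_mb):
    """Average physical distance between markers, assuming a uniform spread."""
    return genome_mb * 1000 / n_markers


# Figures from the text: 255 plates of 384 clones, 100 kb inserts, 850 Mb genome.
n_clones, equivalents, p_covered = bac_library_stats(255, 384, 100, 850)
# Lower bound of the ">10,000 marker loci" quoted for the ultra dense map.
spacing = mean_marker_spacing_kb(10_000, 850)

print(f"clones: {n_clones}, genome equivalents: {equivalents:.2f}")
print(f"probability that a locus is covered: {p_covered:.5f}")
print(f"mean marker spacing: {spacing:.0f} kb (average insert: 100 kb)")
```

Under these assumptions the mean marker spacing comes out below the average insert size, which is the saturation criterion stated in the Introduction. Because recombination and marker density are not uniform in practice, this is an average and not a guarantee for any particular interval.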
Pooling strategy and library

Our BAC-pooling strategy is composed of two steps: (1) the 255 library plate pools were combined into 64 super pools to identify the library plates containing a BAC of interest; (2) from each positive library plate, the 16 rows and 24 columns were pooled in pairs, resulting in eight row pools and 12 column pools, to detect the single BAC of interest (Table 1). DNA of library plate pools was used to generate EcoRI/MseI AFLP template (Vos et al., 1995) using 100 ng plasmid DNA. The super pools were assembled with aliquots of +1/+1 selectively amplified AFLP template of the plate pools. The library plates were arrayed in rectangles of eight rows by eight columns. This was repeated four times to accommodate the 255 library plates. Every row and column was pooled into a super pool, representing (384*8*100kb/840Mb) 0.4 potato genome equivalents of DNA. After diluting the template of the super pools 20 times, the mixture was used for selective amplification using the AFLP primer combinations of the marker loci of the bins enclosing Sen1-4. The

putatively positive plate pools from this analysis were confirmed using the AFLP template from the single plate pools. The confirmed positive library plates were used to inoculate 384-well plates with LB medium (Sambrook et al., 1989) to grow single colonies overnight. Approximately 50 ng plasmid DNA of these plate-specific row and column pools was used for AFLP template preparation. Subsequently, pre-amplification and selective amplification were performed as described for the super pools. This screening resulted in four putatively positive BAC clones at a specific row-column intersection. To find the correct BAC clone, each of the four BACs was tested separately with the AFLP markers. All selective AFLP amplifications were performed using IR-dye labelled EcoRI primers (Biolegio BV, Malden, The Netherlands), after which the products were visualised on a denaturing polyacrylamide gel using a NEN® IR2 DNA analyser (LI-COR® Biosciences, Lincoln, NE).

Table 1: Overview of the different levels of pooling of the BAC library for BAC landing with AFLP fingerprinting

Pooling level    Number of BAC clones (per pool)    Number of pools    Number of library plates per pool    DNA amount (in genome equivalents)
BAC library      97920                              -                  255                                  11
Super pools      3072                               64                 8                                    0.4
Plate pools      384                                255                1                                    0.05
Row pools        48                                 8 per plate        1/8                                  0.006
Column pools     32                                 12 per plate       1/12                                 0.004
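The two-dimensional super pool design and its deconvolution can be sketched as follows. This is a minimal illustration, not the procedure as it was run in the laboratory: the assignment of plate numbers to positions in the four 8 x 8 rectangles (filled row by row) and all function names are assumptions made for the example.

```python
N_PLATES = 255
BLOCK = 8  # plates are arrayed in rectangles of 8 rows by 8 columns


def plate_to_super_pools(plate):
    """Map a 1-based plate number to its (row pool, column pool) identifiers.

    Assumed layout: plates fill rectangle 0 row by row, then rectangle 1, etc.
    """
    block, within = divmod(plate - 1, BLOCK * BLOCK)
    row, col = divmod(within, BLOCK)
    return ("R", block, row), ("C", block, col)


def all_super_pools():
    """List every row and column super pool over all rectangles."""
    n_blocks = -(-N_PLATES // (BLOCK * BLOCK))  # ceiling division
    return [(kind, b, i) for b in range(n_blocks) for kind in "RC" for i in range(BLOCK)]


def deconvolute(positive_pools):
    """Return the plates lying at intersections of positive row and column pools."""
    positive = set(positive_pools)
    candidates = []
    for plate in range(1, N_PLATES + 1):
        row_pool, col_pool = plate_to_super_pools(plate)
        if row_pool in positive and col_pool in positive:
            candidates.append(plate)
    return candidates


# Hypothetical example: two truly positive plates within the same rectangle.
true_plates = [3, 12]
signals = {pool for plate in true_plates for pool in plate_to_super_pools(plate)}
print(len(all_super_pools()))  # number of super pools in the design
print(deconvolute(signals))    # candidate plates, including false positives
```

In this hypothetical case the design yields 64 super pools, and the two positive plates produce four candidate plates, two of which are false positives. This is the reason why putatively positive plates are confirmed on the single plate pools before the row and column pools of a plate are screened.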

Contig construction

Fingerprint patterns of individual BAC clones were made by non-selective amplification of AFLP template. This template was generated using the enzyme combination HindIII/TaqI, because HindIII was also used for partial digestion of potato genomic DNA for BAC-library construction of clone SH83-92-488. When HindIII is used in combination with a frequent cutter enzyme to construct AFLP template, the outermost restriction fragments of the inserted potato DNA are included and restriction fragments combining potato and vector DNA sequence are excluded. Moreover, these most distal HindIII/TaqI amplicons enable the detection of the smallest possible overlap between the single BACs. HindIII/TaqI template from single BACs was prepared using the standard AFLP protocol (Vos et al., 1995), adjusted for the restriction temperature of TaqI. Fragments were generated using the HindIII and TaqI primers without selective nucleotides, after which the generated fragments were separated on a NEN® IR2 DNA

analyser using standard conditions. The generated fragments were scored with the image interpretation software CrossChecker (Buntjer, 2000; http://www.dpw.wau.nl/pv/pub/CrossCheck/download.html). The length of the fragments was estimated using a 10-base ladder, which contains a fragment every 10 base pairs. The data set of AFLP band mobilities was multiplied by ten to produce a functional dataset for FPC (Soderlund et al., 2000), which was used for contig construction of the selected BAC clones. A tolerance of 3 and a cutoff value of 1e-04 were used.

Results

Genetic localisation of Sen1-4

Tubers from 80 out of 135 individuals of the F1 mapping population were tested for resistance to S. endobioticum race 1 using the Spieckermann test. The female parent SH83-92-488 was resistant to S. endobioticum, whereas the male parent RH89-039-16 was susceptible. The offspring segregated for resistance in a 35:45 (S:R) ratio, not significantly different from a 1:1 ratio (p=0.26). This suggests a single locus involved in resistance, heterozygous in the diploid female parent SH83-92-488. The phenotypic segregation for resistance was used to map the locus relative to the 989 bins of the ultra dense map of parent SH83-92-488. This resulted in the identification of a region on chromosome IV ranging from bin 37 to bin 41 (see http://potatodbase.dpw.wau.nl/locishow.php?lowbin=SH04B037&highbin=SH04B042 ). All genetic positions within this interval are equally likely to accommodate the R-gene, because the specific plants that carry a recombination in this interval were absent from the disease test (the recombination in plant #13 defines the border between bin 37 and bin 38, the recombination in plant #159 defines the border between bin 38 and bin 39, and plants #164 and #55 define the border between bin 39 and bin 41). In six plants (#57, #27, #34, #48, #88 and #95) the flanking markers did not support the observed resistant phenotype, suggesting escapes.
In offspring clone 67 the flanking markers indicated a false susceptible observation. These conflicting observations do not affect the accuracy of mapping, because the plants with false positive or false negative observations did not originate from a female gamete with a recombined chromosome 4. On the basis of these observations we postulate an R-gene, tentatively called Sen1-4, involved in a monogenic qualitative resistance against S. endobioticum race 1. The name of this locus is an abbreviation of the pathogen, the race, and its chromosomal position. Half tetrad analysis (data not shown) indicates that the region is located on the long arm of chromosome 4, approx. 5 cM below the centromere (bin 31). After removal of the Aa×Aa type markers, as well as the PstI/MseI or SacI/MseI markers, thirteen mapped segregating EcoRI/MseI AFLP-markers remained, which were used for BAC landing (Table 2).
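The test of the observed 35:45 (S:R) segregation against a 1:1 ratio can be reproduced with a chi-square goodness-of-fit calculation. The sketch below uses only the standard library; the function name `chi_square_1to1` is introduced for this example, and it is an assumption that the reported p-value was obtained with an uncorrected chi-square test with one degree of freedom.

```python
import math


def chi_square_1to1(n_a, n_b):
    """Chi-square goodness-of-fit test of two classes against a 1:1 ratio (1 d.f.)."""
    expected = (n_a + n_b) / 2
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    # For one degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p = chi_square_1to1(35, 45)
print(round(chi2, 2), round(p, 2))
```

For 35 susceptible and 45 resistant offspring this gives a chi-square value of 1.25 and a p-value of about 0.26, matching the value reported above.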

To ensure the construction of the BAC contig beyond the interval, another three markers from the proximal bin 35 (EACA/MCTC-127, EAAG/MCGA-373, EAGC/MATG-134) and the only marker from the distal bin 42 (EAGT/MACC-252) were used as well (Table 2).

Table 2: The names and genetic positions of the AFLP markers at the Sen1-4 locus used for BAC landing, and the number of BAC clones identified with each marker. The marker indicated with * is illustrated in Fig 1.

Relative to Sen1-4

AFLP-marker

proximal

EACA/MCTC-127

Bin number 35

4

EAGC/MATG-134

5 37

EACA/MAGC-443 EACT/MCCA-122 EAGC/MATG-1000

38

4 5

39

EAGT/MAGG-210* EAAC/MCAT-307

3 4

EAAC/MCTA-278

distal

5

EAAG/MCGA-373 EAAC/MCTA-158 enclosing

positive BACclones

6 4

41

5

EAGG/MCTT-403

5

EAGG/MCAT-70

5

EAGT/MACC-252

42

7

BAC identification

The SH83-92-488 BAC library was screened for BACs that carried AFLP markers from the interval comprising Sen1-4. To this end, 64 super pools were fingerprinted with the thirteen AFLP primer combinations that were used to amplify the marker loci listed in Table 2. The 0.4 genome equivalent templates showed an average of 40 fragments per lane upon selective PCR amplification with EcoRI/MseI AFLP primers extended with three selective nucleotides (+3/+3) (Fig. 1). Deconvolution of super pool fingerprints for the markers of interest indicated a group of putative positive plates. To verify these positive plates, the presence of the AFLP markers was confirmed by fingerprinting the single plate pools with the appropriate AFLP primer combinations. Eventually we obtained between three and seven positive plates per AFLP marker (Table 2), with an average of 4.7 plates per marker. This is in close agreement with the expected number of five and a half positives when using an 11 genome equivalent BAC library in combination with heterozygous AFLP-markers. The specific plate pools that were retrieved with AFLP

markers are already indicative of the physical distance between certain marker loci. Markers EAAC/MCAT-307 and EAGT/MACC-252 both landed on a BAC in pool 115, suggesting a physical distance of less than 100 kb (the average BAC insert size).
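The expected number of five and a half positives follows from the library coverage and the heterozygous state of the markers. The sketch below makes this explicit and, as an additional assumption that is not part of the original analysis, models the number of positive BACs per marker as a Poisson variable to show how probable a complete miss would be. All function names are introduced for this example.

```python
import math


def expected_hits(library_coverage, heterozygous=True):
    """Expected number of BACs carrying a marker allele.

    A heterozygous marker is present on only one of the two homologues,
    so only half of the library coverage can carry it.
    """
    return library_coverage / 2 if heterozygous else library_coverage


def poisson_pmf(k, mean):
    """Probability of exactly k hits under a Poisson model (an assumption here)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


mean_hits = expected_hits(11)                  # 11 genome equivalent library
p_miss = poisson_pmf(0, mean_hits)             # chance that no BAC carries the marker
p_three_to_seven = sum(poisson_pmf(k, mean_hits) for k in range(3, 8))
print(mean_hits, p_miss, p_three_to_seven)
```

Under this simple model the chance of not recovering any BAC for a heterozygous marker is well below one percent, and the observed range of three to seven positive plates per marker covers the bulk of the expected distribution.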

Figure 1: The AFLP fingerprints of eight super pools (lanes 1-8) generated with AFLP primer combination E+AGT and M+AGG. Next to the samples a ten base size ladder (M) is shown, and the parental clone (P) from which the BAC library was constructed. The AFLP marker EAGT/MAGG-210 is indicated with an arrow.

Row and column pools were constructed from the library plates by pooling single-well cell cultures. Plasmid DNA was isolated from these pools to prepare AFLP template and to test the row and column pools for the presence of the AFLP marker. This procedure identified a set of putatively positive single BACs, which were then tested separately to determine the BACs containing the AFLP marker.

Contig construction

For the construction of a contig enclosing the Sen1-4 interval, fingerprints of individual BAC clones were produced by non-selective amplification with HindIII and TaqI AFLP primers (Fig 2). Fingerprints from non-overlapping BAC clones can serve as a control experiment to gain an impression of the chance of coincidental co-migration of non-homologous PCR fragments. This was observed in a few cases among

very short fragments, which is consistent with the relative abundance of small sized fragments (below ≈150 bases).

Figure 2: The AFLP fingerprints of eight BAC clones (lanes 1-8) generated with non selective HindIII and TaqI primers. Next to the samples a ten base size ladder (M) is shown.

To ensure homology between co-migrating bands, only the amplified fragments larger than 200 nucleotides were used during contig construction. The AFLP band mobilities were multiplied by ten to produce a functional dataset for FPC (Soderlund et al., 2000), which was used for contig construction of the selected BAC clones. This resulted in two contigs (Fig 3), of which the first enclosed the region starting with BAC 092E05 (positive for AFLP marker EACA/MCTC-127, which genetically mapped in bin 35) through BAC 247A20 (positive for AFLP marker EAGG/MCTT-403, mapped in bin 41). The second contig starts with BAC 202N21 (positive for AFLP marker EAAC/MCAT-307, mapped in bin 41) and ends with BAC 115N09 (positive for marker EAGT/MACC-252, mapped in bin 42). This last contig appeared to be very short and comprises five BAC clones. All five BACs were positive for the AFLP marker EAAC/MCAT-307 (clones 115N09, 122K18, 138N09, 202N21, 246B04), and among these, one BAC clone (115N09) was also positive for EAGT/MACC-252. This shows that the second contig includes the genetic position of the distal recombination event of the Sen1-4 interval, between bin 41 and bin 42.
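The preparation of the band data described above (discarding fragments of 200 nucleotides or shorter, multiplying the mobilities by ten) and the idea of counting shared bands within a tolerance can be sketched as follows. This is an illustration only and does not reproduce the scoring used by FPC, which applies its own probability cutoff. The fragment sizes are hypothetical, and it is an assumption that the tolerance of 3 applies to the rescaled values.

```python
def prepare_bands(mobilities, min_size=200, scale=10):
    """Keep fragments larger than min_size nucleotides and rescale them by `scale`."""
    return sorted(m * scale for m in mobilities if m > min_size)


def shared_bands(bands_a, bands_b, tolerance=3):
    """Count bands of A that match a not yet used band of B within the tolerance."""
    unused = list(bands_b)
    matches = 0
    for band in bands_a:
        for other in unused:
            if abs(band - other) <= tolerance:
                unused.remove(other)  # each band of B can be matched only once
                matches += 1
                break
    return matches


# Hypothetical fragment sizes (in nucleotides) of two overlapping BAC clones.
bac_1 = prepare_bands([150.0, 208.1, 250.4, 310.0])
bac_2 = prepare_bands([120.0, 208.3, 275.0, 310.2])
print(len(bac_1), len(bac_2), shared_bands(bac_1, bac_2))
```

In this made-up example each clone retains three bands after filtering, two of which are shared. A single shared band, as found between the two contigs described below, is too little for an automated join with stringent parameters.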

Figure 3: The genetic and physical map of the Sen1-4 interval. In the top part (A) the marker saturated genetic map is shown (map units are indicated with bin numbers). The genetic positions of recombination events are indicated with an X. Two adjacent recombination events result in an empty bin (not shown). In the middle part (B) the genetic and physical order of the AFLP markers from this region is shown. In the bottom part (C) the BAC contig is shown. Vertical arrows below the marker names indicate the BAC clones that were positive for the AFLP marker. The order and overlap of the BAC clones were determined with FPC using BAC fingerprints generated with the HindIII/TaqI AFLP primer pair without selective nucleotides.

With FPC the order of the BACs and of the individual bands obtained by BAC fingerprinting could be determined. The gap between the ‘bin35/41-contig’ and the ‘bin41/42-contig’ was located within bin 41, between markers EAGG/MCTT-403 and EAAC/MCAT-307, each landing on five BAC clones. Closer examination of the fingerprinting patterns showed one fragment, ~208 nucleotides long, which was common to BAC clones 247A20 and 202N21. According to the ‘bands file’ of FPC this shared fragment was indeed located at the farthest end in both contigs. This small overlap is not significant enough for FPC to join both contigs with the parameters used. Nevertheless, the locus specificity of co-migrating AFLP fragments and the distal placement of the band in either contig offer sufficient confidence to conclude that the gap between the ‘bin35/41-contig’ and the ‘bin41/42-contig’ is bridged by BAC clones 247A20 and 202N21.

Discussion

Inheritance of wart disease resistance

In this paper we report the genetic mapping of a resistance gene against S. endobioticum race 1 on potato chromosome IV, and therefore propose the name Sen1-4. The observation of a monogenically inherited resistance is seemingly in contrast to earlier genetic models. Earlier studies on the inheritance of resistance to wart disease race 1 by classical genetics have resulted in genetic models comprising at least two genes (Salaman and Lesley 1923, Lunden and Jørstad 1934, Maris 1973). However, the approach taken by Hehl et al. (1999) as well as our approach implies the correlation of phenotypic

segregation in a bioassay relative to loci on a linkage map. Both mapping studies show that a single locus is involved in resistance, albeit on different chromosomes (chromosome XI, Hehl et al., 1999; chromosome IV, this paper). Linkage mapping as such can only display heterozygous loci segregating in the mapping population, whereas the earlier genetic models were based on multiple crosses. Intercrosses between parental clones from this study and the study by Hehl et al. (1999) may allow the classical genetic models and the loci currently mapped to be reconciled. A duplicate gene model (AaBb x AaBb = 15:1), or epistatic interactions such as complementary genes (AaBb x AaBb = 9:7) or recessive suppression (AaBb x AaBb = 13:3) can be inferred from intercrosses between parents such as SH83-92-488 x H80.577/1 (resistant x resistant), SH83-92-488 x H80.576/16 (resistant x susceptible), RH89-039-16 x H80.577/1 (susceptible x resistant) and RH89-039-16 x H80.576/16 (susceptible x susceptible), as well as intercrosses of offspring clones. Comparison of the pedigrees of both mapping populations (data not shown; Dr. C. Gebhardt, personal communication) did not show any parents in common. In addition, it should be noted that resistance against S. endobioticum race 1 is commonly observed, in approx. 60 % of the modern potato cultivars. Therefore the relation between the two loci currently mapped and the classical genetic models remains inconclusive.

Utility of the marker saturation offered by the ultra dense map of potato

For potato a marker dense linkage map was constructed consisting of a maternal map comprising 4187 segregating marker loci and a paternal map comprising 3413 marker loci. The co-segregating marker loci are grouped in so-called bins, where each bin differs from its neighbouring bin(s) by one recombination event in the mapping population (Isidore et al., 2003).
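The two-locus segregation ratios mentioned in the discussion of the classical genetic models above (15:1, 9:7 and 13:3 for AaBb x AaBb) can be derived by enumerating all gamete combinations. The sketch below is a generic illustration for two unlinked loci. Which phenotypic class corresponds to resistance in each model is not specified here, and the rule used for the 13:3 case (the minority class carries a dominant allele at locus A and is homozygous recessive at locus B) is an assumption.

```python
from collections import Counter
from itertools import product

# The four gamete types produced by an AaBb parent with unlinked loci.
GAMETES = [a + b for a, b in product("Aa", "Bb")]


def phenotype_ratio(has_trait):
    """Count AaBb x AaBb offspring classes for a given two-locus rule.

    `has_trait(dom_a, dom_b)` receives two booleans stating whether at least
    one dominant allele is present at locus A and at locus B.
    """
    counts = Counter()
    for g1, g2 in product(GAMETES, repeat=2):
        dom_a = "A" in (g1[0], g2[0])
        dom_b = "B" in (g1[1], g2[1])
        counts[has_trait(dom_a, dom_b)] += 1
    return counts[True], counts[False]


duplicate = phenotype_ratio(lambda a, b: a or b)               # duplicate genes
complementary = phenotype_ratio(lambda a, b: a and b)          # complementary genes
suppression = phenotype_ratio(lambda a, b: not (a and not b))  # assumed 13:3 rule
print(duplicate, complementary, suppression)
```

The enumeration returns the class counts (15, 1), (9, 7) and (13, 3) out of 16 gamete combinations, which are the ratios against which segregation in the proposed intercrosses could be tested.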
The marker density of the bins of the linkage map varies considerably, ranging from 0 up to 531 markers per bin, with the most marker dense bins occurring at the presumed centromeric regions of the chromosomes. The mapping of the resistance gene against S. endobioticum race 1 was performed by using this ultra dense genetic map of potato and adding the disease test results as a single marker locus. Sen1-4 was located between bin 35 and 42 of linkage group 4. All genetic positions within this interval are equally likely to accommodate the R-gene due to the absence of specific plants in the disease test. Although the average number of markers used was less than two per bin for this region, which is lower than the average of 4.2 markers per bin, it was possible to construct a contig. In fact the level of marker saturation is even more favourable, because neither the AFLP markers of the PstI/MseI and SacI/MseI primer combinations, nor the 3:1 segregating AFLP markers have been exploited. This contig spans a genetic distance of ~6 cM (7 recombination events in 120 offspring) and a physical distance of ~1 Mb, including the minimal overlap between BAC clones 247A20 and 202N21 located within bin 41. This confirms that the level of marker

saturation is adequate and equivalent to a marker spacing that is less than the average insert size of this BAC library. This allows BAC landing and contig construction at a low genetic resolution (≈ 100 offspring). The unknown order of the markers within a bin is not an obstacle for BAC landing. The BAC library may contain more BAC clones to reinforce the contig. This may apply not only to the 247A20 and 202N21 connection, but also to the other connections that do not have the expected ~5 fold deep coverage. Since the super pools of the BAC library have been surveyed only with AFLP markers, the best coverage is observed around the position where the marker lands on the BACs. The poorest coverage is observed between the marker positions, where the end of a cluster of BACs meets the end of the next cluster. The results also describe the co-retention of AFLP markers in super pools and plate pools. The joint presence or absence of markers in the 64 super pools allows physical mapping of the markers, because the level of co-retention is proportional to the physical distance of the markers (Borm et al., 2003, de Boer et al., 2004). The below average number of markers per bin (including an empty bin 40) already suggested the limited physical size and, in connection to this, an elevated recombination frequency in this region. This is in agreement with the findings of Ganal et al. (1989), Young and Tanksley (1989) and King et al. (2002) that there is no direct relation between the genetic linkage intensities and the physical distance between markers over the genome. At this moment a genome wide physical map is being constructed of the potato clone RH89-039-16 using markers generated with non-selective EcoRI/MseI AFLP fingerprinting. In combination with the AFLP markers mapped in the ultra dense genetic map, these contigs will be linked to the genetic map of potato.
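The figures quoted above for the contig (7 recombination events in 120 offspring, ~1 Mb, thirteen markers) can be combined in a short calculation. This is a rough sketch: it treats the recombination percentage as a distance in cM without a mapping function, assumes the thirteen markers of Table 2 are spread evenly over the contig, and uses names introduced only for this example.

```python
def recombination_percent(recombinants, offspring):
    """Recombination frequency in percent, used here as an approximate cM value."""
    return 100 * recombinants / offspring


genetic_cm = recombination_percent(7, 120)    # genetic size of the interval
physical_kb = 1000                            # ~1 Mb contig
kb_per_cm = physical_kb / genetic_cm          # physical/genetic ratio
marker_spacing_kb = physical_kb / 13          # thirteen markers used for BAC landing
print(round(genetic_cm, 1), round(kb_per_cm), round(marker_spacing_kb))
```

Under these assumptions the interval measures just under 6 cM, and the average marker spacing stays below the average BAC insert size of 100 kb, which is the condition for BAC landing stated above.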
When this physical map is finished, more can be said about the physical clustering and the genetic/physical ratio per region of EcoRI/MseI markers in the potato mapping population.

Efficient identification of BACs via BAC pools

Most BAC libraries include tens of thousands of BAC clones to provide sufficient genome coverage to ensure the presence of the desired fragment of the genome. To identify a single BAC clone of interest among all these BACs without extreme effort, several pooling strategies have been developed. Generally speaking, we recognise two systems: (1) the step-wise screening of library plates, followed by the rows and columns within positive plates; (2) the assembly of pools in three or more dimensions, where de-convolution of positive pool signals leads directly to the BAC clone of interest (Klein et al., 2000; Rogel-Gaillard et al., 2001; Whisson et al., 2001). The latter might be efficient for genome wide physical map construction (Klein et al., 2000), where all information gained can be used to construct contigs and a large number of markers is used to screen all the pools.

However, for local contig construction with only a few markers across a small interval, the assembly of multi-dimensional BAC pools will take too much effort relative to the number of markers with which the pools are going to be screened. In this study we have used a combination of the step-wise and the dimensional pooling strategy. At the level of library plates we constructed two-dimensional row-column super pools, because the low number of positive plates to be expected (5.5 per 255 plates) allows a reduction of the number of PCR reactions (255 to 64) without the severe trade-off of many false positives. If multiple super pools (rows and columns) are positive, the de-convolution will result in a few false positives, which can easily be verified at the plate pool level. Similarly, combining two rows and two columns results in fewer PCR reactions to identify the BAC clone within a plate. With four additional PCR reactions the single positive BAC clone can easily be distinguished from the three false positives among the four putatively positive clones suggested by the row and column pools. Although it is not easy to provide quantitative figures, we suggest that it is more efficient to use this step-by-step pooling design and verify false positives than to invest in a pooling strategy in which no false positives arise but a very large number of PCR reactions always has to be tested.

Towards cloning of Sen1-4

So far, we have a genetic map of ~6 cM and a physical contig of ~1 Mb comprising the Sen1-4 locus, without the resolution that could be gained from the analysis of recombinants from a large number of offspring. With this, we have demonstrated that the concept of BAC landing (Tanksley et al., 1995) does not require the genetic resolution of many recombinants, but only the marker saturation as provided by our dense linkage map (Isidore et al., 2003). For future research we have selected a minimal tiling path of 14 BAC clones.
These BAC clones could be used for direct transformation, if it were not possible to postulate candidate genes. However, for the cloning of a disease resistance gene, it is more obvious to screen the 14 BACs for R-gene like sequences via Southern analysis or with primers based on the highly conserved NBS-motif (e.g. Van der Linden et al., 2004). BACs that contain such resistance gene analogs (RGAs) can be subcloned and sequenced. Functional open reading frames can be used for complementation studies as well, if BAC transformation is deemed to be problematic. Complete DNA sequencing of the minimal tiling path is not inconceivable either. In conclusion, the added value of the genetic resolution gained by screening many offspring for recombinants is a priori doubtful. Moreover, it will require two growing seasons to generate sufficient numbers and sizes of potato tubers from those recombinants for the wart disease bio-assay. Only when large numbers of RGAs are detected across many BACs should the resolving power of genetic recombination be employed to narrow the interval and the number of candidate

RGAs. Irrespective of the procedure chosen, generating the material required for the bio-assay to confirm cloning of the correct Sen1-4 resistance gene will take at least several years. By using a marker saturated linkage map of potato it was possible to genetically map the resistance gene against Synchytrium endobioticum race 1 and to detect sufficient BAC clones originating from this region. By performing non-selective AFLP, the overlap between different BAC clones could be determined, resulting in a contig and a minimal tiling path enclosing the Sen1-4 locus. This variation of ‘BAC landing’ has also been termed ‘contig seeding’ or ‘contig initiation’ (Bryan et al., 2002).

Acknowledgements

We gratefully acknowledge the breeding company Averis Seeds BV, Valthermond, for the wart disease bio-assay on the parents and offspring clones of the mapping population. This research was supported by the Dutch Technology Foundation STW http://www.stw.nl/projecten/W/wpb5283.html (grant WPB.5283).

References

Bakker E, Achenbach U, Bakker J, Van Vliet J, Peleman J, Segers B, Van der Heijden S, Van der Linde P, Graveland R, Hutten R, Van Eck HJ, Coppoolse E, Van der Vossen E, Bakker J, Goverse A (2004) A high-resolution map of the H1 locus harbouring resistance to the potato cyst nematode Globodera rostochiensis. Theor Appl Genet 109(1):146-152.
Bent AF, Kunkel BN, Dahlbeck D, Brown KL, Schmidt R, Giraudat J, Leung J, Staskawicz BJ (1994) RPS2 of Arabidopsis thaliana: a leucine-rich repeat class of plant disease resistance genes. Science 265:1856-1860.
De Boer JM, Borm TJA, Brugmans B, Bakker ER, Bakker J, Visser RGF, Van Eck HJ (2004) Construction of a genetically anchored physical map of the potato genome. Plant & Animal Genomes XII Conference, San Diego, CA. http://www.intl-pag.org/pag/12/abstracts/W54_PAG12_245.html
Borm T, Brugmans B, De Boer J, Van der Vossen E, Bakker J, Visser R, Van Eck HJ (2003) Bac-pool mapping: a method for physical distance estimation. Plant and Animal Genome, January 11-15, 2003, Town & Country Convention Center, San Diego, CA. http://www.intlpag.org/11/abstracts/P2d_P130_XI.html
Bryan G, Milbourne D, Isidore E, McNicoll J, Tierney I, Purvis A, Williamson S, Ramsay L, McLean K, Waugh R (2002) Application of a potato UHD genetic linkage map for BAC landing and contig initiation in a region of linkage group V. S.C.R.I. Ann. Rep 2001/2002. http://bitrws400.scri.sari.ac.uk/Document/AnnReps/02Indiv/22UHD.pdf
Buntjer JB (2000) Cross Checker: Computer Assisted Scoring of Genetic AFLP Data. Abstract Plant & Animal Genome VIII Conference, San Diego, CA, January 9-12, 2000. http://www.intlpag.org/pag/8/abstracts/pag8664.html
Büschges R, Hollricher K, Panstruga R, Simons G, Wolters M, Frijters A, van Daelen R, van der Lee T, Diergaarde P, Groenendijk J, Töpsch S, Vos P, Salamini F, Schulze-Lefert P (1997) The barley Mlo gene: a novel control element of plant pathogen resistance. Cell 88:695-705.
Cai DG, Kleine M, Kifle S, Harloff HJ, Sandal NN, Marcker KA, Klein-Lankhorst R, Salentijn EMJ, Lange W, Stiekema WL, Wyss U, Grundler FMW, Jung C (1997) Positional cloning of a gene for nematode resistance in sugar beet. Science 275:832-834.
Dixon MS, Jones DA, Keddie JS, Thomas CM, Harrison K, Jones JDG (1996) The tomato Cf-2 disease resistance locus comprises two functional genes encoding leucine-rich repeat proteins. Cell 84:451-459.
Ganal MW, Young ND, Tanksley SD (1989) Pulsed-field gel electrophoresis and physical mapping of large DNA fragments in the Tm-2a region of chromosome 9 in tomato. Mol Gen Genet 215:395-400.
Grant MR, Godiard L, Straube E, Ashfield T, Lewald J, Sattler A, Innes RW, Dangl JL (1995) Structure of the Arabidopsis RPM1 gene enabling dual specificity disease resistance. Science 269:843-846.
Hampson MC (1996) A quantitative assessment of wind dispersal of resting spores of Synchytrium endobioticum, the causal agent of wart disease of potato. Plant Dis 80:779-782.
Harkins DM, Johnson GN, Skaggs PA, Mix AD, Dupper GE, Devey ME, Kinloch BB, Neale DB (1998) Saturation mapping of a major gene for resistance to white pine blister rust in sugar pine. Theor Appl Genet 97:1355-1360.
Hehl R, Faurie E, Hesselbach J, Salamini F, Whitham S, Baker B, Gebhardt C (1999) TMV resistance gene N homologues are linked to Synchytrium endobioticum resistance in potato. Theor Appl Genet 98:379-386.
Heilig JS, Lech K, Brent R (1997) Large-scale preparation of plasmid DNA. In: Ausubel FM, Brent R, Kingston RE, Moore DD, Seidman JG, Smith JA, Struhl K (eds) Current protocols in molecular biology. John Wiley and Sons Inc, USA, pp 1.7.1-1.7.3.

Huang S, Vleeshouwers VGAA, Werij JS, Hutten RCB, Van Eck HJ, Visser RGF, Jacobsen E (2004) The R3 Resistance to Phytophthora infestans in potato is conferred by two closely linked R genes with distinct specificities. MPMI 17(4):428-435.
Isidore E, Van Os H, Andrzejewski S, Bakker J, Barrena I, Bryan GJ, Buntjer J, Caromel B, Van Eck HJ, Ghareeb B, De Jong W, Van Koert P, Lefebvre V, Milbourne D, Ritter E, Rouppe van der Voort JNAM, Rousselle-Bourgeois F, Van Vliet J, Waugh R (2003) Toward a marker-dense meiotic map of the potato genome: lessons from linkage group I. Genetics 165(4):2107-2116.
King J, Armstead IP, Donnison IS, Thomas HM, Jones RN, Kearsey MJ, Roberts LA, Thomas A, Morgan WG, King IP (2002) Physical and genetic mapping in the grasses Lolium perenne and Festuca pratensis. Genetics 161:315-324.
Klein RR, Morishige DT, Klein PE, Dong J, Mullet JE (1998) High throughput BAC DNA isolation for physical map construction of sorghum (Sorghum bicolor). Plant Mol Biol Rep 16:351-364.
Klein PE, Klein RR, Cartinhour SW, Ulanch PE, Dong J, Obert JA, Morishige DT, Schlueter SD, Childs KL, Ale M, Mullet JE (2000) A high-throughput AFLP-based method for constructing integrated genetic and physical maps: progress toward a sorghum genome map. Genome Research 10:789-807.
Lunden AP, Jørstad J (1934) Investigations on the inheritance of immunity to wart disease (Synchytrium endobioticum (Schilb.) Perc.) in the potato. J Genet 29:375-385.
Maris B (1973) Studies with potato dihaploids on the inheritance of resistance to wart disease. Potato Research 16:324.
Martin GB, Brommonschenkel SH, Chunwongse J, Frary A, Ganal MW, Spivey R, Wu T, Earle ED, Tanksley SD (1993) Map-based cloning of a protein kinase gene conferring disease resistance in tomato. Science 262:1432-1436.
Michelmore RW, Paran I, Kesseli RV (1991) Identification of markers linked to disease resistance genes by bulked segregant analysis: a rapid method to detect markers in specific genomic regions by using segregating populations. Proc Natl Acad Sci USA 88:9828-9832.
Mindrinos M, Katagiri F, Yu GL, Ausubel FM (1994) The Arabidopsis thaliana disease resistance gene RPS2 encodes a protein containing a nucleotide-binding site and leucine-rich repeats. Cell 78:1089-1099.
Rogel-Gaillard C, Piumi F, Billault A, Bourgeaux N, Save J, Urien C, Salmon J, Chardon P (2001) Construction of a rabbit bacterial artificial chromosome (BAC) library: application to the mapping of the major histocompatibility complex to position 12q1.1. Mamm Genome 12:253-255.
Rouppe van der Voort JNAM, Van Zandvoort P, Van Eck HJ, Folkertsma RT, Hutten RCB, Draaistra J, Gommers FJ, Jacobsen E, Helder J, Bakker J (1997) Use of allele specificity of comigrating AFLP markers to align genetic maps from different potato genotypes. Mol Gen Genet 255(4):438-447.
Rouppe van der Voort JNAM, Wolters P, Folkertsma RF, Hutten RBC, Van Zandvoort P, Vinke H, Kanyuka K, Bendahmane A, Jacobsen E, Janssen R, Bakker J (1997) Mapping of the cyst nematode resistance locus Gpa2 in potato using a strategy based on comigrating AFLP markers. Theor Appl Genet 95:874-880.
Rouppe van der Voort J, Kanyuka K, Van der Vossen E, Bendahmane A, Mooijman P, Klein-Lankhorst R, Stiekema W, Baulcombe D, Bakker J (1999) Tight physical linkage of the nematode resistance gene Gpa2 and the virus resistance gene Rx on a single segment introgressed from the wild species Solanum tuberosum subsp. andigena CPC 1673 into cultivated potato. MPMI 12:197-206.
Salaman RN, Lesley MA (1923) Genetic studies in potatoes; the inheritance of immunity to wart disease. J Genet 13:177-186.
Sambrook J, Fritsch EF, Maniatis T (1989) Molecular Cloning: A laboratory manual, 2nd ed. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.

Soderlund C, Humphray S, Dunham A, French L (2000) Contigs built with fingerprints, markers and FPC V4.7. Genome Res 10:1772-1787.
Spieckermann A, Kotthoff P (1924) Die Prüfung von Kartoffeln auf Krebsfestigkeit. Deut. Landw. Presse 51:114-115.
Stam P (1993) Construction of integrated genetic linkage maps by means of a new computer package: JoinMap. Plant J 3:739-744.
Tanksley SD, Ganal MW, Martin GB (1995) Chromosome landing: a paradigm for map-based gene cloning in plants with large genomes. Trends Genet 11:63-68.
Van der Linden CG, Wouters DCAE, Mihalka V, Kochieva EZ, Smulders MJM, Vosman B (2004) Efficient targeting of plant disease resistance loci using NBS profiling. Theor Appl Genet 109(2):384-393.
Vos P, Hogers R, Bleeker M, Reijans M, Van der Lee T, Hornes M, Frijters A, Pot J, Peleman J, Kuiper M, Zabeau M (1995) AFLP: A new technique for DNA fingerprinting. Nucl Acids Res 23:4407-4414.
Whisson SC, Van der Lee T, Bryan GJ, Waugh R, Govers F, Birch PR (2001) Physical mapping across an avirulence locus of Phytophthora infestans using a highly representative, large-insert bacterial artificial chromosome library. Mol Genet Genomics 266(2):289-295.
Young ND, Tanksley SD (1989) RFLP analysis of the size of chromosomal segments retained around the Tm-2 locus of tomato during backcross breeding. Theor Appl Genet 77:353-359.


Chapter 3

A novel method for the construction of genome wide transcriptome maps

Bart Brugmans, Asun Fernandez del Carmen, Christian W.B. Bachem, Hans van Os, Herman J. van Eck and Richard G.F. Visser

Slightly modified from The Plant Journal (2002) 31(2), 211-222


Summary

Expression profiling by cDNA-AFLP is commonly used to display the transcriptome of a specific tissue, treatment or developmental stage. In this paper cDNA-AFLP has been used to study transcripts expressed in segregating populations from Arabidopsis thaliana and potato (Solanum tuberosum). The genetic differences between the offspring genotypes are thus visualised as polymorphisms in the transcriptome. We show that polymorphic transcripts can be used as genetic markers and allow the construction of a linkage map. The resulting map shows that, in contrast to genomic markers, the transcriptome-derived markers did not cluster in particular areas of the chromosome and that cDNA-AFLP markers are targeted specifically to transcriptionally active regions. The cDNA-AFLP markers used in the mapping are derived from DNA polymorphisms in transcripts, rather than differences in expression regulation. The high potential of transcriptome markers as opposed to (anonymous) genomic markers for applications in genetic analyses, marker-assisted breeding, as well as map based cloning via BAC landing is discussed.

Introduction

Recently, substantial efforts have been made to unravel the DNA sequences of entire genomes (Wilson et al., 1999). Complete sequence data are now available for numerous species including two small plant genomes, Oryza sativa and Arabidopsis thaliana, with several other plant species well on the way to completion (Arabidopsis-Genome-Initiative 2000). However, a wide range of other important plant species, particularly those with large genomes, are not likely to be comprehensively sequenced in the near future. In such cases, one of the most productive approaches to retrieve important genomic information is the use of genetic mapping techniques to locate genes responsible for a particular trait. Linkage maps from many species have been established using various marker systems.
Besides RFLP and RAPD markers, the amplified fragment length polymorphism (AFLP) method is one of the most widely used techniques. AFLP is a PCR-based fingerprinting technique, which efficiently identifies DNA polymorphisms without prior sequence information (Vos et al., 1995). AFLP relies on the selective amplification of a subset of molecules from a more complex template pool. There are many reports of AFLP technology being used for the construction of genetic maps (Becker et al., 1995; van Eck et al., 1995) or local marker saturation (Simons et al., 1997). In most cases, maps are constructed for the localisation of genes responsible for qualitative or quantitative traits, as well as for positional cloning. For such applications, however, the ideal distribution of markers is not necessarily regular spacing across the whole genome but rather a concentration of markers in the coding regions of the genome. Since AFLP generally produces a distribution of markers that is biased towards non-coding centromeric regions of the chromosome (Vuylsteke et al., 1999), this may be regarded as a disadvantage for QTL studies or positional cloning.

The cDNA-AFLP method is a robust method to generate RNA fingerprints, and has been developed to visualise gene expression (Bachem et al., 1996). By definition, cDNA-AFLP targets only coding regions of the genome and, like AFLP, it does not require prior sequence information. Until now, cDNA-AFLP has been used for visualisation of differential gene expression and as a tool for the isolation of genes (Bachem et al., 1998; Bachem et al., 2000; Bachem et al., 2001; van der Biezen et al., 2000; Dellagi et al., 2000; Durrant et al., 2000). Transcript-derived fragments (TDFs) obtained using cDNA-AFLP have also been mapped by converting them into RFLP markers and linking them to a genetic map (Suárez et al., 2000). Conversion of cDNA-AFLP fragments into RFLP markers is, however, time consuming and prone to artefacts. Direct mapping of transcripts with the cDNA-AFLP technique without further conversion steps would overcome these problems and result in a rapid and practical method of direct mapping of expressed genes.

Information on expressed genes is currently being collected in EST databases where the map position of the gene is generally not known. Positional information on genes is, however, essential for correlating mapped phenotypes with genes at corresponding chromosomal positions, allowing a candidate gene approach. An example of a genome wide candidate gene approach is presented by Chen et al. (2001), where sequences of known genes were converted into PCR-based markers to map the position of the candidate genes.
Considerable knowledge of the metabolic pathway and corresponding genes involved in the development of the trait is required for the success of this approach.

In this paper, we describe a novel application of cDNA-AFLP to visualise differences in the transcriptome between offspring genotypes without prior sequence data. To our knowledge, this is a first application of a differential display method for genetic analysis, as recently anticipated by Jansen and Nap (2001). Transcriptome variation caused by the genetic differences between segregating offspring genotypes allowed us to use individual transcripts as genetic markers. To construct transcriptome maps we used diploid potato and Arabidopsis thaliana. Diploid potato was used as an example of a population from a cross between heterozygous parents, with the segregation patterns this entails (Ritter et al., 1990), and an Arabidopsis thaliana RIL population (Lister and Dean, 1993) was used as an example of a simple segregating population with homozygous parents. Furthermore, we show that transcript markers do not cluster in centromeric regions and that most absence/presence polymorphisms are based on single nucleotide polymorphisms rather than on expression differences.

Results

Genetic diversity of the transcriptome visualised by cDNA-AFLP fingerprints

The cDNA-AFLP fingerprints generated from potato shoot tissue mRNA isolates contained between 35 and 112 amplification products with an average of 58 bands per lane (36 primer combinations were tested). The fragment sizes were distributed evenly between 50 and 500 bp. Figure 1 shows a section of an autoradiogram image where each lane represents a different genotype of a potato mapping population including the parents.

Figure 1: A section of a representative phosphor-image. Lane 1 represents a size marker. Lane 2 represents the maternal parent and lane 3 represents the male parent. The other lanes represent offspring from these two parents. "A" is an absence/presence polymorphism originating from the male parent. "B" is an intensity polymorphism. "C" is an absence/presence marker heterozygous in both parents, a so-called < ab × ab > marker. "D" is an absence/presence polymorphism originating from the female parent with intensity differences between offspring with the band. On the left the mobility is indicated, expressed in nucleotides based on the size marker.

Variation between the fingerprints was used to study differences between genotypes at the transcriptome level. The majority of the bands were monomorphic and did not show marked variation in intensity between descendants of a diploid potato mapping population. In contrast, other TDFs showed large intensity differences that varied from complete absence or levels just above the background to a signal strength that exceeded the capacity of the phosphor-screens. The observed intensity differences are likely to represent >100 fold differences in gene expression. As shown in Figure 1, the polymorphic transcripts can be grouped into different classes. First, there are absence/presence polymorphisms that show a 1:1 or 3:1 segregation ratio according to the Mendelian models < ab × aa >, < aa × ab > or < ab × ab >, where allele a represents the absence of a band and allele b represents the amplification of a TDF. These 1:1 and 3:1 segregating polymorphisms can also be found in genomic AFLP fingerprints and suggest single-nucleotide polymorphisms (SNPs) or insertion/deletion (indel) events in the coding sequence of one of the alleles. In addition to these absence/presence polymorphisms, we also observed less pronounced intensity polymorphisms within the absence/presence polymorphisms. The attributes of the polymorphisms observed are shown in Table 1.

Table 1: Evaluation of genetic variation in the transcriptomes of potato siblings: description and number of cDNA-AFLP polymorphisms in 36 TaqI/AseI primer combinations.

| Type of polymorphism | Genetic model* | Number | Expected Mendelian ratio |
|---|---|---|---|
| Absence / presence | ab × aa | 117 | 1:1 |
| Absence / presence | aa × ab | 112 | 1:1 |
| Absence / presence | ab × ab | 102 | 1:3 |
| Intensity variation | n.d. | 12 | |
| Monomorphic | n.d. | 1733 | 1:0 |

* Where allele a represents the absence of a band and allele b represents the presence of a transcript. n.d. = not determined.
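The classification of a band as a 1:1 or a 3:1 segregating marker can be illustrated with a chi-square goodness-of-fit test. The following is a minimal sketch, not the scoring procedure used in this study: the function names and the offspring counts are hypothetical, and a 5% significance level with one degree of freedom is assumed.

```python
# Chi-square goodness-of-fit test for a presence/absence marker.
# Assumption: two classes (band present, band absent), so 1 degree of freedom.

CRITICAL_5PCT_1DF = 3.841  # chi-square critical value for alpha = 0.05, 1 d.f.


def chi_square(present, absent, expected_ratio):
    """Chi-square statistic; expected_ratio is (presence, absence), e.g. (3, 1)."""
    total = present + absent
    ratio_sum = sum(expected_ratio)
    exp_present = total * expected_ratio[0] / ratio_sum
    exp_absent = total * expected_ratio[1] / ratio_sum
    return ((present - exp_present) ** 2 / exp_present
            + (absent - exp_absent) ** 2 / exp_absent)


def fits(present, absent, expected_ratio):
    """True if the observed counts do not deviate significantly from the ratio."""
    return chi_square(present, absent, expected_ratio) < CRITICAL_5PCT_1DF


# Hypothetical counts in 90 offspring genotypes.
print(fits(47, 43, (1, 1)))  # compatible with < ab x aa > or < aa x ab >
print(fits(47, 43, (3, 1)))  # not compatible with < ab x ab >
print(fits(66, 24, (3, 1)))  # compatible with < ab x ab >
```

With 90 offspring the two models are easily distinguished, because the expected numbers of offspring carrying the band (45 versus 67.5) lie far apart.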

The total number of polymorphic transcripts that could be scored per primer combination varied between one and 19, with an average of nine polymorphisms per primer combination. No significant correlation was observed between the GC-content of the selective nucleotides and the level of polymorphism. In total 331 polymorphic transcripts could be unambiguously scored in the 90 offspring genotypes and their parents. The reproducibility of the cDNA-AFLP technique was verified by duplicating fingerprints generated from seven genotypes sampled and processed earlier for a pilot study (data not shown). The reliability of the technique was also assessed by comparing segregation data of cDNA-AFLP fingerprints with data from flanking loci obtained from genomic AFLP. Adding cDNA-AFLP markers to the genetic map caused no major changes in the locations of, or the distances between, markers.

The potato mapping population used in this study descended from a cross between highly heterozygous non-inbred diploid parents, which descend from vegetatively maintained tetraploid cultivars that are known for a high genetic load. Therefore, a RIL mapping population Col × Ler of Arabidopsis thaliana (Lister and Dean, 1993) was also used to study genetic variation in the transcriptome of homozygous siblings. RNA was isolated from Arabidopsis seedlings two weeks after germination. Plant age was the only criterion used to obtain plant tissue of the same stage of development. cDNA-AFLP fingerprints of Arabidopsis resulted in one to seven clear absence/presence polymorphisms per lane. These markers have been added to the public marker data set of Col × Ler (www.arabidopsis.org.uk), and could be simply mapped. This provides further evidence that transcriptome-based markers obtained with cDNA-AFLP fingerprints of siblings are as abundant and reliable as genomic markers. Furthermore, the cDNA-AFLP markers from Arabidopsis thaliana were well distributed across all chromosomes of the map (Figure 2).

Figure 2: The Arabidopsis thaliana transcriptome map. The AseI/TaqI markers are cDNA-AFLP markers. The RFLP markers from the map of Lister and Dean (1993) have been omitted to reduce the complexity of this figure. These RFLP markers were, however, used as a JoinMap fixed-order file for producing this map.


The construction of a transcriptome map of potato

In order to establish a dense transcriptome map, TDFs with clear segregation patterns were used to generate separate maps of each parental set of gametes with JoinMap 2.0 software. In total 112 polymorphic transcripts descended from the female parent, 117 from the male parent and 102 transcripts were heterozygous in both parents < ab × ab >. This resulted in 214 markers for the female and 219 markers for the male chromosome maps, which were processed together with the dataset of genomic AFLP and RFLP markers of van Eck et al. (1995). At a LOD threshold of 4.0, 14 maternal linkage groups were identified, and at a LOD threshold of 5.0 we identified 13 paternal linkage groups. The correct number of 12 linkage groups for each parent could be established by connecting low LOD gaps within a linkage group via the allele bridges with the homologous linkage group in the other parent. The final map agreed well with the map obtained previously (van Eck et al., 1995). The maps are shown in Figure 3 and, in analogy with terminology such as “isozyme map” or “genome map”, we call this a “transcriptome map”. The markers labelled AseI/TaqI are mRNA polymorphisms as visualised with cDNA-AFLP. The map shows that the markers are scattered over all the chromosomes. In four cases, a pair of markers of equal mobility mapped to the same locus and differed only in one selective nucleotide. Otherwise no obvious clustering of cDNA-AFLP markers was found.
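The meaning of LOD thresholds such as 4.0 and 5.0 can be illustrated with a two-point calculation. The following is a minimal sketch under simplifying assumptions (a phase-known pair of 1:1 segregating markers, recombinants counted directly); it is not the algorithm implemented in JoinMap, and the function name and recombinant counts are hypothetical.

```python
import math


def lod_score(recombinants, total):
    """Two-point LOD: log10 likelihood ratio of linkage at the estimated
    recombination fraction versus free recombination (theta = 0.5)."""
    theta = recombinants / total
    if theta >= 0.5:
        return 0.0  # no evidence for linkage
    log_l_linked = (total - recombinants) * math.log10(1 - theta)
    if recombinants > 0:  # treat 0 * log10(0) as 0
        log_l_linked += recombinants * math.log10(theta)
    log_l_unlinked = total * math.log10(0.5)
    return log_l_linked - log_l_unlinked


# Hypothetical recombinant counts among 90 offspring genotypes.
for rec in (0, 9, 30):
    print(rec, round(lod_score(rec, 90), 2))
```

In this sketch, nine recombinants among 90 offspring (10% recombination) give a LOD well above 5.0, whereas 30 recombinants (33% recombination) stay below 4.0, so such a marker pair would not be joined into one linkage group at the thresholds used here.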


Figure 3: The maternal (C) and paternal (E) transcriptome maps of diploid potato. The AseI/TaqI markers are cDNA-AFLP markers. Markers with names ending in "h" are heterozygous in both parents < ab × ab > and allow both parental maps to be connected. Markers starting with TG are tomato RFLPs (Bonierbale et al., 1989; Tanksley et al., 1992). Markers starting with GP are potato RFLP markers (Gebhardt et al., 1991). Markers starting with TPI, MDH, DIA, GOT and APS are isozyme loci (Jacobs et al., 1995). All the other letters and "+" stand for different RFLP markers as described in Jacobs et al. (1995). The genomic AFLP markers from the map of Van Eck et al. (1995) are shown with an asterisk only, to reduce the complexity of this figure.


Origin of the transcript polymorphisms

Absence/presence polymorphisms in transcripts could be due either to genetic differences in gene expression or to sequence polymorphisms in the coding region. To determine which of these options was the more likely, internal primers were designed for four transcripts that showed an absence/presence polymorphism, and RT-PCR was performed on RNA from the genotypes showing the cDNA-AFLP band polymorphism. All four primer pairs were tested on undigested cDNAs of ten offspring genotypes, of which five showed amplification of the TDF with cDNA-AFLP. The resulting amplification products are shown in Figure 4. All ten offspring genotypes showed a fragment of the expected length, indicating that, irrespective of the cDNA-AFLP polymorphism, these genotypes expressed the target gene.

Figure 4: RT-PCR of transcripts corresponding to four segregating TDFs. PCR amplification of RNA from 10 offspring using internal primers demonstrates the presence of these four transcripts in the steady-state RNA pool. The four 1:1 segregating absence/presence cDNA-AFLP markers therefore originate from sequence polymorphisms affecting the cDNA-AFLP amplification product.

The same primers were also used to amplify genomic DNA of the mapping population. One of the resulting amplification products had the expected length, and in three cases intron-spanning primers gave a larger product. Digestion with several restriction enzymes, including AseI and TaqI, resulted in polymorphisms. The segregation of the CAPS markers that were obtained from these four cDNA-AFLP loci matched the segregation of the cDNA-AFLP markers, and the CAPS markers therefore map at the same loci. From this we conclude that the vast majority of the absence/presence polymorphisms are due to genomic sequence polymorphisms, probably SNPs at the sites of the selective nucleotides or restriction sites essential for amplification of the fragment with cDNA-AFLP. We conclude that the map position of a transcript polymorphism corresponds to the map position of the gene from which it was transcribed.
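How a single nucleotide change in a restriction site turns an expressed transcript into an absence/presence polymorphism can be sketched in silico. The example below is illustrative only: the allele sequences and function names are hypothetical, and it assumes that only fragments bounded by one AseI site (ATTAAT) and one TaqI site (TCGA) are displayed.

```python
import re

# Recognition sequences of the two enzymes used for template preparation.
SITES = {"AseI": "ATTAAT", "TaqI": "TCGA"}


def restriction_sites(seq):
    """Return sorted (position, enzyme) pairs for every recognition site."""
    hits = []
    for enzyme, motif in SITES.items():
        # Lookahead so that overlapping sites are also reported.
        hits += [(m.start(), enzyme) for m in re.finditer(f"(?={motif})", seq)]
    return sorted(hits)


def asei_taqi_fragments(seq):
    """Site-to-site distances of fragments bounded by one AseI and one TaqI site."""
    sites = restriction_sites(seq)
    return [b[0] - a[0] for a, b in zip(sites, sites[1:]) if a[1] != b[1]]


# Hypothetical 20-nt stretch of transcript between the two sites.
SPACER = "GCCAGTCCAGGCATGCAGCC"
allele_b = "GG" + "ATTAAT" + SPACER + "TCGA" + "GG"  # both sites intact
allele_a = "GG" + "ATTAAT" + SPACER + "TCAA" + "GG"  # SNP destroys the TaqI site

print(asei_taqi_fragments(allele_b))  # one AseI/TaqI fragment: band present
print(asei_taqi_fragments(allele_a))  # no fragment: band absent
```

Both hypothetical alleles are transcribed, yet only allele b yields a fragment, which mirrors the observation that internal primers amplify the transcript in all offspring while the cDNA-AFLP band segregates.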


Sequence analysis

An arbitrary selection of twenty segregating cDNA-AFLP bands has been isolated from gel, re-amplified and sequenced. Subsequently, the sequences were analysed for homology to genes in the sequence databases. For 18 transcripts, significant matches were found. Matches with an E-value of less than 1.0 × 10^-5 were considered as significant. The results are listed in Table 2.

Table 2: Accession numbers, E-values and annotated function of sequences with the best homology to potato cDNA-AFLP fragment sequences excised from transcript fingerprints.

| cDNA-AFLP fragment | BLASTx accession number | BLASTx E-value | TBLASTx (1) accession number | TBLASTx (1) E-value | Annotated function |
|---|---|---|---|---|---|
| AseIAA/TaqIGC-235h | | | AI899537 | 4.00E-29 | Coatomer required for vesicle budding and anterograde vesicle transport from and to the Golgi |
| AseIAA/TaqIGC-139c | | | | | unknown (2) |
| AseIAA/TaqIGC-137h | | | AI775597 | 3.00E-09 | unknown function |
| AseIAA/TaqIGC-135h | | | AI775597 | 2.00E-05 | unknown function |
| AseIAA/TaqIGG-540h | P36181 | 8.00E-19 | | | HSP80 homologue (heat shock protein) |
| AseIAA/TaqIGG-214e | AJ133755 | 1.00E-09 | | | Peptidyl-prolyl cis-trans isomerase (required for mitotic cell cycle progression) |
| AseIAA/TaqIGG-169h | AJ132397 | 7.00E-06 | | | Major latex protein |
| AseIAA/TaqITC-445h | AJ222713 | 4.00E-18 | | | NAP gene homologue (no apical meristem) |
| AseIAA/TaqITC-344c | | | | | unknown (2) |
| AseIAA/TaqITC-315e | | | | | unknown (2) |
| AseIAA/TaqITC-173c | T05027 | 1.00E-05 | | | Reverse transcriptase |
| AseIAA/TaqITC-158c | | | | | unknown (2) |
| AseIAC/TaqIAA-418h | M59857 | 6.00E-60 | | | Stearoyl-acyl-carrier protein desaturase |
| AseIAC/TaqIAA-358h | P51414 | 4.00E-14 | | | Ribosomal protein |
| AseIAC/TaqIAA-217e | | | | | unknown (2) |
| AseIAC/TaqIAA-205h | AJ237988 | 4.00E-07 | | | Putative ripening-related protein |
| AseIAC/TaqIAA-200h | | | BF459608 | 2.00E-18 | PG type gene (fruit ripening) |
| AseIAC/TaqIAA-138c | | | | | unknown (2) |

(1) tblastx was carried out using the "EST others" database and only in cases where the blastx yielded no significant homologies.
(2) cDNA-AFLP fragments yielding no homologies with any searches.


A review of the sequence similarities shows that there is no functional or metabolic relationship between the sequences, suggesting that the cDNA-AFLP transcript mapping method does not produce a bias for a particular class of genes or for a particular metabolic pathway.

Discussion

Here we describe a novel implementation of cDNA-AFLP for the visualisation of genetic variation within a mapping population and the targeted development of markers for mapping of the transcribed regions of the genome, the transcriptome. Jansen and Nap (2001) recently discussed the potential uses of genetics combined with genomics. They stated that the merger of expression profiling and genetic analysis will combine the power of two different worlds in a way that is likely to become instrumental in the further unravelling of metabolic, regulatory and developmental pathways. The experimental work described in this paper does not completely support the opinion of Jansen and Nap. Firstly, our results show that the majority of the transcripts show a monomorphic pattern, which indicates a high level of expression uniformity in spite of high levels of genetic and phenotypic variation between these descendants. However, phenotypic variation could also arise from sequence polymorphisms that could not be visualised by a single template. In contrast, hybridisation-based gene expression analysis would completely fail to correlate phenotypes with allelic versions of a gene that differ by one or more SNPs in the coding region. Secondly, a surprisingly low level, ( 3.0. For potato, separate maternal and paternal maps were established. The 110 maternal cDNA-AFLP markers could be placed with LOD values > 4.0 and 108 paternal cDNA-AFLP markers could be placed with a LOD threshold > 5.0. This indicates that cDNA-AFLP markers have the same power in genetic analysis as any other molecular marker type currently used.


Most markers have a unique segregation pattern, which resulted in a uniform distribution of cDNA-AFLP markers along the chromosomes of potato. Inspection of the raw data did not show ambiguities (singletons) for the placement of cDNA-AFLP markers. In the maternal map (C) of potato, three marker pairs were found with (1) the same mobility, (2) only one difference in selective nucleotides and (3) the same map position. These marker pairs could be alleles of the same locus or represent repeated sequences at the same locus. A further possibility is that such markers are generated by mispriming; however, it has been shown previously (Bachem et al., 1998) that mispriming events occur only very rarely, in connection with highly abundant transcripts.

Estimate of the number of transcribed genes

The number of fragments that are amplified with cDNA-AFLP allows a rough estimate to be made of the total number of genes of potato. In total 2,085 fragments were visualised using 36 primer combinations (PCs), which is an average of 58 per PC. All 256 +2/+2 AseI/TaqI primer combinations would visualise 14,848 fragments. A recent analysis of restriction site frequencies shows that in potato the AseI/TaqI enzyme combination is likely to cut only around 24% of all full-length cDNAs and that on average these cDNAs produce 1.7 cDNA-AFLP fragments. This gives rise to an estimated number of transcribed genes in the aerial plant tissues of around 36,000, which is similar to previous estimates (Bachem et al., 2000). This estimate in potato exceeds the estimate of the number of genes of A. thaliana (The Arabidopsis Genome Initiative 2000). Ku et al. (2000) made a calculation of the total number of genes in tomato, another species of the Solanaceae. These calculations are based on the similarity between A. thaliana and the sequence of a 105 kb bacterial artificial chromosome of tomato. They suggested that the number of genes could be up to 145,000.
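The arithmetic behind the estimate of around 36,000 transcribed genes can be retraced step by step. The sketch below only uses the figures quoted in the text (58 fragments per primer combination, +2/+2 selective nucleotides, 24% of cDNAs cut, 1.7 fragments per cut cDNA); the function and parameter names are hypothetical.

```python
def estimate_transcribed_genes(bands_per_pc=58, selective_bases=2,
                               fraction_cut=0.24, fragments_per_cdna=1.7):
    """Retrace the rough gene number estimate from cDNA-AFLP band counts."""
    # Each primer carries `selective_bases` selective nucleotides:
    # 4^2 = 16 AseI primers x 16 TaqI primers = 256 primer combinations.
    n_pcs = (4 ** selective_bases) ** 2
    total_fragments = bands_per_pc * n_pcs
    # cDNAs that are cut by the enzyme combination and so contribute bands.
    cdnas_displayed = total_fragments / fragments_per_cdna
    # Correct for the cDNAs that are not cut at all.
    genes = cdnas_displayed / fraction_cut
    return n_pcs, total_fragments, genes


n_pcs, total_fragments, genes = estimate_transcribed_genes()
print(n_pcs, total_fragments, round(genes))
```

The estimate is sensitive to the two correction factors: changing the fraction of cDNAs cut from 24% to 30%, for example, lowers the outcome to roughly 29,000 genes.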
Our results only suggest that there may be more than the 25,000 genes of A. thaliana.

Application of cDNA-AFLP transcript polymorphisms

In this study, we have focussed on the application of transcript polymorphisms for marker development and the construction of a transcriptome map. We have shown that cDNA-AFLP markers can be converted into CAPS markers. CAPS markers are more easily applied in marker assisted breeding, and CAPS markers derived from cDNA-AFLP have the additional advantage that they specifically target transcribed genes. In future studies we wish to analyse the transcripts with quantitative variation in expression level in more detail. For example, we wish to understand whether this expression variation can be used to map the loci of transcription factors. Furthermore, we intend to correlate phenotypic variation with these quantitatively and qualitatively segregating transcript markers, which is a genome wide extension of the candidate gene approach. Finally, we wish to exploit this technique for map based cloning. BAC landing is currently achieved with genomic markers, but local saturation of a map with bulked segregant analysis (BSA) of cDNA-AFLP markers would increase the efficiency of BAC landing, possibly turning it into “gene landing”.

Experimental Procedures

Plant material

The mapping population consisted of 90 F1 diploid potato genotypes descending from the non-inbred parents USW5337.3 (clone C) × 77.2102.37 (clone E). Clone E was obtained from a cross between clone C and VH34211 and therefore this F1 population resembles a backcross. This mapping population has been used in previous genetic studies (Jacobs et al., 1995; Van Eck et al., 1995), and has been maintained in vitro for several years. For this experiment, shoot cuttings were put on MS medium (Murashige and Skoog 1962) and harvested after four weeks, with all plants as much as possible in the same stage of development. Shoot tissue (1-5 gram) was harvested and ground to a fine powder under liquid nitrogen and stored at -80°C until RNA isolation. The Arabidopsis RILs and their parents were sown on MS20 medium (Murashige and Skoog 1962) and seedling tissue was harvested after two weeks. No specific efforts were made to ensure that the lines were in the same state of development. The plants were harvested and ground under liquid nitrogen and stored at -80°C until RNA isolation.

mRNA isolation, cDNA synthesis and template preparation

Total RNA was isolated from the ground plant material. The total RNA concentration was
estimated by visualisation on a 1% agarose gel. The Poly-A+ RNA fraction was extracted from 10 µg of total RNA using poly-d[T]25V oligonucleotides coupled to paramagnetic beads (Dynal A.S., Oslo, Norway). First and second strand cDNA synthesis was carried out according to standard protocols (Sambrook et al., 1989). The volumes for mRNA isolation and cDNA synthesis were adjusted to facilitate handling in micro-titre plate format such that 92 samples could be processed simultaneously. The final volume after cDNA synthesis was 50 µl, of which 5 µl was analysed on a 1% agarose gel to estimate the concentration. The concentration was equalised to 10 ng/µl, and 25 µl was subjected to the standard AFLP template production (Bachem et al., 1996). Restriction enzymes used for the template preparation were AseI and TaqI. Further steps to provide template for cDNA-AFLP were carried out as described previously (Bachem et al., 1996). All amplification reactions were done on a PE-9600 thermocycler using Taq DNA polymerase (PE Biosystems, Foster City, CA, USA). All oligonucleotides used in the cDNA-AFLP procedure were obtained from Eurogentec (Seraing, Belgium). All enzymes were from Life Technologies (Gaithersburg, MD, USA) with the exception of AseI and TaqI, which were from NE-Biolabs Inc. (New Brunswick, NE, USA).

Radioactive cDNA-AFLP and gel analysis

Radioactive labelling of primers was carried out using γ-33P-dATP, and conditions for PCR were according to Bachem et al. (1998). Samples were denatured and separated on 4.5% polyacrylamide sequencing-type gels. Gels were transferred onto Whatman 3MM paper and dried using a slab gel dryer. After 20 hours of exposure to a phosphor screen, the labelled DNA fragments were visualised by phosphor imaging using the STORM 860 (Amersham Pharmacia Biotech, Uppsala, Sweden). The gels were also exposed to X-ray films and developed after 3 days of exposure.

Data analysis and map construction

Data on transcript polymorphisms were collected by interpretation of autoradiograms and outputs from the phosphor-imager (STORM 860, Amersham Pharmacia Biotech, Uppsala, Sweden), both visually and quantitatively with the image interpretation software CrossChecker (Buntjer, 2000; http://www.dpw.wau.nl/pv/pub/CrossCheck/download.html). In total 36 random primer combinations (PCs) were used to fingerprint the potato population of 90 individuals and two parents. For Arabidopsis only 8 random PCs were used to fingerprint the RILs. Marker nomenclature is based on the restriction enzymes used to generate cDNA-AFLP template, the selective nucleotides of the primers and the mobility of the band relative to a 10 base ladder (SequaMark, Research Genetics). The additional letter c, e or h included in the potato marker name indicates whether the marker locus is heterozygous in the maternal parent, the paternal parent or both, respectively. Segregating transcripts were assigned to map positions with JoinMap 2.0 (Stam 1993) along with genomic marker data of the potato mapping population (Van Eck et al., 1995) or the Arabidopsis mapping data (Lister and Dean, 1993). Linkage group nomenclature is in agreement with existing Arabidopsis (Lister and Dean, 1993), potato and tomato maps (Bonierbale et al., 1989; Gebhardt et al., 1991). The maps were drawn with the graphical package MapChart (Voorrips 2000).
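The marker nomenclature can be made explicit with a small parser. This is a sketch only: it assumes names of the form used in this chapter (for example AseIAA/TaqIGC-235h), the function and field names are hypothetical, and the suffix letters are read as c for the maternal clone C, e for the paternal clone E and h for both parents.

```python
import re

# Pattern for names such as "AseIAA/TaqIGC-235h": enzyme plus two selective
# nucleotides for each primer, band mobility in nucleotides, parental suffix.
MARKER_RE = re.compile(
    r"^AseI(?P<asei_sel>[ACGT]{2})/TaqI(?P<taqi_sel>[ACGT]{2})"
    r"-(?P<mobility>\d+)(?P<parent>[ceh])$"
)

PARENT = {
    "c": "maternal parent (clone C)",
    "e": "paternal parent (clone E)",
    "h": "both parents",
}


def parse_marker(name):
    """Split a potato cDNA-AFLP marker name into its components."""
    match = MARKER_RE.match(name)
    if match is None:
        raise ValueError(f"not a cDNA-AFLP marker name: {name!r}")
    return {
        "asei_selective": match["asei_sel"],
        "taqi_selective": match["taqi_sel"],
        "mobility_nt": int(match["mobility"]),
        "heterozygous_in": PARENT[match["parent"]],
    }


print(parse_marker("AseIAA/TaqIGC-235h"))
```

A parser of this kind makes it straightforward to select, for instance, all markers ending in h when connecting the maternal and paternal maps.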
Isolation and analysis of transcript fragments

Twenty randomly chosen transcripts comprising absence/presence polymorphisms, intensity polymorphisms and monomorphic transcripts were sequenced. Fragments were excised from dried gel, put in 200 µl TE and left for 1 hour. From this solution, 5 µl was taken for a 35-cycle PCR to re-amplify the fragment. The resulting DNA was diluted to 30 ng/µl and the nucleotide sequence was determined (Eurogentec, Seraing, Belgium). The sequences were analysed remotely (NCBI) for homology to data banks using the BLAST 2.0 programs (Altschul et al., 1997). The GenBank database 'nr' was used for BLASTx analysis and the database 'EST others' for TBLASTx.

To assess the cause of absence/presence polymorphisms, internal primers were designed on the following transcript fragments: AseIAA/TaqIGG-214e, AseIAA/TaqITC-173c, AseIAA/TaqITC-158c and AseIAC/TaqIAA-217e. With these primers a 35-cycle PCR was performed on cDNA template of five samples that showed the transcript and five that lacked the band on the cDNA-AFLP fingerprint. The PCR products were separated on a 3% agarose gel to determine whether a fragment of the correct length was present or absent.

To confirm that the map position of cDNA-AFLP markers matched the map position of a DNA sequence polymorphism, rather than the locus of a transcription factor, cDNA-AFLP markers were converted to CAPS markers. The same primers were used to amplify genomic DNA from descendants of this C × E population. CAPS polymorphisms were obtained after digestion of the PCR product with a series of four-cutter restriction enzymes including AseI and TaqI.

Acknowledgements

We would like to thank the company Genetwister (Wageningen, NL) for the use of their phosphor imaging facility. We would also like to thank Yan Zifu for his assistance during RNA preparations. Maarten Koornneef is acknowledged for providing the Arabidopsis thaliana seeds of the Col × Ler mapping population.


References

Altschul, S., Madden, T., Schäffer, A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl. Acids Res. 25: 3389-3402.
Bachem, C., Van der Hoeven, R., De Bruijn, M., Vreugdenhil, D., Zabeau, M. and Visser, R. (1996) Visualization of differential gene expression using a novel method of RNA fingerprinting based on AFLP: Analysis of gene expression during potato tuber development. Plant J. 9: 745-753.
Bachem, C., Oomen, R. and Visser, R. (1998) Transcript imaging with cDNA-AFLP: A step by step protocol. Plant Mol. Biol. Rep. 16: 157-173.
Bachem, C., Oomen, R., Kuyt, S., Horvath, B., Claassens, M., Vreugdenhil, D. and Visser, R. (2000) Antisense suppression of a potato alpha-SNAP homologue leads to alterations in cellular development and assimilate distribution. Plant Mol. Biol. 43: 473-482.
Bachem, C., Horvath, B., Trindade, L., Claassens, M., Davelaar, E., Jordi, W. and Visser, R. (2001) A potato tuber-expressed mRNA with homology to steroid dehydrogenases affects gibberellin levels and plant development. Plant J. 25: 595-604.
Becker, J., Vos, P., Kuiper, M., Salamini, F. and Heun, M. (1995) Combined mapping of AFLP and RFLP markers in barley. Mol. Gen. Genet. 249: 65-73.
Bonierbale, M., Plaisted, R. and Tanksley, S.D. (1989) RFLP maps of potato and tomato based on a common set of clones reveal modes of chromosomal evolution. Genetics 120: 1095-1103.
Buntjer, J.B. (2000) Cross Checker: Computer Assisted Scoring of Genetic AFLP Data. Abstract Plant & Animal Genome VIII Conference, San Diego, CA, January 9-12, 2000. http://www.intlpag.org/pag/8/abstracts/pag8664.html.
Cavalieri, D., Townsend, J. and Hartl, D. (2000) Manifold anomalies in gene expression in a vineyard isolate of Saccharomyces cerevisiae revealed by DNA microarray analysis. Proc. Natl. Acad. Sci. USA 97(22): 12369-12374.
Chen, X., Salamini, F. and Gebhardt, C. (2001) A potato molecular-function map for carbohydrate metabolism and transport. Theor. Appl. Genet. 102: 284-295.
Dellagi, A., Birch, P., Heilbronn, J., Lyon, G. and Toth, I. (2000) cDNA-AFLP analysis of differential gene expression in the prokaryotic plant pathogen Erwinia carotovora. Microbiology-Reading 146: 165-171.
Durrant, W., Rowland, O., Piedras, P., Hammond-Kosack, K. and Jones, J. (2000) cDNA-AFLP reveals a striking overlap in race-specific resistance and wound response gene expression profiles. Plant Cell 12: 963-977.
Gebhardt, C., Ritter, E., Barone, A., Debener, T., Walkemeier, B., Schachtschabel, U., Kaufmann, H., Thompson, R., Bonierbale, M., Ganal, M., Tanksley, S. and Salamini, F. (1991) RFLP maps of potato and their alignment with the homoeologous tomato genome. Theor. Appl. Genet. 83: 49-57.
Jacobs, J., van Eck, H.J., Bastiaanssen, H., El-Kharbotly, A., Hertog-van 't Oever, A., Arens, P., Verkerk-Bakker, B., te Lintel Hekkert, B., Pereira, A., Jacobsen, E. and Stiekema, W. (1995) A molecular map of potato from non-inbred parents including isozyme and morphological trait loci. Theor. Appl. Genet. 91: 289-300.
Jansen, R.C. and Nap, J.P. (2001) Genetical genomics: the added value from segregation. Trends Genet. 17(7): 388-391.
Ku, H., Vision, T., Liu, J. and Tanksley, S.D. (2000) Comparing sequenced segments of the tomato and Arabidopsis genomes: Large-scale duplication followed by selective gene loss creates a network of synteny. Proc. Natl. Acad. Sci. USA 97: 9121-9126.

56

The construction of genome wide transcriptome maps

Lister, C. and Dean, C. (1993) Recombinant inbred lines for mapping RFLP and phenotypic markers in Arabidopsis thaliana. Plant J. 4: 745-750.
Murashige, T. and Skoog, F. (1962) A revised medium for rapid growth and bioassays with tobacco tissue cultures. Physiol. Plant. 15: 473-497.
Ritter, E., Gebhardt, C. and Salamini, F. (1990) Estimation of recombination frequencies and construction of RFLP linkage maps in plants from crosses between heterozygous parents. Genetics 125: 645-654.
Sambrook, J., Fritsch, E. and Maniatis, T. (1989) Molecular Cloning: A laboratory manual. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.
Simons, G., van der Lee, T., Diergaarde, P., van Daelen, R., Groenendijk, J., Frijters, A., Büschges, R., Hollricher, K., Töpsch, S., Schulze-Lefert, P., Salamini, F., Zabeau, M. and Vos, P. (1997) AFLP-based fine mapping of the Mlo gene to a 30 kb DNA segment of the barley genome. Genomics 44: 61-70.
Stam, P. (1993) Construction of integrated genetic linkage maps by means of a new computer package: JoinMap. Plant J. 3: 739-744.
Suárez, M., Bernal, A., Gutiérrez, J., Tohme, J. and Fregene, M. (2000) Developing expressed sequence tags (ESTs) from polymorphic transcript-derived fragments (TDFs) in cassava (Manihot esculenta Crantz). Genome 43: 62-67.
The Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-815.
van der Biezen, E., Juwana, H., Parker, J. and Jones, J. (2000) cDNA-AFLP display for the isolation of Peronospora parasitica genes expressed during infection in Arabidopsis thaliana. Mol. Plant-Microbe Interact. 13: 895-898.
van Eck, H.J., Rouppe van der Voort, J.N.A.N., Draaistra, J., van Zandvoort, P., van Enckevort, E., Segers, B., Peleman, J., Jacobsen, E., Helder, J. and Bakker, J. (1995) The inheritance and chromosomal localization of AFLP markers in a non-inbred potato offspring. Mol. Breeding 1: 397-410.
Voorrips, R. (2000) MapChart version 2.0; Software for the graphical presentation of linkage maps and QTLs. http://www.JoinMap.nl
Vos, P., Hogers, R., Bleeker, M., Reijans, M., Van der Lee, T., Hornes, M., Frijters, A., Pot, J., Peleman, J., Kuiper, M. and Zabeau, M. (1995) AFLP: A new technique for DNA fingerprinting. Nucl. Acids Res. 23: 4407-4414.
Vuylsteke, M., Mank, R., Antonise, R., Bastiaans, E., Senior, M., Stuber, C., Melchinger, A., Lübberstedt, T., Xia, X., Stam, P., Zabeau, M. and Kuiper, M. (1999) Two high-density AFLP linkage maps of Zea mays L.: analysis of distribution of AFLP markers. Theor. Appl. Genet. 99: 921-935.
Wilson, R. (1999) How the worm was won: the C. elegans genome sequencing project. Trends Genet. 15: 51-58.

Chapter 4

Genetic mapping and expression analyses of resistance gene loci in potato using NBS-profiling

Bart Brugmans, Doret Wouters, Hans van Os, Ronald Hutten, Gerard van der Linden, Richard G.F. Visser, Herman J. van Eck and Edwin van der Vossen

Abstract
NBS-profiling has proven to be a successful method for the identification of resistance gene analog (RGA) derived fragments. Here we report the use of NBS-profiling for the genome-wide mapping of RGA loci in potato. NBS-profiling analyses on a minimal set of F1 genotypes of the diploid mapping population previously used to generate the ultra high density (UHD) genetic map of potato allowed us to efficiently map polymorphic RGA fragments relative to 10,000 existing AFLP markers. In total, 34 RGA loci were mapped, 18 of which originated from the SH parent and 16 from the RH parent. Of these mapped loci only 13 contained RGA sequences homologous to RGAs genetically mapped at approximately similar positions in potato or tomato. The remaining RGA loci mapped either to approximate chromosomal regions previously shown to contain RGAs in potato or tomato without sharing homology to these RGAs, or to positions not yet identified as RGA-containing regions. In addition to markers representing RGAs with unknown functions, segregating markers were detected that were closely linked to four functional R genes that are known to segregate in the UHD mapping population. To explore the potential of NBS-profiling in RGA expression analyses, RNA isolated from different tissues was used as template for NBS-profiling. Of all fragments amplified, approximately 15% showed intensity or even absent/present differences between different tissues, implying tissue-specific R-gene expression. Absent/present differences between individuals were also found. In addition to being a powerful tool for generating candidate gene markers linked to R gene loci, NBS-profiling, when applied to cDNA, can be instrumental in identifying those members of an R gene cluster that are expressed and therefore putatively functional.

Introduction
Plants are under constant attack from a great variety of pathogens.
In defense, they have evolved an immune response that is for the greatest part governed by specificity determinants called resistance (R) genes. This simple yet sophisticated immune system involves an allele-specific genetic interaction between the host R gene and the pathogen avirulence (Avr) gene (Flor, 1971; Keen, 1990). Identification of numerous functional R genes from model and crop species has revealed that the majority of these genes encode cytoplasmic proteins with nucleotide binding site (NBS) and leucine rich repeat (LRR) domains and that they often belong to complex loci comprised of arrays of related genes (reviewed in Martin et al., 2003). Based on the genome sequences of Arabidopsis and rice
(TAGI 2000; Goff et al., 2002; Meyers et al., 2002) the majority of plant genomes are estimated to contain hundreds of NBS-LRR genes. Conservation of several structural motifs within the NBS domain of plant R genes has prompted the development of homology-based approaches aimed at identification of structurally related sequences, termed R gene analogues (RGAs) (Kanazin et al., 1996; Yu et al., 1996; Leister et al., 1996; Aarts et al., 1998; Shen et al., 1998; Pan et al., 2000; van der Linden et al., 2004). Cosegregation of specific RGAs and R loci and/or quantitative trait loci (QTL) involved in disease resistance has been reported (Raman et al., 1999; Hunger et al., 2003; Kuhn et al., 2003), suggesting that NBS profiling can be a powerful tool for the development of markers linked to resistance loci. Although the use of degenerate primers to amplify new RGAs is useful to detect and clone R genes, this method is often laborious, involving the cloning and sequencing of the fragments, after which a polymorphism has to be identified before the fragment can genetically be mapped. Motif directed fingerprinting techniques which combine the advantage of a neutral marker system with a bias towards candidate genes are a better option. The use of degenerate primers that target NBS specific motifs in combination with adapter based amplification techniques generates complex fingerprinting patterns containing several RGA derived fragments (Hayes et al., 2000; van der Linden et al., 2004). By applying NBS based profiling techniques on individuals of an F1 mapping population, the genetic variation at RGA loci is sampled, resulting in the direct mapping of these fragments relative to other genetic markers or R loci that segregate in the mapping population (Calenge et al., 2005). 
By comparing the genetic position and sequence of these mapped fragments with sequences and/or map positions of known R-genes from potato and tomato (Leister et al., 1996; Pan et al., 2000; Gebhardt and Valkonen, 2001), new R-gene clusters or markers tightly linked to known resistances can be located. When genomic DNA is used as template for NBS profiling, the identified RGA fragments will be derived from functional genes as well as from incomplete genes and pseudogenes. In contrast, when cDNA is used, all fragments amplified will be derived from expressed genes. Like genomic DNA, cDNA can be used to detect single nucleotide polymorphisms (SNPs), making it possible to generate fragments and genetically map these fragments relative to a genetic map (Brugmans et al., 2002). NBS profiling on cDNA should give a set of fragments derived from expressed R genes, provided that the sensitivity of NBS-profiling is high enough to detect R genes in a complex mixture of genes. With cDNA as template, markers derived from active R genes may thus be generated. In addition, it may be possible to detect differences in R gene expression between tissues. To date, little is known about tissue-specific expression of R genes. The fact that a single R-gene can interfere with pathogens that affect different tissues, as is demonstrated for the Mi resistance gene, which in tomato confers resistance to three species of root knot
nematodes (Meloidogyne spp.) as well as to the potato aphid Macrosiphum euphorbiae (Vos et al., 1998; Rossi et al., 1998; Milligan et al., 1998) and to both B- and Q-biotypes of whitefly Bemisia tabaci (Nombela et al., 2003), suggests that R-genes may be expressed in multiple tissues. Here we describe an application of NBS-profiling to generate R-gene derived fragments and genetically map these fragments relative to other markers of the Ultra High Density (UHD) map of potato (Isidore et al., 2003). By comparing the sequences and mapping positions of the fragments with known genes, the potential of NBS-profiling to generate fragments linked to known R-gene clusters and to detect new R-gene clusters is evaluated. Furthermore, differences in R-gene expression between tissues and between individuals are demonstrated by performing NBS-profiling on cDNAs generated from RNA from different tissues.

Materials & Methods

Plant material and DNA isolation
For selective mapping purposes a subset of 29 genotypes was selected with MapPop (Vision et al., 1999) from a diploid mapping population consisting of 120 F1 progeny derived from a cross between the diploid parent genotypes SH83-92-488 (SH) and RH89-039-16 (RH) (Rouppe van der Voort et al., 1997). This population was used to construct an ultra dense genetic map of potato comprising ~10,000 AFLP markers divided over approximately 900 bins (Isidore et al., 2003; http://www.dpw.wageningen-ur.nl/uhd/). The genetic bins are defined by single recombination events and correspond to a genetic distance of 0.8 cM. For fingerprinting, leaf material from greenhouse plants was lyophilized and genomic DNA was isolated as described by Fulton et al. (1995).

mRNA isolation and cDNA synthesis
Total RNA was isolated separately from leaves, stems and roots of eight greenhouse plants derived from the cross between the diploid parent genotypes SH83-92-488 and RH89-039-16. After isolation, the RNA concentration was estimated by visual inspection on a 1% agarose gel. The poly-A+ RNA fraction was extracted from 10 µg of total RNA using poly-d[T]25V oligonucleotides coupled to paramagnetic beads (Dynal A.S., Oslo, Norway). First and second strand cDNA synthesis was carried out according to standard protocols (Sambrook et al., 1989). Of the final reaction mix, 5 µl was analyzed on a 1% agarose gel to estimate the final cDNA concentration. All enzymes used were purchased from Invitrogen (Breda, The Netherlands).

NBS-profiling using genomic DNA
NBS profiling on genomic DNA was carried out as described by van der Linden et al. (2004). The restriction enzymes MseI, RsaI or HaeIII were used for digestion of genomic DNA. The sequences of the NBS-specific primers used for the amplification of NBS-specific fragments are shown in Table 1a together with the corresponding annealing temperatures. Labeled PCR products were separated on a 6% polyacrylamide gel, and the individual fragments were visualized by autoradiography. NBS profiles were generated in duplicate for each of the 29 F1 genotypes and the two parental genotypes. Only marker bands that were reproducible in the duplicate samples were scored and added to the existing marker dataset of the UHD map. The relative genetic position of each candidate RGA marker was calculated using BINMAP (van Os, pers. comm.), which allows the mapping of markers relative to the existing UHD map by comparing the marker data with all bin signatures.

Table 1a. NBS-specific primer/enzyme combinations and number of polymorphic RGA bands

Primer  Primer sequence            Ta  Enzyme  Polymorphic  Reliable sequence  # RGAs
S1      GGTGGGGTTGGGAAGACAACG      50  MseI    24           12                 4
S2      GGIGGIGTIGGIAAIACIAC       50  MseI    5            2                  1
Ploop4  CCGGGITCAGGIAARACWAC       50  MseI    17           7                  5
NBS2    GTWGTYTTICCYRAICCISSCAT    55  MseI    7            6                  5
NBS2    GTWGTYTTICCYRAICCISSCAT    55  RsaI    5            4                  4
NBS2    GTWGTYTTICCYRAICCISSCAT    55  HaeIII  5            2                  1
NBS5a   YYTKRTHGTMITKGATGAYGTITGG  55  MseI    5            4                  4
NBS6    YYTKRTHGTMITKGATGATATITGG
NBS5a   YYTKRTHGTMITKGATGAYGTITGG  55  RsaI    5            3                  3
NBS6    YYTKRTHGTMITKGATGATATITGG
NBS5a   YYTKRTHGTMITKGATGAYGTITGG  55  HaeIII  6            5                  5
NBS6    YYTKRTHGTMITKGATGATATITGG
KIN1    YTKRTTGTIYTIGATGATGTDTGG   55  MseI    15           12                 5
KIN5    CTTGTMATITTGGATGATGTWTGG   55  MseI    9            8                  6
NBS9    TGTGGAGGRTTACCTCTAGC       55  MseI    6            5                  4
NBS9    TGTGGAGGRTTACCTCTAGC       55  RsaI    8            6                  4
NBS9    TGTGGAGGRTTACCTCTAGC       55  HaeIII  2            2                  2
GLPL4   CCCGAAGGAAACCRISRACWARA    55  MseI    15           12                 7
Total                                          134          90                 60
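The degenerate primers in Table 1a are written with IUPAC ambiguity codes plus inosine (I). As a minimal sketch of what such a primer covers, the Python below counts the number of sequences a primer can pair with and builds a regular expression for it. Treating inosine as matching any base is an assumption made only for this illustration, and the function names `degeneracy` and `primer_regex` are hypothetical rather than part of any published NBS-profiling software.

```python
import re
from math import prod

# IUPAC nucleotide ambiguity codes. "I" (inosine) is not an IUPAC code;
# treating it as "any base" is an assumption of this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "I": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct target sequences covered by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def primer_regex(primer: str) -> re.Pattern:
    """Compile a regex that matches every sequence covered by the primer."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return re.compile("".join(parts))


S1 = "GGTGGGGTTGGGAAGACAACG"     # non-degenerate primer from Table 1a
PLOOP4 = "CCGGGITCAGGIAARACWAC"  # degenerate primer from Table 1a

print(degeneracy(S1))      # 1
print(degeneracy(PLOOP4))  # I, I, R, W -> 4 * 4 * 2 * 2 = 64
print(bool(primer_regex(PLOOP4).fullmatch("CCGGGATCAGGTAAGACTAC")))  # True
```

The regular expression only models exact pairing; it says nothing about annealing at the temperatures listed in the table, where some mismatches may be tolerated.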

NBS-profiling using RNA
For detection of R-gene expression in different tissues and between individuals, NBS-profiling was performed using cDNA synthesized from mRNA isolated from leaves, roots and stems. NBS profiling with cDNA as template was carried out as described by van der Linden et al. (2004) for genomic DNA. The restriction enzymes MseI or TaqI were used for digestion of the cDNA. The sequences of the NBS-specific primers used for the amplification of NBS-specific fragments are shown in Table 1b together with the corresponding annealing temperatures. The RGA primers were used in combination with an IRD-labeled non-selective TaqI or MseI primer to generate fragments, which were visualized on a denaturing polyacrylamide gel using a NEN® IR2 DNA analyser (LI-COR® Biosciences, Lincoln, NE).

Table 1b. NBS-specific primer/enzyme combinations and the number of fragments generated from the four different classes per primer combination. Class 1: fragments amplified with similar intensity between individuals and tissues. Class 2: fragments with an intensity difference between tissues. Class 3: fragments with an absent/present difference between tissues. Class 4: fragments with an absent/present difference between individuals.

Primer  Primer sequence            Ta  Enzyme  # Class 1  # Class 2  # Class 3  # Class 4
Ploop1  GGIGGINTRGGIAARACRAC       50  MseI    22         2          1          1
Ploop1  GGIGGINTRGGIAARACRAC       50  TaqI    20         1          2          1
Ploop4  CCGGGITCAGGIAARACWAC       50  MseI    23         0          3          4
Ploop4  CCGGGITCAGGIAARACWAC       50  TaqI    21         2          2          3
KIN1    YTKRTTGTIYTIGATGATGTDTGG   55  MseI    19         2          1          1
KIN1    YTKRTTGTIYTIGATGATGTDTGG   55  TaqI    21         2          2          3
KIN5    CTTGTMATITTGGATGATGTWTGG   55  MseI    19         1          4          2
KIN5    CTTGTMATITTGGATGATGTWTGG   55  TaqI    17         1          2          3
GLPL5   CCKGARGGIRATCGKRRITTTCA    55  MseI    23         1          3          1
GLPL5   CCKGARGGIRATCGKRRITTTCA    55  TaqI    19         0          2          0
Total                                          204        12         22         19
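The column totals of Table 1b, and the share of fragments that differ between tissues, can be recomputed from the per-combination counts. The sketch below copies the counts from the table; the percentage is one plausible way to arrive at a tissue-difference fraction (classes 2 and 3 over all fragments) and is an assumption, since the text does not state how its approximate figure was derived.

```python
# Fragment counts copied from Table 1b:
# (primer, enzyme, class 1, class 2, class 3, class 4)
TABLE_1B = [
    ("Ploop1", "MseI", 22, 2, 1, 1),
    ("Ploop1", "TaqI", 20, 1, 2, 1),
    ("Ploop4", "MseI", 23, 0, 3, 4),
    ("Ploop4", "TaqI", 21, 2, 2, 3),
    ("KIN1", "MseI", 19, 2, 1, 1),
    ("KIN1", "TaqI", 21, 2, 2, 3),
    ("KIN5", "MseI", 19, 1, 4, 2),
    ("KIN5", "TaqI", 17, 1, 2, 3),
    ("GLPL5", "MseI", 23, 1, 3, 1),
    ("GLPL5", "TaqI", 19, 0, 2, 0),
]


def class_totals(rows):
    """Sum the four class columns over all primer/enzyme combinations."""
    return [sum(row[i] for row in rows) for i in range(2, 6)]


totals = class_totals(TABLE_1B)
all_fragments = sum(totals)
# Classes 2 and 3 are the fragments that differ between tissues.
tissue_differential = totals[1] + totals[2]

print(totals)         # [204, 12, 22, 19]
print(all_fragments)  # 257
print(round(100 * tissue_differential / all_fragments, 1))  # 13.2
```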

Isolation and analysis of NBS fragments
Fragments were excised from polyacrylamide gels using a sharp razor blade, eluted in TE for 5 min at 100°C, and reamplified with the NBS-specific primer and the adapter primer. PCR products were checked on agarose gels and purified with Qiaquick PCR purification spin columns (Qiagen). Fragments were either sequenced directly using the adapter primer as a sequencing primer or first cloned into the pGEM-T vector prior to sequencing with T7 or SP6 primers. Sequencing was carried out with the BigDye Terminator kit and an ABI 3700 automated sequencer from Applied Biosystems (USA). Sequences were
identified by comparison with entries in the public protein and nucleotide databases using locally installed or remote BLASTX and BLASTN programs (Altschul et al., 1997).

Results

Genome-wide RGA mapping
In this study we used both existing and newly designed NBS-specific primers (Table 1). For the design of the new primers, protein sequences of NBS regions of R genes and RGAs from potato, tomato and pepper were downloaded from existing remote sequence databases and aligned to each other. Degenerate primers were subsequently designed based on the DNA sequence alignments of conserved P-loop, kinase-2 and GLPL motifs within these sequences. Relative positions and orientation of the primers within the NBS are shown in Figure 1.

Figure 1. Relative positions and orientation of primers that target conserved motifs within the NBS region.

NBS profiling was performed with the primer/enzyme combinations (Table 1a) on a subset of 29 genotypes from the diploid UHD mapping population as well as the two parental genotypes SH and RH. A total of 134 reproducible polymorphic fragments were subsequently scored. Reproducibility was illustrated by the fact that banding patterns of duplicate samples (plant material that was split before DNA extraction and processed in separately performed experiments) were identical. For further characterization, all scored fragments were excised from the gel and analyzed by direct sequencing. Of the ninety fragments that produced a readable sequence, sixty showed significant similarity to known R genes and RGAs, verifying the RGA nature of the scored fragments. The fragments that were confirmed to be RGA-derived and which showed segregation in the mapping population were mapped relative to
the existing UHD dataset, resulting in the genetic mapping of 34 RGA loci, 18 in SH and 16 in RH (Figure 2). Some loci, e.g. SH2.1, SH6.1, SH10.1, SH11.2 and SH11.3, correspond to loci previously described by Leister et al. (1996) in potato and by Pan et al. (2000) in tomato, both at the sequence level and at the approximate positional level (Table 2 and Figure 2). However, the majority represent either novel RGA loci or novel RGA sequences that map to positions that approximately correspond to those previously described (Figure 2). Novel RGA loci were identified on chromosome 1 (SH1.1 and RH1.3), chromosome 4 (RH4.1), chromosome 5 (SH5.1 and SH5.2) and chromosome 8 (SH8.1). Loci SH1.1, SH5.1 and SH5.2 share homology to Mi, and RH1.3 to I2 from tomato. RH4.1 and SH4.1 share homology to RGA sequences present on a tomato BAC clone (AF411807; van der Hoeven et al., 2002) and SH8.1 to a putative disease resistance protein (Table 2). Examples of novel RGA sequences which approximately map to previously described loci are RH1.1B/C, SH7.1, SH9.1, SH9.3, SH12.1A and SH12.1B (Figure 2A). The sequences mapped to RH1.1B/C, SH7.1, SH9.1, SH9.3 and SH12.1A/B share no homology to the syntenic loci St124, Q173, Tm-2, Sw5 or Q99, respectively.

Figure 2. Relative positions of putative RGA loci in the UHD map of potato (A, SH map; B, RH map). Each chromosome is divided into bins containing varying numbers of cosegregating AFLP markers, indicated by the degree of grey shading (white is 0 and black is >500). Bars to the right of each chromosome indicate the relative positions of putative RGA loci. Indicated to the left of the chromosomes are the positions of the R genes Sen1-4, H1, R3a/b and Gpa2/Rx, which are known to segregate in the UHD population and originate from parent SH; the positions of Tm-1, Q99, Mi, Q173, Tm-2, Sw5, Q133, I2 and Q136 from tomato; and the positions of R2, Rpi-blb3, Rpi-abpt, Rpi-blb2, Gro1, Rpi-blb1, Gpa6, Sen1, Gpa3, RYsto, Rmc1, St124 and R1 from potato. All novel RGA loci and novel RGA sequences at known genetic R-gene positions are encircled.
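Placing a marker on the UHD map by comparing its scores with the bin signatures, as described for BINMAP in the Materials & Methods, could be sketched roughly as follows. This is an illustration only and not the BINMAP implementation: the function names, the three bin signatures and the marker scores are hypothetical, and a real analysis would use all 29 genotypes and every bin of the map.

```python
from typing import Dict, List, Optional, Sequence

Score = Optional[int]  # 1 = band present, 0 = band absent, None = not scored


def mismatches(marker: Sequence[Score], signature: Sequence[Score]) -> int:
    """Count genotypes in which marker and bin signature disagree."""
    return sum(
        1
        for m, s in zip(marker, signature)
        if m is not None and s is not None and m != s
    )


def candidate_bins(
    marker: Sequence[Score], signatures: Dict[str, Sequence[Score]]
) -> List[str]:
    """Return every bin whose signature fits the marker with the fewest mismatches."""
    scores = {name: mismatches(marker, sig) for name, sig in signatures.items()}
    best = min(scores.values())
    return [name for name, score in scores.items() if score == best]


# Hypothetical signatures of three adjacent bins over six genotypes.
# Adjacent bins differ by a single recombination event, i.e. one genotype.
signatures = {
    "BIN62": [1, 0, 1, 1, 0, 0],
    "BIN63": [1, 0, 1, 1, 0, 1],
    "BIN64": [1, 1, 1, 1, 0, 1],
}
# The genotype that separates BIN63 from BIN64 was not scored for this marker.
marker = [1, None, 1, 1, 0, 1]

print(candidate_bins(marker, signatures))  # ['BIN63', 'BIN64']
```

The example returns an interval of two bins because the only genotype that distinguishes them is missing from the marker data, which is the same reason a marker scored on a subset of the population maps to an interval rather than to a single bin.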

The potential of the NBS profiling technique for identifying markers linked to functional genes is very well exemplified by the fact that we identified RGA markers linked to all the functional R genes currently mapped in the SHxRH population using only a limited set of RGA specific primers. On chromosome 5, SH5.2 corresponds to the same genetic interval as the nematode resistance locus H1 which in SH has been mapped to SHBIN63 (Bakker et al., 2004). On chromosomes 11 and 12, SH11.3 and SH12.2 correspond to intervals that harbor the late blight resistance genes R3a and R3b (SHBIN65; Huang et al., 2004) and the nematode resistance gene Gpa2 (SHBIN67; Rouppe van der Voort et al., 1999; van der Vossen et al., 2000), respectively. Moreover,
alignment of the BIN maps of SH4 and RH4 reveals that RH4.2 corresponds to the genetic interval on SH4 to which the wart disease resistance locus Sen1-4 has been mapped (SHBIN37-41; Brugmans et al., 2005). Although Sen1-4 is derived from SH, BIN numbers of RH4 correspond well with those of SH4, as is illustrated by the putative positions of the centromere (BIN35 in RH4 and BIN31 in SH4). Interestingly, locus SH4.2 corresponds both at the nucleotide level and the positional level to Q136 from tomato, which shares high homology to I2C-2 (Table 2).

Table 2. NBS-profile bands with significant identity to known resistance (R) genes and RGA cluster members.

                                         Homologue
Locus    BIN (interval)  Primer/enzyme   Accession  Annotated       Identity-DNA
SH1.1    73-76           GLPL4/MseI      LEU81378   Mi-1.1          94% (137)
SH2.1    1-6             PLOOP4/MseI     AF404437   LeQ99/St124     83% (382)
SH4.1    16-18           NBS5a6/RsaI     AF411807   LeBAC127E11     84% (246)
SH5.1    57              NBS5a6/HaeIII   LEU81378   Mi-1.1          90% (53)
SH5.2    63              GLPL4/MseI      AF091048   Mi-1.1          81% (94)
SH6.1    1               NBS5a6/HaeIII   LEU81378   Mi-1.1          86% (224)
SH7.1    85-87           GLPL4/MseI      AF039681   Mi-1.1          83% (133)
SH8.1    57-70           Kinase5/MseI    AC091238   OzRGA           51% (35aa)
SH9.1    17-19           Kinase1/MseI    AF004879   I2C-2           93% (113)
SH9.2    20-23           NBS9/MseI       AC249448   Rx2             93% (207)
SH9.3    70-82           NBS9/HaeIII     BQ113799   StEST599375     94% (53)
SH10.1   26              Kinase5/MseI    AF404451   LeQ133          94% (39)
SH11.1   1-2             Kinase5/MseI    AY426260   BlbRGA3         97% (41)
SH11.2   60              NBS2/MseI       AF408704   I2C-5           89% (157)
SH11.3   64-66           NBS2/MseI       AF004878   I2C-1           89% (29)
SH12.1A  54-56           Kinase5/MseI    AF447489   R1              80% (107)
SH12.1B  54-56           Kinase5/MseI    AY426261   BlbRGA3         97% (42)
SH12.2   61-67           NBS9/RsaI       AJ249449   Gpa2            96% (88)
RH1.1A   13-17           S1/MseI         AF404437   LeQ99 (St124)   89% (107)
RH1.1B   13-17           PLOOP4/MseI     AF447489   R1              93% (265)
RH1.1C   13-17           PLOOP4/MseI     LEU65667   Mi              93% (252)
RH1.2    29-30           NBS9/MseI       AF266747   RGC1            82% (272)
RH1.3    101             PLOOP4/MseI     AF004878   I2C-1           92% (156)
RH2.1    2-11            S1/MseI         AY187296   MeRCa6          65% (32aa)
RH4.1    17              NBS2/RsaI       AF411807   LeBAC127E11     78% (157)
RH4.2    34-39           Kinase1/MseI    AF404454   LeQ136 (I2C-2)  93% (182)
RH5.1    20-23           NBS5a6/HaeIII   AF447489   R1              96% (258)
RH6.1    3-6             Ploop/MseI      AF039681   Mi-1.1          85% (170)
RH8.1    17-19           NBS9/RsaI       AF195939   Gpa2            79% (309)
RH9.1    78              NBS9/HaeIII     BQ113799   StEST599375     94% (53)
RH10.1   1-5             Kinase5/MseI    AF404437   LeQ99 (St13)    91% (171)
RH11.1   52-64           Kinase1/MseI    AF404456   LeQ138 (I2C-1)  93% (64)
RH11.2   82-83           S1/MseI         STU60069   St11            84% (125)
RH11.3   84-86           NBS5a6/RsaI     AF004878   I2C-1           95% (48)
cDNA-01  Class 4*        Kinase1/TaqI    AF404434   LeQ95           90% (369)
cDNA-02  Class 2*        Kinase1/TaqI    AJ457050   Hero3           89% (331)
cDNA-03  Class 3*        Kinase5/TaqI    STU60074   St125           99% (249)
cDNA-04  Class 1*        Kinase1/TaqI    AR29071    BlbRGA3         48% (56)
cDNA-05  Class 4*        Kinase1/TaqI    STU60069   St11            85% (95)
cDNA-06  Class 1*        PLOOP1/TaqI     LE25SRIB   Tomato 25 S     96% (129)
cDNA-07  Class 4*        PLOOP1/TaqI     AF534298   LhS2_410        83% (75)
cDNA-08  Class 2*        PLOOP4/TaqI     AJ716167   Sc_TNBS1-45     98% (112)
cDNA-09  Class 4*        Kinase1/TaqI    AF516615   FRGA-A30        96% (350)
cDNA-10  Class 3*        Kinase5/TaqI    AF404437   LeQ99           93% (71)
cDNA-11  Class 3*        Kinase5/TaqI    AF404431   LeQ88           72% (58)
cDNA-12  Class 1*        Kinase5/TaqI    AAF04603   Gpa2            58% (43)
* Fragments not mapped; the class to which the fragment belongs is indicated instead.

NBS-profiling using cDNA
The results presented in this paper clearly show the potential of NBS profiling for producing markers in RGA sequences, both at known resistance loci and at new putative resistance loci. However, it is not clear whether the markers actually target functional R genes. For isolation and cloning of a functional R gene, NBS profiling on genomic DNA represents only a first step. It should, however, be possible to exclude markers in non-functional (pseudo) genes by using cDNA instead of genomic DNA as a template for NBS profiling. In an attempt to validate this idea, NBS profiling was performed on cDNA generated from RNA derived from different tissues. A total of ten primer/enzyme combinations (Table 1b) were tested on a subset of 8 genotypes from the diploid UHD mapping population. Typical profiling patterns, comprising 20-35 bands, obtained with leaf, root or stem tissue specific cDNAs, are shown in Figure 3.

Figure 3. A section of a representative NBS profiling Li-cor image using Kin1/TaqI as primer/enzyme combination. The banding pattern was generated from cDNA of leaves (I) stems (II) and roots (III) of three different individuals of an F1 mapping population. A, no intensity polymorphisms between tissues or individuals (class 1); B, intensity polymorphism between tissues (class 2); C, present/absent polymorphism between tissues (class 3); D, absent/present polymorphism between individuals (class 4).
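The four classes defined in the caption can be expressed as a simple decision rule on band intensities. The sketch below is a hypothetical illustration, not the scoring procedure used in this study (bands were scored from gel images): the function name `classify_fragment`, the detection threshold and the fold-change cut-off are assumptions chosen only to make the rule concrete.

```python
from typing import Dict


def classify_fragment(
    intensity: Dict[str, Dict[str, float]],
    present_above: float = 0.1,  # hypothetical detection threshold (must be >= 0)
    fold_change: float = 2.0,    # hypothetical cut-off for an intensity difference
) -> int:
    """Assign a band to class 1-4 from intensity[individual][tissue]."""
    expressing = {
        ind: tissues
        for ind, tissues in intensity.items()
        if any(v > present_above for v in tissues.values())
    }
    if not expressing:
        raise ValueError("band not detected in any sample")
    # Class 4: band present in some individuals and absent in others.
    if len(expressing) < len(intensity):
        return 4
    # Class 3: band absent from at least one tissue of an individual.
    for tissues in expressing.values():
        if any(v <= present_above for v in tissues.values()):
            return 3
    # Class 2: band present in all tissues but clearly weaker in some.
    for tissues in expressing.values():
        values = list(tissues.values())
        if max(values) / min(values) >= fold_change:
            return 2
    # Class 1: similar intensity in all tissues and individuals.
    return 1


uniform = {"leaf": 1.0, "stem": 0.9, "root": 1.1}
weak_stem = {"leaf": 1.0, "stem": 0.4, "root": 1.0}
no_root = {"leaf": 1.0, "stem": 1.0, "root": 0.0}
absent = {"leaf": 0.0, "stem": 0.0, "root": 0.0}

print(classify_fragment({"ind1": uniform, "ind2": uniform}))      # 1
print(classify_fragment({"ind1": weak_stem, "ind2": weak_stem}))  # 2
print(classify_fragment({"ind1": no_root, "ind2": no_root}))      # 3
print(classify_fragment({"ind1": uniform, "ind2": absent}))       # 4
```

The order of the checks is a design choice of this sketch: an absent/present difference between individuals takes precedence over any difference between tissues, so each band receives exactly one class.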


The majority of the fragments (204) amplified using the ten primer/enzyme combinations were monomorphic (class 1) and did not show variation in intensity between genotypes or between the different tissues. Based on differences in expression in the analyzed tissues and between genotypes, the remaining fragments can be grouped into three different classes (Figure 3). Class 2 contains intensity polymorphisms between the different tissues derived from one genotype. Class 3 contains absent/present polymorphisms between the different tissues, while the expression between individuals is similar. Class 4 contains absent/present polymorphisms between genotypes, while expression is present in the same tissues. In total 53 fragments showed segregation. Of these, 22 showed clear absent/present differences between tissues and 19 were absent/present polymorphisms between genotypes. The other 12 were intensity differences between tissues (Table 1b). For further characterization 19 fragments, including at least two fragments of each class, were excised from the gel and analyzed by direct sequencing. Of the 12 bands that produced a readable sequence, all but one showed significant similarity to known R genes and RGAs, confirming the RGA nature of the majority of the fragments (Table 2).

Discussion
Sequence information generated through large-scale genome and EST sequencing efforts has led to the development of candidate gene based marker technologies. One of these applications is the NBS profiling technique (van der Linden et al., 2004), which specifically targets RGAs. In the current study we have used (degenerate) NBS-specific primers to amplify a multi-locus RGA marker pattern from both genomic (g)DNA and cDNA. To verify the origin of the amplified fragments, a total of 134 gDNA derived and 19 cDNA derived fragments were sequenced, of which 90 gDNA and 12 cDNA derived fragments gave a readable sequence. Of these 102 sequences, 60 gDNA and 11 cDNA derived fragments shared high homology with R gene or RGA specific sequences, confirming that the majority of the amplified fragments were truly derived from RGAs. This is in agreement with the findings of van der Linden et al. (2004), who found that the majority of the fragments amplified by NBS profiling using different potato genotypes were RGA derived. By combining the marker data of the 60 segregating RGA fragments with the data of the ultra dense genetic map of potato, it was possible to genetically map these markers relative to the 10,000 AFLP markers that make up the UHD map of potato. Because only a subset (n=29) of the diploid F1 SHxRH mapping population was used, the accuracy of the map position was reduced to an average interval of five bins (Table 2). Due to the lack of marker data for individuals that show recombination within this interval, more accurate mapping is not possible. Nevertheless, the resolution is sufficient to indicate the approximate genetic region in which the marker and thus an R gene locus is located.
Furthermore, it is known which genotypes of the UHD mapping population have undergone recombination within any interval of interest and thus can be used to increase the resolution within the interval. In a comparative study of genomic organization of R genes and RGAs in three solanaceous crop genera, tomato, potato and pepper, Grube et al. (2000) observed, in contrast with the findings of Leister et al. (1998) for Gramineae, significant structural (sequence) conservation of R gene loci, despite limited positional correspondence of phenotypically defined genes conferring resistance to related or identical pathogens. This suggests that the chromosomal locations of R gene clusters is broadly conserved through speciation, and that comparative genomics can be an instrument for rapid identification of genes that are structurally similar to those already mapped in related genera. Our results indicate that, although many R gene clusters are indeed conserved between potato and tomato, many may be part of heterogeneous superclusters which harbor more than a single RGA family. For Arabidopsis, it was reported that ~10% of the NBS-LRR clusters contained NBS-LRR genes of diverse subgroups but that these clusters are likely the result of random associations among the 149 NBS-LRR-encoding genes in the Arabidopsis Col-0 genome (Meyers et al., 2003). For tomato Pan et al. (2000) also found genetic linkage between NBS containing R-gene sequences from different origin. The genetic mapping of these sequences in tomato was based upon a set of inbred lines giving a low marker resolution and thus a large genetic interval in which the markers could be placed. Although the genetic resolution of our mapping in the UHD map is higher compared to the study by Pan et al. (2000), it is still too low to draw conclusions on the physical clustering of NBS-LRR genes from different origin from our NBS profiling mapping data. 
Functional R genes are expected to be continuously expressed in the tissues that might be infected by a pathogen. Therefore, to detect functional R genes, NBS profiling was performed using cDNA derived from RNA extracted from plants that were not inoculated or triggered towards a defense response. Although few R genes have been shown to be induced upon pathogen infection, R gene related ESTs have in some cases been identified only in pathogen challenged libraries (Ronning et al., 2003). Expression levels of the target genes are therefore expected to be low, which could lead to problems related to PCR kinetics and sensitivity (Vos et al., 1996). However, when using the standard NBS-profiling protocol as developed by van der Linden et al. (2004), between 20 and 35 fragments were amplified, implying a high sensitivity for the NBS-profiling technique. Furthermore, similar tissues were analyzed from different genotypes, making it possible to detect tissue-specific expression and to compare tissue-specific fragments between genotypes. Fragments that segregated between genotypes could also be evaluated for their reproducibility between tissues. The result that no major differences were detected

within or between the genotypes depending on the class of expression, confirms the reproducibility of the technique. This is in agreement with the results found for DNA, for which NBS-profiling was performed twice upon the same DNA. As with gDNA, absent/present polymorphisms can be detected when profiling cDNA, and these markers can be genetically mapped (Brugmans et al., 2002). cDNA based NBS-profiles showed some clear absent/present polymorphisms between F1 genotypes. However, because only eight F1 genotypes were used, it was not possible to genetically map the generated absent/present polymorphisms relative to the complete population (n=130) or the subset (n=29) used for the mapping of DNA derived RGA fragments. When this NBS profiling is repeated using more individuals of this population, it will be possible to genetically map the polymorphic fragments relative to the genetic markers of the ultra dense mapping population, leading to the identification of expressed R gene clusters. For the fragments that are differentially expressed between tissues, sequence information can help to develop SCAR or RFLP markers, which can be used for the genetic mapping and possible cloning of the tissue-specific R genes. Comparing the complete sequences of the different tissue-specific RGAs with each other and with the RGAs expressed in all tissues might lead to a better understanding of the functional regions and the mechanisms underlying the resistance response against the different types of pathogens. The gene-for-gene interaction between the R gene of a plant and the Avr gene of the pathogen is thought to be highly specific. For some R genes, for example Mi, it was found that the same R gene confers resistance against different pathogens (Vos et al., 1998; Nombela et al., 2003), implying that R genes can function in different tissues. This assumption is supported by the findings of van der Vossen et al.
(2000) who reported about a resistance-gene cluster in potato containing genes with high homologies, but resistance to distinct pathogens affecting different plant tissues. Our results indicate that many R-genes are in fact expressed in multiple tissues. The majority of the amplified fragments in the NBS profiles with cDNA from different tissues were amplified in all tissues examined. Still ten percent of the fragments were amplified only in one or two tissues. Approximately five percent of the fragments gave clear intensity differences between tissues. In total approximately 15 percent of the NBS profiling fragments showed expression differences between tissues implying differences in specificity of the R gene. Tissue-specific expression differences may underlie the tissue-specific resistance reactions against the same pathogen (e.g. Phytophthora resistance in tubers and leaves). NBS profiling provides a tool with which the genes involved in this reaction can be identified and located. The tissue specificity of some R-genes also indicates that the promoter used to induce expression of an R-gene after transformation can have its effect

on the phenotype and thus the results of the complementation test. Therefore it might be advisable to simulate nature as closely as possible and use the appropriate promoter. In the search for markers linked to resistance genes, both RNA and DNA can be used in combination with NBS-profiling, resulting in a high number of RGA derived fragments. For both RNA and DNA, fragments can be generated that segregate between individuals and that can be mapped relative to other markers of an existing genetic map. Of the sequenced fragments, 11 out of 12 from RNA and 60 out of 90 from DNA were confirmed to be RGA derived. Although this suggests that DNA is more sensitive to mispriming, resulting in the amplification of polymorphic DNA sequences that are not derived from RGAs, this conclusion has to be verified by analyzing equal numbers of cDNA and gDNA derived fragments from the same primer/enzyme combination. The only cDNA fragment that was not RGA derived was amplified from all samples and all tissues with the same intensity and appeared to be ribosomal RNA derived. For the detection and cloning of specific R genes, cDNA is a more suitable template than gDNA because cDNA contains only the functional genes; a cDNA fragment is therefore likely to be derived from a functional gene and might even lead to the gene of interest within a cluster of genes. Furthermore, it might be possible to amplify R gene derived fragments with NBS-profiling using cDNA isolated from tissue challenged for a specific R gene reaction, leading directly to the R gene of interest. On the other hand, gDNA is easier to handle as template, and the average percentage of polymorphic fragments found using gDNA is much higher than for cDNA; gDNA is therefore the better template for genome wide mapping of RGAs and RGA rich regions.
Also for the detection of markers closely linked to an R gene but not necessarily derived from the gene itself (e.g. for QTL analysis or marker assisted selection in a breeding program), gDNA is more suitable as template than cDNA. Irrespective of the choice of template, NBS-profiling is a good option for the generation of markers linked to RGAs.
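The caution expressed above about comparing 11 out of 12 cDNA fragments with 60 out of 90 gDNA fragments can be made concrete with a one-sided Fisher exact test. This is a minimal sketch using only the counts reported in the text; the choice of test and the helper names are assumptions for illustration and are not part of the original analysis.

```python
from math import comb


def hypergeom_pmf(k, total, successes, draws):
    """Probability of k successes in `draws` draws without replacement."""
    return comb(successes, k) * comb(total - successes, draws - k) / comb(total, draws)


def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] with fixed margins."""
    total = a + b + c + d
    successes = a + c   # RGA derived fragments over both templates
    draws = a + b       # fragments sequenced for the first template
    upper = min(successes, draws)
    return sum(hypergeom_pmf(k, total, successes, draws) for k in range(a, upper + 1))


# Counts from the text: RGA derived versus not RGA derived.
cdna_rga, cdna_other = 11, 1
gdna_rga, gdna_other = 60, 30

p = fisher_one_sided(cdna_rga, cdna_other, gdna_rga, gdna_other)
print(f"cDNA {cdna_rga / 12:.2f}, gDNA {gdna_rga / 90:.2f}, one-sided p = {p:.3f}")
```

Under these assumptions the p-value stays above 0.05, which is in line with the statement that the apparent difference between templates still has to be verified with equal numbers of fragments.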

References

Altschul S, Madden T, Schäffer A, Zhang J, Zhang Z, Miller W, Lipman D (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl Acids Res 25:3389-3402
Bakker E, Achenbach U, Bakker J, van Vliet J, Peleman J, Segers B, van der Heijden S, van der Linden P, Graveland R, Hutten R, van Eck H, Coppoolse E, van der Vossen E, Bakker J, Goverse A (2004) A high-resolution map of the H1 locus harbouring resistance to the potato cyst nematode Globodera rostochiensis. Theor Appl Genet 109:146-152
Brugmans B, Fernandez del Carmen A, Bachem CWB, van Os H, van Eck HJ, Visser RGF (2002) A novel method for the construction of genome wide transcriptome maps. Plant J 31:211-222
Brugmans B, Hutten RCB, Rookmaker NO, Visser RGF, van Eck HJ (2005) Exploitation of a marker dense linkage map of potato for positional cloning of a wart disease resistance gene. Theor Appl Genet, in press
Calenge F, van der Linden CG, van de Weg E, Schouten HJ, van Arkel G, Denancé C, Durel CE (2005) Resistance gene analogues identified through the NBS-profiling method map close to major genes and QTL for disease resistance in apple. Theor Appl Genet 110:660-668
Dodds PN, Lawrence GJ, Ellis JG (2001) Contrasting modes of evolution acting on the complex N locus for rust resistance in flax. Plant J 27:439-453
Flor H (1971) Current status of the gene-for-gene concept. Annu Rev Phytopathol 9:275-296
Fulton TM, Chunwongse J, Tanksley SD (1995) Microprep protocol for extraction of DNA from tomato and other herbaceous plants. Plant Mol Biol Rep 13:207-209
Gebhardt C, Valkonen JPT (2001) Organization of genes controlling disease resistance in the potato genome. Annu Rev Phytopathol 39:79-102
Goff SA, Ricke D, Lan TH, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H, Hadley D, Hutchison D, Martin C, Katagiri F, Lange BM, Moughamer T, Xia Y, Budworth P, Zhong J, Miguel T, Paszkowski U, Zhang S, Colbert M, Sun WL, Chen L, Cooper B, Park S, Wood TC, Mao L, Quail P, Wing R, Dean R, Yu Y, Zharkikh A, Shen R, Sahasrabudhe S, Thomas A, Cannings R, Gutin A, Pruss D, Reid J, Tavtigian S, Mitchell J, Eldredge G, Scholl T, Miller RM, Bhatnagar S, Adey N, Rubano T, Tusneem N, Robinson R, Feldhaus J, Macalma T, Oliphant A, Briggs S (2002) A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296:92-100
Hayes AJ, Saghai Maroof MA (2000) Targeted resistance gene mapping in soybean using modified AFLPs. Theor Appl Genet 100:1279-1283
Hammond-Kosack KE, Jones JDG (1997) Plant disease resistance genes. Annu Rev Plant Physiol Plant Mol Biol 48:575-607
Huang S, Vleeshouwers VGAA, Werij JS, Hutten RCB, van Eck HJ, Visser RGF, Jacobsen E (2004) The R3 resistance to Phytophthora infestans in potato is conferred by two closely linked R genes with distinct specificities. Mol Plant Microbe Interact 17:428-435
Hunger S, Di Gaspero G, Möhring S, Bellin D, Schafer-Pregel R, Borchardt DC, Durel C, Werber M, Weisshaar B, Salamini F, Schneider K (2002) Isolation and linkage analysis of expressed disease-resistance gene analogues of sugar beet (Beta vulgaris L.). Genome 46:70-82
Isidore E, Van Os H, Andrzejewski S, Bakker J, Barrena I, Bryan GJ, Buntjer J, Caromel B, Van Eck HJ, Ghareeb B, De Jong W, Van Koert P, Lefebvre V, Milbourne D, Ritter E, Rouppe van der Voort JNAM, Rousselle-Bourgeois F, Van Vliet J, Waugh R (2003) Toward a marker-dense meiotic map of the potato genome: lessons from linkage group I. Genetics 165:2107-2116
Kanazin V, Marek LF, Shoemaker RC (1996) Resistance gene analogs are conserved and clustered in soybean. Proc Natl Acad Sci USA 93:11746-11750

Keen NT (1990) Gene-for-gene complementarity in plant-pathogen interactions. Annu Rev Genet 24:447-463
Kuhn DN, Heath M, Wisser RJ, Meerow A, Brown JS, Lopes U, Schnell RJ (2003) Resistance gene homologues in Theobroma cacao as useful genetic markers. Theor Appl Genet 107:191-202
Leister D, Ballvora A, Salamini F, Gebhardt C (1996) A PCR-based approach for isolating pathogen resistance genes from potato with potential for wide application in plants. Nat Genet 14:421-429
Martin GB, Bogdanove AJ, Sessa G (2003) Understanding the function of plant disease resistance proteins. Annu Rev Plant Biol 54:23-61
Meyers BC, Chin DB, Shen KA, Sivaramakrishnan S, Lavelle DO, Zhang Z, Michelmore RW (1998) The major resistance gene cluster in lettuce is highly duplicated and spans several megabases. Plant Cell 10:1817-1832
Meyers BC, Dickerman AW, Michelmore RW, Sivaramakrishnan S, Sobral B, Young ND (1999) Plant disease resistance genes encode members of an ancient and diverse protein family within the nucleotide-binding superfamily. Plant J 20:317-332
Meyers BC, Morgante M, Michelmore RW (2002) TIR-X and TIR-NBS proteins: two new families related to disease resistance TIR-NBS-LRR proteins encoded in Arabidopsis and other plant genomes. Plant J 32:77-92
Meyers BC, Kozik A, Griego A, Kuang H, Michelmore RW (2003) Genome-wide analysis of NBS-LRR-encoding genes in Arabidopsis. Plant Cell 15:809-834
Nombela G, Williamson V, Muniz M (2003) The root-knot nematode resistance gene Mi-1.2 of tomato is responsible for resistance against the whitefly Bemisia tabaci. Mol Plant Microbe Interact 16:645-649
Pan Q, Liu Y, Budai-Hadrian O, Sela M, Carmal-Goren L, Zamir D, Fluhr R (2000) Comparative genetics of nucleotide binding site-leucine rich repeat resistance gene homologues in the genomes of two dicotyledons: tomato and Arabidopsis. Genetics 155:309-322
Raman R, Ash G, Wratten N, Raman H (1999) Comparative analysis of resistance gene analogs in Brassica napus L. Proc. Rapeseed Congress
Ronning CM, Stegalkina SS, Ascenzi RA, Bougri O, Hart AL, Utterbach TR, Vanaken SE, Riedmuller SB, White JA, Cho J, Pertea GM, Lee Y, Karamycheva S, Sultana R, Tsai J, Quackenbush J, Griffiths HM, Restrepo S, Smart CD, Fry WE, van der Hoeven R, Tanksley S, Zhang P, Jin H, Yamamoto ML, Baker BJ, Buell CR (2003) Comparative analyses of potato expressed sequence tag libraries. Plant Physiol 131:419-429
Rouppe van der Voort J, Wolters P, Folkertsma R, Hutten R, Van Zandvoort P, Vinke H, Kanyuka K, Bendahmane A, Jacobsen E, Janssen R, Bakker J (1997) Mapping of the cyst nematode resistance locus Gpa2 in potato using a strategy based on comigrating AFLP markers. Theor Appl Genet 95:874-880
Sambrook J, Fritsch E, Maniatis T (1989) Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
Saraste M, Sibbald PR, Wittinghofer A (1990) P-loop: a common motif in ATP- and GTP-binding proteins. Trends Biochem Sci 15:430-434
Shen KA, Meyers BC, Islam-Faridi MN, Chin DB, Stelly DM, Michelmore RW (1998) Resistance gene candidates identified by PCR with degenerate oligonucleotide primers map to clusters of resistance genes in lettuce. Mol Plant Microbe Interact 11:815-823
Steward CN, Via LE (1993) A rapid CTAB DNA isolation technique useful for RAPD fingerprinting and other PCR applications. Biotechniques 14:748-750

The Arabidopsis Genome Initiative (TAGI) (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408:796-815
Traut TW (1994) The functions and conserved motifs of nine types of peptide segments that form different types of nucleotide-binding sites. Eur J Biochem 229:9-19
van der Linden G, Wouters D, Mihalka V, Kochieva E, Smulders M, Vosman B (2004) Efficient targeting of plant disease resistance loci using NBS profiling. Theor Appl Genet 109:384-393
van der Vossen E, van der Voort J, Kanyuka K, Bendahmane A, Sandbrink H, Baulcombe D, Bakker J, Stiekema W, Klein-Lankhorst R (2000) Homologues of a single resistance-gene cluster in potato confer resistance to distinct pathogens: a virus and a nematode. Plant J 23:567-576
Vision TJ, Brown DG, Shmoys DB, Durrett RT, Tanksley SD (1999) Selective mapping: a strategy for optimizing the construction of high-density linkage maps. Genetics 155:407-420
Vos P, Simons G, Jesse T, Wijbrandi J, Heinen L, Hogers R, Frijters A, Groenendijk J, Diergaarde P, Reijans M, Fierens-Onstenk, de Both M, Peleman J, Liharska T, Hontelez J, Zabeau M (1998) The tomato Mi-1 gene confers resistance to both root-knot nematodes and potato aphids. Nat Biotechnol 16:1365-1369
Yu YG, Buss GR, Saghai Maroof MA (1996) Isolation of a superfamily of candidate disease-resistance genes in soybean based on a conserved nucleotide-binding site. Proc Natl Acad Sci USA 93:11751-11756

Chapter 5

A new and versatile method for the successful conversion of AFLP markers into simple single locus markers

Bart Brugmans, Ron G. M. van der Hulst, Richard G. F. Visser, Pim Lindhout and Herman J. van Eck

Slightly modified from Nucleic Acids Research, 2003, Vol. 31, No. 10, e55

A new and versatile method for the successful conversion of AFLP markers into simple single locus markers

Abstract

Genetic markers can efficiently be obtained by using amplified fragment length polymorphism (AFLP) fingerprinting because no prior information on DNA sequence is required. However, the conversion of AFLP markers from complex fingerprints into simple single locus assays is perceived as problematic because DNA sequence information is required for the design of new locus-specific PCR primers. In addition, single nucleotide polymorphism (SNP) information is required to design an allele-specific assay. This paper describes a new and versatile method for the conversion of AFLP markers into simple assays. The protocol presented in this paper offers solutions for frequently occurring pitfalls and describes a procedure for the identification of the SNP responsible for the AFLP. By following this approach, a high success rate for the conversion of AFLP markers into locus-specific markers was obtained.

Introduction

Amplified fragment length polymorphism (AFLP) is a PCR based multi-locus fingerprinting technique, which efficiently identifies DNA polymorphisms without prior information on the DNA sequence of the organism(s) (Vos et al., 1995). AFLP relies on the selective amplification of a subset of DNA fragments from a more complex template pool that has been generated by ligation of adapters to restriction fragments. The advantages of AFLP are: high reproducibility (Jones et al., 1997), a high PCR multiplex ratio, applicability to genomes of any complexity, the possibility to generate a virtually infinite number of markers and the fact that no prior sequence information is required.
This is shown by the more than 1200 papers in which AFLP technology is used for all kinds of applications, such as genetic diversity analysis, local marker saturation, construction of genetic maps and quantitative trait loci (QTL) mapping in fungi (Majer et al., 1996), insects (Yan et al., 1999), plants (Becker et al., 1995; Meksem et al., 1995; van Eck et al., 1995; Hill et al., 1996; Hongtrakul et al., 1997; Nandi et al., 1997; Simons et al., 1997; Haanstra et al., 1999; Jeuken et al., 2001; Stirling et al., 2001) and animals (Otsen et al., 1996; Ransom and Zon, 1999; Ovilo et al., 2000; Giannasi et al., 2001). AFLP markers are less suitable for single locus assays (e.g. allele frequency studies, marker-assisted selection or map-based cloning). Although AFLP markers can be used for these applications (Liu et al., 1998), many of the AFLP markers in a fingerprint are then redundant, which makes the technique too expensive and too laborious for large-scale single locus screenings. There is therefore a strong need to convert specific AFLP markers into single locus PCR markers, such as cleaved amplified polymorphic site (CAPS) (Konieczny

and Ausubel, 1993) markers or sequence characterised amplified region (SCAR) (Paran and Michelmore, 1993) markers, because these marker techniques are easy to use, less laborious and inexpensive for single locus assays. It is therefore very important to have a reliable and efficient protocol for the conversion of AFLP markers into high throughput single locus PCR markers. However, in contrast to AFLP, which can be applied immediately in any organism, the design of new PCR primers for a locus-specific assay does require information on the DNA sequence of the AFLP band. Preferably, the conversion of AFLP markers also aims to design a marker that can distinguish between different alleles. Often, sequencing of the existing alleles is required to identify allele-specific single nucleotide polymorphisms (SNPs). Although marker conversion seems technically easy, some hurdles need to be overcome. The first hurdle is the extraction of an AFLP fragment from a polyacrylamide gel. Often, these extracts contain multiple fragments, which are the result of co-isolation of background amplification products together with the AFLP fragment of interest (Meksem et al., 2001). The second hurdle is the relatively short size of AFLP bands. The resulting DNA sequence is often too short to optimally design PCR primers, and too short to expect internal polymorphisms that can be used to differentiate between alleles (Bradeen and Simon, 1998; Shan et al., 1999; Wei et al., 1999; Negi et al., 2000). These hurdles severely reduce the efficiency of current protocols, with which only a minority of the AFLPs were successfully converted into a locus- and/or allele-specific assay. Here we present a protocol that integrates various strategies in an optimal order. This step by step protocol guarantees successful marker conversion of virtually every AFLP marker.
Materials and methods

A diploid potato mapping population descending from the cross SH83-92-488 × RH89-039-16 (Rouppe van der Voort et al., 1997) was used to confirm that the genetic map position of the single locus assay was identical to that obtained for the original AFLP marker. Plant DNA was isolated essentially according to Steward and Via (1993), adjusted for 96-well format using 1 ml tubes of Micronics (Micronic BV, Lelystad, The Netherlands). Leaf tissue was ground using a Retsch 300 mm shaker at maximum speed (Retsch BV, Ochten, The Netherlands). Template preparation and AFLP fingerprinting were performed essentially as described in Vos et al. (1995). For conversion of AFLP markers into single locus markers, we chose AFLP fragments of different sizes (between 100 and 400 bp) from template prepared with EcoRI/MseI as well as with PstI/MseI. For the determination of the fourth, fifth and sixth selective nucleotide following the AFLP restriction site (throughout this text referred to as AFLP-mediated mini-sequencing), a generalised set of 12 degenerate primers (Table 1) was used for AFLP fingerprinting, using 100 times diluted +3/+3 pre-amplified product as template and the standard primer concentrations of 0.5 pmol 33P or fluorescently (IRDye 700) labelled EcoRI or PstI primer and 3 pmol MseI primer per reaction.

Table 1: The generalised set of twelve primers for AFLP-mediated mini-sequencing. These primers will provide DNA sequence information of three more bases adjacent to the first three selective nucleotides of the MseI primer.

Primer name   Sequence                   Used for determination of
3N+A          GATGAGTCCTGAGTAA NNNA      the fourth selective nucleotide
3N+C          GATGAGTCCTGAGTAA NNNC      the fourth selective nucleotide
3N+G          GATGAGTCCTGAGTAA NNNG      the fourth selective nucleotide
3N+T          GATGAGTCCTGAGTAA NNNT      the fourth selective nucleotide
4N+A          GATGAGTCCTGAGTAA NNNNA     the fifth selective nucleotide
4N+C          GATGAGTCCTGAGTAA NNNNC     the fifth selective nucleotide
4N+G          GATGAGTCCTGAGTAA NNNNG     the fifth selective nucleotide
4N+T          GATGAGTCCTGAGTAA NNNNT     the fifth selective nucleotide
5N+A          GATGAGTCCTGAGTAA NNNNNA    the sixth selective nucleotide
5N+C          GATGAGTCCTGAGTAA NNNNNC    the sixth selective nucleotide
5N+G          GATGAGTCCTGAGTAA NNNNNG    the sixth selective nucleotide
5N+T          GATGAGTCCTGAGTAA NNNNNT    the sixth selective nucleotide
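The regular structure of the primer set in Table 1 can be written down in a few lines. The sketch below only reproduces the names and sequences listed in the table; the function name `mini_sequencing_primers` is a hypothetical helper, and the space that Table 1 prints between the primer core and the degenerate extension is omitted.

```python
MSEI_CORE = "GATGAGTCCTGAGTAA"  # MseI primer core as printed in Table 1


def mini_sequencing_primers(core=MSEI_CORE, first=4, last=6):
    """Build the degenerate primers that read selective positions first..last.

    A primer for position p carries p - 1 fully degenerate bases (N)
    followed by one defined base, e.g. '3N+A' reads the fourth position.
    """
    primers = {}
    for position in range(first, last + 1):
        for base in "ACGT":
            name = f"{position - 1}N+{base}"
            primers[name] = core + "N" * (position - 1) + base
    return primers


primers = mini_sequencing_primers()
for name, sequence in primers.items():
    print(name, sequence)
```

Because the first three selective positions are degenerate, the same twelve primers can be used for any MseI+3 primer combination, which is what makes the set generalised.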

To excise the 33P-labelled AFLP fragment from an acrylamide gel, an AFLP fingerprint was generated using an EcoRI+3 or PstI+2 primer in combination with the MseI+6 primer identified by AFLP-mediated mini-sequencing. The polyacrylamide gels, dried on Whatman 3MM paper, were overlaid with autoradiogram images. The pieces of gel/paper were transferred to 200 µl of TE and incubated for 1 h. Five microlitres of supernatant was used to re-amplify the fragment, using a PCR in which EcoRI+0 or PstI+0 in combination with MseI+0 were used as primers. In total, 200 ng of the re-amplified AFLP fragment was used for direct sequencing using the appropriate AFLP+0 primer as sequencing primer (BaseClear, Leiden, The Netherlands). As an alternative to radioactivity, samples generated by using fluorescently (IRDye 700) labelled EcoRI+3 or PstI+2 primers combined with fragment-specific MseI+6 primers were analysed on a NEN® Global Edition IR2 DNA Analyzer (LI-COR® Biosciences, Lincoln, NE). After separation, the polyacrylamide gel was scanned on a LI-COR® Biosciences Odyssey® Infrared Imaging System along with a grid pattern to allow careful positioning of bands. Gel plugs containing fragments were excised using a scalpel and successful fragment extraction was verified by re-scanning the gel (Kovar et al., 2002). After excision, gel plugs

were placed in 15 µl of 1x TE and frozen at -80°C for ~30 min, followed by one thawing-refreezing step at -20°C. After thawing, samples were centrifuged for 15 min at 15 000 g and 4 µl was taken for PCR re-amplification using EcoRI+0 or PstI+0 in combination with MseI+0 primers. Fragments were sequenced directly using the same primers as used for re-amplification on a NEN® Global Edition IR2 DNA Analyzer using IRDye 800 v2 Acycloterminators™. The DNA sequence of the excised AFLP band was used to design locus-specific primers. The amplification product obtained with such primers was screened for internal polymorphisms with the restriction enzymes listed in Table 2. After restriction, the fragments were separated on a 3% agarose gel containing ethidium bromide.

Table 2: Set of relatively cheap, frequently cutting restriction enzymes for cost-effective detection of internal polymorphisms that can be used as a CAPS marker.

Restriction enzyme
AciI, AluI, ApoI, BfaI, BsaJI, BssKI, BstUI, DdeI, DpnI, HaeIII, HhaI, HinfI,
HpaII, Hpy188I, HpyCH4III, HpyCH4IV, MnlI, MwoI, NlaIII, NlaIV, RsaI, Sau96I, TaqI, Tsp509I
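The screening of an amplification product with the enzymes of Table 2 can also be done in silico once both allele sequences are known. The following is a minimal sketch that covers only a handful of the listed enzymes with palindromic recognition sites; the two allele sequences are invented for illustration, and the helper names are hypothetical.

```python
import re

# Recognition sites (5'-3') for a few of the enzymes in Table 2.
# N stands for any base; all sites here are palindromic, so scanning
# one strand is sufficient.
SITES = {
    "AluI": "AGCT",
    "HaeIII": "GGCC",
    "HinfI": "GANTC",
    "NlaIII": "CATG",
    "RsaI": "GTAC",
    "TaqI": "TCGA",
    "DdeI": "CTNAG",
    "Sau96I": "GGNCC",
}


def count_sites(sequence, site):
    """Count (possibly overlapping) occurrences of a recognition site."""
    pattern = re.compile("(?=" + site.replace("N", "[ACGT]") + ")")
    return len(pattern.findall(sequence.upper()))


def caps_candidates(allele_a, allele_b, sites=SITES):
    """Enzymes that cut the two alleles a different number of times."""
    return sorted(
        name
        for name, site in sites.items()
        if count_sites(allele_a, site) != count_sites(allele_b, site)
    )


# Invented alleles that differ by a single C/T SNP inside an AluI site.
ALLELE_A = "TTGACAGCTAGGTCCATGCA"
ALLELE_B = "TTGACAGTTAGGTCCATGCA"
print(caps_candidates(ALLELE_A, ALLELE_B))
```

An enzyme is only reported when the number of sites differs between the alleles; sites shared by both alleles (NlaIII and Sau96I in this toy example) do not give a CAPS marker.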

DNA adjacent to the AFLP fragment was obtained by anchor PCR using the Genome Walker Kit (Clontech, Palo Alto, CA) based on the method of Siebert et al. (1995). Ten restriction enzymes (AatI, AluI, DpnI, DraI, HincII, PvuII, SmaI, XmnI, AseI, MseI) were individually used for restriction and ligation of the Genome Walker adaptors. PCRs with an internal primer (based on the sequence of the AFLP fragment) in combination with the Genome Walker adapter primer were performed according to the standard AFLP protocol on 10 times diluted restriction-ligation mix. Nested PCRs using a second internal primer (based upon the sequence of the AFLP fragment) and a second adapter primer were performed on 100 times diluted amplification product of the first PCR. Five microlitres of the final PCR product was checked on a 3% agarose gel to verify that a unique fragment was obtained. The remaining 45 µl of PCR product was used for sequencing

(BaseClear). The flanking sequence information was used to design a primer that amplified a fragment in combination with the internal primer of the AFLP fragment. The SNP that originally caused the AFLP was included in this fragment and used for the development of a CAPS or dCAPS marker. Single locus PCRs for CAPS or dCAPS were performed using 5 µl of DNA (10 ng/µl), 0.6 µl of each primer (50 ng/µl), 0.8 µl of dNTPs (5 mM) and 0.08 µl of Taq polymerase (5 U/µl) in a total volume of 20 µl.

Results and discussion

A flowchart of the procedure to convert AFLP markers into simple single locus PCR assays, or allele-specific PCR markers, is shown in Figure 1. The protocol comprises the following steps: (step 1) AFLP-mediated mini-sequencing, (step 2) re-amplification of the AFLP fragment in a less complex fingerprint and excision of the AFLP fragment, (step 3) direct sequencing of the excised and re-amplified AFLP fragment, (step 4) design of internal locus-specific primers, (step 5) screening for additional internal polymorphic sites, (step 6) identification of the SNP that originally caused the AFLP, (step 7) identification of flanking DNA and (step 8) exploitation of the SNP that caused the AFLP in a CAPS or dCAPS marker.

Table 3: Ten selected AFLP marker conversions with their extra selective nucleotides for the MseI primer, the restriction enzymes that provided a CAPS marker with the internal fragment and the site in which the SNP that caused the AFLP was present.

Fragment and fragment length (bp)   Fourth, fifth and sixth selective nucleotide for the MseI primer   CAPS enzyme of internal fragment   Place of the SNP that caused the AFLP (with the exact SNP for the three elongated fragments)
E-ACT/M-CAG-287                     GTC                                                                *                                  EcoRI restriction site, GAA(G/T)TC
E-AGA/M-CAG-188                     AAG                                                                MnlI                               MseI restriction site
E-AGA/M-CCT-131                     AAA                                                                *                                  EcoRI restriction site, GAATT(C/T)
E-ATC/M-CAC-251                     AAA                                                                ApoI/MnlI                          MseI selective nucleotides
E-ATG/M-CTT-239                     TTT                                                                HinfI                              MseI restriction site
P-AC/M-ATA-320                      AAT                                                                Sau96I/NlaIV                       PstI restriction site
P-AG/M-ATG-359                      AAG                                                                NlaIII                             PstI selective nucleotides
P-AT/M-AGC-326                      CAA                                                                NlaIII/MnlI                        PstI selective nucleotides
P-TG/M-AGA-198                      CCA                                                                *                                  MseI selective nucleotides, (A/T)GA
P-TG/M-AGT-321                      GTA                                                                HpyCH4IV                           PstI restriction site

* = no restriction enzyme from Table 2 provided a CAPS marker
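The SNP notation used in the last column of Table 3 can be expanded to see which allele retains the sequence required for AFLP amplification. This is a small sketch under the assumption that the notation always contains exactly one bracketed pair of bases; the function names are hypothetical, and the EcoRI recognition sequence GAATTC is the only external fact used.

```python
import re


def expand_snp(notation):
    """Turn a notation such as 'GAA(G/T)TC' into its two allele strings."""
    match = re.fullmatch(r"([ACGT]*)\(([ACGT])/([ACGT])\)([ACGT]*)", notation)
    if match is None:
        raise ValueError(f"unsupported SNP notation: {notation!r}")
    left, first, second, right = match.groups()
    return [left + first + right, left + second + right]


def alleles_matching(notation, target):
    """Alleles that are identical to the target site or selective bases."""
    return [allele for allele in expand_snp(notation) if allele == target]


ECORI_SITE = "GAATTC"

print(expand_snp("GAA(G/T)TC"))
print(alleles_matching("GAA(G/T)TC", ECORI_SITE))   # E-ACT/M-CAG-287
print(alleles_matching("GAATT(C/T)", ECORI_SITE))   # E-AGA/M-CCT-131
print(alleles_matching("(A/T)GA", "AGA"))           # P-TG/M-AGA-198, M-AGA primer
```

In each of the three elongated fragments only one of the two alleles matches the restriction site or the selective nucleotides, which is consistent with a present/absent AFLP band.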

In total, 10 randomly chosen AFLP markers (Table 3), containing relatively small (131 bp) and large (359 bp) fragments were converted into simple PCR markers to demonstrate the universal applicability of the procedure.

Figure 1: Overview of the steps of the protocol to convert any AFLP marker into a single locus PCR-based marker assay (our flowchart).

AFLP-mediated mini-sequencing

The first step of the protocol aims to determine the fourth, fifth and sixth selective nucleotides adjacent to the MseI primer, of which the first three selective nucleotides are known. A generalised set of 12 degenerate primers (Table 1) was used to analyse the adjacent nucleotides for all our fragments. A typical image obtained with the mini-sequencing primers is shown in Figure 2. For all 10 selected AFLP fragments, the next three selective nucleotides could be determined unambiguously; they are listed in Table 3. Remarkably, the total amount of degenerate primer needed for this technique did not appear to be very critical and may vary between 3 and 10 pmol per 10 µl of PCR volume. It is more important that the EcoRI+3/MseI+3 or PstI+2/MseI+3 pre-amplified product mixture used as template is sufficiently diluted (>100 times). Otherwise, unused MseI+3 primer carried over from the previous PCR disturbs the reaction and the fingerprints resemble the original +3/+3 amplification pattern (data not shown).

Figure 2: Products obtained after PCR using the twelve degenerate MseI primers (listed in Table 1) on 100 times diluted EcoRI+3/MseI+3 template, loaded on a polyacrylamide gel. The first four lanes allow determination of the fourth selective nucleotide (fig 2-I), the second four lanes of the fifth selective nucleotide (fig 2-II) and the third four lanes of the sixth selective nucleotide (fig 2-III). As a control, the EcoRI+3/MseI+3 PCR product from both parents was loaded on the gel (fig 2-IV). For example, the fourth, fifth and sixth selective nucleotides are C-A-A for fragment 1 and T-G-C for fragment 2.
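Reading the three extra nucleotides from a mini-sequencing gel amounts to finding, within each group of four lanes, the single lane in which the band of interest is present. The sketch below encodes that rule; the lane names follow Table 1, while the function name `read_extension` and the dictionary of band scores are hypothetical.

```python
def read_extension(band_present):
    """Infer selective nucleotides 4-6 from band presence per lane.

    band_present maps lane names such as '3N+C' to True or False for
    the band of interest; lanes that are not listed count as absent.
    """
    extension = []
    for degenerate in (3, 4, 5):
        bases = [
            base
            for base in "ACGT"
            if band_present.get(f"{degenerate}N+{base}", False)
        ]
        if len(bases) != 1:
            # No band, or a band in several lanes: the position is unresolved.
            raise ValueError(f"position {degenerate + 1} is ambiguous: {bases}")
        extension.append(bases[0])
    return "".join(extension)


# Fragment 1 of Figure 2: band present in lanes 3N+C, 4N+A and 5N+A.
fragment_1 = {"3N+C": True, "4N+A": True, "5N+A": True}
print(read_extension(fragment_1))
```

Raising an error for an ambiguous position is a design choice of this sketch; it mirrors the requirement that the nucleotides be determined unambiguously before an MseI+6 primer is ordered.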

Re-amplification of the AFLP fragment in a less complex fingerprint and excision of the AFLP fragment

Information on the fourth, fifth and sixth selective nucleotides allowed PCR amplification using MseI+6 primers. A new AFLP fingerprint of much lower complexity was generated with the MseI+6 primers, using diluted +3/+3 AFLP amplification products as template. Two such fingerprints are shown in Figure 3. In rare cases, background products from the +3/+3 AFLP fingerprint appear as very clear bands in the +3/+6 AFLP fingerprint. These fragments did not influence the extraction of the desired fragment from the gel. Compared to a standard (+3/+3) AFLP, the intensity of the target AFLP fragment and of the few other remaining AFLP fragments is much higher, and co-migrating fragments are almost always absent. This increases the probability of successful excision of the desired fragments from dried gels.
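The reduction in fingerprint complexity obtained with longer primers can be estimated with simple arithmetic. The sketch assumes, as a simplification that is not made in the original text, that all four bases are equally likely at every selective position, so that each extra selective nucleotide retains on average one quarter of the fragments; the function names are hypothetical.

```python
def expected_fraction(selective_rare, selective_msei):
    """Expected fraction of restriction fragments amplified.

    Assumes each selective nucleotide matches one in four fragments.
    """
    return 0.25 ** (selective_rare + selective_msei)


def fold_reduction(before, after):
    """How many times fewer fragments are expected after extending the primers."""
    return expected_fraction(*before) / expected_fraction(*after)


print(fold_reduction((3, 3), (3, 6)))  # EcoRI+3/MseI+3 versus EcoRI+3/MseI+6
print(fold_reduction((2, 3), (2, 6)))  # PstI+2/MseI+3 versus PstI+2/MseI+6
print(fold_reduction((3, 3), (6, 6)))  # extension on both primers
```

Under this assumption three extra selective nucleotides on the MseI primer reduce the expected number of bands 64-fold, and three more on the rare cutter primer reduce it by the same factor again.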

DNA that dissolved from the pieces of gel/paper into TE buffer was re-amplified with AFLP primers without selective nucleotides using 5 µl from this solution. For all 10 AFLP fragments large amounts of re-amplified DNA could be obtained.

Figure 3: Reduction of the AFLP fingerprint complexity by application of the “MseI+6-primer” to allow the isolation of an AFLP band by excision from gel without the co-isolation of contaminating DNA fragments. In this image the EcoRI+3/MseI+3 amplification products of both parental genotypes (part3-I) are compared with their respective EcoRI+3/MseI+6 amplification products (part 3-II and part 3-III).

Extra selective nucleotides can also be added at the rare cutter site to further reduce fingerprint complexity, for example in very dense fingerprints or if two fragments with a very small size difference appear to have the same six selective nucleotides at the MseI site. For these special cases the fourth, fifth and sixth selective nucleotides for the EcoRI site (or the third, fourth and fifth for the PstI site) can be determined by using a set of degenerate EcoRI (or PstI) primers and labelled MseI primers. In case fluorescently (e.g. IRDye 700) labelled MseI primers need to be ordered for this latter step, it is interesting to note that labelled MseI+2 primers work equally well as labelled MseI+3 primers (results not shown). This will considerably reduce the ordering costs of labelled primers when AFLP fragments of different primer combinations need to be converted. The fragment can subsequently be excised from a fingerprint generated with an MseI+6 and an EcoRI+6 or PstI+5 primer.

Direct sequencing of the excised and re-amplified AFLP fragment

DNA of the excised and re-amplified AFLP fragment was sequenced from both ends using the corresponding core primers without selective nucleotides as sequencing primers. Usually, DNA fragments are cloned into Escherichia coli, confirmed by PCR or restriction

analysis and sequenced (Bradeen and Simon, 1998; Shan et al., 1999; Reamon-Büttner and Jung, 2000). However, the drawback of this method is that other, non-specific co-migrating DNA fragments may be cloned and sequenced. When the number of sequenced clones is low or the cloning efficiency of a particular AFLP fragment is low, co-isolated fragments may outnumber the correct fragment and hamper the determination of the right sequence (Meksem et al., 2001). In contrast, direct sequencing of a fragment allows the level of putative impurities in the PCR product to be determined. This can be inferred from the trace file (peak pattern) of the DNA sequencer, where clear peaks should be prominent against little background. Out of the 10 double-stranded sequences that were obtained, eight had no ambiguities and two had
