
RESEARCH

Genetic Diversity and Population Structure in a European Collection of Rice Brigitte Courtois,* Julien Frouin, Raffaella Greco, Gianluca Bruschi, Gaëtan Droc, Chantal Hamelin, Manuel Ruiz, Guy Clément, Jean-Charles Evrard, Sylvie van Coppenole, Dimitrios Katsantonis, Margarida Oliveira, Sónia Negrão, Celina Matos, Stefano Cavigiolo, Elisabetta Lupotto, Pietro Piffanelli, and Nourollah Ahmadi

ABSTRACT

In southern Europe, rice (Oryza sativa L.) is grown as an irrigated crop in river deltas where it plays an important role in the agroecological equilibrium through soil desalinization. In these regions, rice is at the northern limit of its natural area of adaptation. Special cultivars are needed for these challenging conditions. Using model-based and distance-based approaches, we analyzed the genetic structure of the European Rice Germplasm Collection (ERGC), which is composed of 425 accessions, using 25 simple sequence repeat (SSR) markers. We compared it with a reference set of 50 accessions that are representative of the diversity of O. sativa. Most of the ERGC accessions (89%) clustered with japonica types. The ERGC japonica accessions were classified into three groups: one group close to rice types of tropical origin that are found in the United States and Argentina and two groups of temperate origin showing less differentiation. The three japonica groups could be characterized according to their grain type and maturity class, which are the most strongly selected traits in European breeding programs. We extracted a core collection of 250 japonica accessions and characterized it using 70 single nucleotide polymorphisms (SNPs). The SSR and SNP dissimilarity matrices coincided reasonably well and, for the best-supported structure, the percentages of admixture were highly correlated. The core collection can be used as an association panel to search for alleles of interest for temperate areas or as a training population for genomic selection.

B. Courtois, J. Frouin, G. Droc, C. Hamelin, M. Ruiz, G. Clément, J.-C. Evrard, S. van Coppenole, and N. Ahmadi, Cirad, UMR AGAP, 34098 Montpellier, France; R. Greco and P. Piffanelli, Parco Tecnologico Padano, 26900 Lodi, Italy; G. Bruschi, S. Cavigiolo, and E. Lupotto, CRA-RIS, 13100 Vercelli, Italy; D. Katsantonis, NAGREF, 57001, Thermi-Thessaloniki, Greece; M. Oliveira and S. Negrão, ITQB and IBET, Av. da República, 2780-157 Oeiras, Portugal; C. Matos, INIAS, 2784-505 Oeiras, Portugal. Received 15 Jan. 2012. *Corresponding author ([email protected])

Abbreviations: AMOVA, analysis of molecular variance; CIRAD, Centre de Coopération Internationale en Recherche Agronomique pour le Développement; CRB-T, Tropical Biological Resource Center of Montpellier; ERGC, European Rice Germplasm Collection; EURIGEN, genotyping for the conservation and valorization of European rice germplasm; FST, Wright’s fixation index; GB, gene bank; NJ, neighbor-joining; LD, linkage disequilibrium; PCR, polymerase chain reaction; PIC, polymorphism information content; QTL, quantitative trait loci; RESGEN, rice genetic resources for Europe; SNP, single nucleotide polymorphism; SSR, simple sequence repeat.

CROP SCIENCE, VOL. 52, JULY–AUGUST 2012

Published in Crop Sci. 52:1663–1675 (2012). doi: 10.2135/cropsci2011.11.0588 © Crop Science Society of America | 5585 Guilford Rd., Madison, WI 53711 USA All rights reserved. No part of this periodical may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Permission for printing and for reprinting the material contained herein has been obtained by the publisher.

In Europe, rice (Oryza sativa L.) has a relatively recent history. This crop was probably introduced into Greece by returning members of expeditions to India as early as 340 to 320 BCE and spread gradually to neighboring countries, notably to Sicily. Later, travelers from Arab regions and Spain brought rice to Portugal and to mainland Italy, but it was only in the 15th century that it became an established crop in the region (IRRI, 2002; Spada et al., 2004). Rice cultivation presently occupies approximately 460,000 ha in European Union countries, with a total production of 2.9 million t reported in 2009 (http://faostat.fao.org [accessed 3 Dec. 2011]). The main producers are Italy (238,500 ha) and Spain (119,300 ha) and, to a lesser extent, Greece (29,000 ha), Portugal (27,900 ha), France (24,200 ha), Bulgaria (13,300 ha), Romania (8300 ha), and Hungary (2700 ha), with an average regional yield of 6.6 t ha−1. Although total and per capita rice consumption are limited in Europe (4.8 kg per capita yr−1), these quantities are steadily increasing and rice quality is diversifying.

All of the rice growing areas in Europe are irrigated. Rice is grown mostly in river beds or deltas (the Pô River in Italy, the Ebro in Spain, the Axios in Greece, the Mondego, Tejo, Sorraia, and Sado in Portugal, and the Rhône in France), which are ecologically fragile areas where it plays an important environmental role by limiting problems associated with soil salinity via flooding of fields. Because rice in Europe is at the northern limit of its natural cultivation area, it suffers from the short growing season (May to September), the long days during most of the growing season, and the low temperatures at both extremities of the cycle. Well-adapted rice cultivars are needed to overcome these natural constraints.

The Asian cultivated species Oryza sativa L. has a clear genetic structure including two major subgroups, indica and japonica, which probably originated from two independent domestication events, and two minor subgroups, aus–boro and sadri–basmati, the origins of which are unclear (Glaszmann, 1987; Garris et al., 2005; Zhao et al., 2010). These genetic groups are characterized by specific agroecological adaptations. The indica genotypes are tropical rice cultivars that are grown in lowland conditions whereas the japonica genotypes can be either tropical cultivars that are adapted to rainfed upland conditions or temperate cultivars that are adapted to lowland conditions. The cultivars grown in Europe are primarily but not exclusively temperate japonica types.
In the framework of two European projects (RESGEN [rice genetic resources for Europe], 1999–2002, and EURIGEN [genotyping for the conservation and valorization of European rice germplasm], 2006–2010), breeders in public breeding institutions decided to pool their working collections and to characterize the resulting European Rice Germplasm Collection (ERGC), which contains approximately 450 accessions, on phenotypic and molecular bases. Several of the entries in the collection originate from Asia or from other temperate areas such as the Americas. However, most of the collection is composed of cultivars that were created by European breeding programs. These cultivars are original genetic combinations that are potentially useful for the Mediterranean and other temperate areas. European breeding programs target a short growth duration, salinity and cold tolerance at critical stages, blast resistance, and adaptation to water saving strategies as the main breeding criteria. In addition to adaptation to local agroecological conditions, rice cultivars need to exhibit grain characteristics that match the demand of the European market. Rice grains are classified into four categories based on grain length and on their length-to-width ratio (round, medium, long A, and long B). Consumer demand in Europe is increasingly orientated toward exotic grain types (e.g., extra-long grain, aromatic types) that compete with local types and for which Europe is a net importer. Processors require high milling yields.

Our understanding of the genetic bases of important agronomic traits in rice has increased considerably through the use of mapping populations for quantitative trait loci (QTL) detection (reviewed by Xu, 2002, and Yamamoto et al., 2009; more than 8600 rice QTL are listed in the Gramene database [http://www.gramene.org/qtl]). However, association studies provide a better resolution without the need to develop mapping populations and are increasingly used for gene and allele discovery (Zhu et al., 2008; Huang et al., 2010, 2011; Zhao et al., 2011). The ERGC provided a good starting point to define a rice core collection that could be useful for Europe in association mapping studies and in defining marker-assisted breeding strategies. This type of collection can also be useful for genomic selection (Meuwissen et al., 2001; Heffner et al., 2009), in which the calibration step uses a training population that is representative of the material handled in breeding programs. Similar attempts were recently undertaken to define core collections from national collections, such as those of Japan (Ebana et al., 2008) and China (Jin et al., 2010; Zhang et al., 2011a, b), or from international collections, such as the USDA collection in the United States (Yan et al., 2007; Agrama et al., 2010, 2011; Ali et al., 2011). The size of a core collection will differ depending on whether the goal of future users is strictly germplasm conservation or is more focused on usage for allele mining (Jackson et al., 1999).
Our aim was primarily to develop a collection that would be useful for genetic and breeding studies while minimizing the presence of closely related entries. This goal justified the maintenance of a large collection. The selection of accessions that makes up the core collection is critical and requires a good understanding of the organization of the genetic diversity of the collection. This knowledge is required for all modern breeding activities, especially to achieve the best management and exploitation of genetic resources and to guide breeding plans. This knowledge is particularly important when the genetic architecture of traits is studied using association mapping, in which the population structure has to be taken into account to limit the number of false positive associations (Pritchard et al., 2000b). Because little reliable information is available regarding the pedigrees of the European rice accessions, molecular markers are the tool of choice to analyze their population structure.

Simple sequence repeat (SSR) markers have been widely used in germplasm evaluation because of their high level of polymorphism, which enables a finely detailed view of the relationships among accessions to be obtained with a small number of markers (McCouch et al., 1997). Subsets of temperate japonica rice cultivars from Italy (Spada et al., 2004; Mantegazza et al., 2008; Faivre-Rampant et al., 2011), Portugal (Jayamani et al., 2007), Argentina (Giarrocco et al., 2007), Japan (Ebana et al., 2008), and the United States (Lu et al., 2005) have been characterized using SSR markers but no study has yet encompassed the overall European diversity.

With the sequencing of several rice genomes and the possibility of revealing single nucleotide polymorphism (SNP) markers en masse, SNPs are gaining importance in diversity studies (McNally et al., 2009; Zhao et al., 2011). The primary advantages of these markers are that they occur in genomes at a much higher frequency than SSRs, with close to one SNP being observed per 140 bp in rice (IRGSP, 2005), and that they can be genotyped in high throughput systems with a high multiplex ratio. The drawback of SNPs is that they are mostly biallelic and, therefore, individually less discriminant. The polymorphisms of SSRs and SNPs are generated via different mechanisms (replication slippage for SSRs vs. point mutation for SNPs) and the two marker types can therefore provide different views of the structure of a given population. It is worthwhile to compare these patterns based on real data, as has been done in maize (Zea mays L.), using both historical and commercial material (Hamblin et al., 2007; Van Inghelandt et al., 2010), as well as in peanut (Arachis hypogaea L.) (Varshney et al., 2010).
The objectives of this study were to characterize the ERGC accessions using SSR and SNP markers, to analyze the extent and structure of their diversity, to compare the results obtained using the two types of markers, and to extract a representative collection for future genetic and genomic studies, particularly for association mapping purposes.

MATERIALS AND METHODS

Material

The 425 accessions of the ERGC that were genotyped in this study are listed in Supplemental Table S1. These accessions originated from 27 countries, with 68% arising from European countries (primarily Italy, Spain, Greece, Portugal, and France as well as several cultivars from Bulgaria, Hungary, Romania, Turkey, and Russia). A set of 50 additional O. sativa accessions (the mini-core) was genotyped at the same time (Supplemental Table S2). These accessions are representative of the four main rice varietal groups (indica, japonica, aus–boro, and sadri–basmati) and were randomly extracted from a worldwide core collection of 288 accessions developed at the International Rice Research Institute (IRRI) and known as mini-GB (mini gene bank) (Glaszmann, 1987; Glaszmann et al., 1995). Mini-GB has been included in part or in full in several of the large-scale studies conducted by IRRI (K. McNally, personal communication, 2007) and Cornell University (Garris et al., 2005; Zhao et al., 2010) to investigate rice diversity. Therefore, the mini-core constitutes an appropriate reference set to position the ERGC accessions with respect to species diversity. The mini-core included 19 indica, 19 japonica (18 tropical and 1 temperate), eight aus–boro, and four sadri–basmati accessions. An accession of Oryza glaberrima Steud. (IRGC [International Rice Germplasm Center] 104041) was included as an outgroup to root the resultant trees.

Seeds of all the accessions were obtained from the Tropical Biological Resource Center of Montpellier (CRB-T) (Centre de Coopération Internationale en Recherche Agronomique pour le Développement [CIRAD]), France (http://golo.cirad.fr/). The purity of the accessions kept by the CRB-T is maintained using the panicle row system during multiplication and its seed laboratory is ISO9001 certified (NF S96-900 [AFNOR, 2011]).

Phenotypic Data

Phenotypic data were derived from experiments conducted in the framework of the RESGEN (Feyt et al., 2001) and EURIGEN projects and are available in the EURIGEN database (http://eurigendb.cirad.fr). The data for each accession are presented in Supplemental Table S1. The ERGC accessions were classified according to their maturity class (early, medium, medium late, or late) and grain type. In Europe, the grain format is defined for the grain without its outer hull (i.e., the cargo grain) as round (cargo grain length < 5.2 mm), medium (5.2 mm ≤ cargo grain length < 6.0 mm), long A (cargo grain length ≥ 6.0 mm and cargo grain length-to-width ratio ≥ 3.0), or long B (cargo grain length ≥ 6.0 mm and cargo grain length-to-width ratio < 3.0).

Genotyping

The DNA from one plant per accession was extracted using an automated Tecan Freedom Evo 150 workstation (Tecan Group Ltd.) equipped with a liquid handler and the Promega-Wizard Magnetic 96 DNA purification kit (Promega Corporation) at the Rice Genomics Unit of Parco Tecnologico Padano, Lodi, Italy.

A set of 25 SSRs distributed among the 12 rice chromosomes (Table 1) was genotyped in the 475 accessions. These markers were chosen from the 48 markers used by the Generation Challenge Program (http://www.generationcp.org/) to genotype 3000 accessions to enable comparison across data sets. Genotyping was performed according to the protocol of Roy et al. (1996) and implemented with the automated infrared fluorescence technology of LICOR 3200 sequencers (LICOR Biosciences) using the genotyping and robotics platform at CIRAD, Montpellier, France. The primer sequences were retrieved from the Gramene database (http://www.gramene.org/markers). For a given SSR locus, the forward primer was designed with a 5′-end M13 tail (5′-CACGACGTTGTAAAACGAC-3′). The polymerase chain reaction (PCR) amplifications were performed in an Eppendorf Mastercycler using 10 ng of DNA in a 10 μL final volume containing buffer (10 mM Tris-HCl [pH 8], 100 mM KCl, 0.05% w/v gelatin, and 2.0 mM MgCl2), 0.1 μM of the M13-tailed primer, 0.1 μM of the other primer, 160 μM deoxyribonucleotide triphosphates (dNTPs), 1 U of Taq DNA polymerase, and 0.1 μM of M13 primer-fluorescent dye IR700 or IR800 (Eurofins MWG Operon). The PCR program included an initial denaturation step at 95°C for 4 min and then 35 cycles at 94°C for 1 min, melting temperature for 1 min, and 72°C for 1 min, and a final elongation step at 72°C for 8 min. The obtained IR700- or IR800-labeled PCR products were diluted sevenfold or fivefold, respectively, subjected to electrophoresis in a 6.5% polyacrylamide gel, and sized by the infrared fluorescence scanning system of the sequencer. Allele calling was performed twice by two different operators based on five DNA pools of known allele sizes that were included in each gel and used as standards.

A set of 90 SNPs was extracted from the Oryza SNP database (http://www.oryzasnp.org/cgi-bin/gbrowse/osa_snp_irgsp) based on (i) their polymorphism among the seven tropical and temperate japonica accessions and (ii) a minimum distance of 0.4 Mb between SNPs to limit the risks of linkage disequilibrium between the markers (Supplemental Table S3). The selected SNPs were genotyped at the Rice Genomics Unit of Parco Tecnologico Padano in Lodi, Italy, using an Illumina Veracode assay (Illumina, Inc.) in an association panel of 250 ERGC accessions that were selected from the total of 425 (see below). Appropriate controls (duplicate accessions and artificial heterozygotes) were included in the plates. Allele calling was performed with BeadStudio software (Illumina Inc., 2005) and was manually checked for each SNP. The data were then coded as modalities and the rare heterozygotes were replaced by missing data.

Table 1. Characteristics of the 25 simple sequence repeat (SSR) loci including their repeat motif, number of alleles per locus, and polymorphism information content (PIC) in three sets of accessions (mini-core, European Rice Germplasm Collection [ERGC] accessions, and japonica accessions of the ERGC)

                                   Mini-core                  ERGC accessions         Japonica accessions of the ERGC
Marker   Chr.†  SSR motif  No. acc.‡  No. alleles   PIC  No. acc.‡  No. alleles   PIC  No. acc.‡  No. alleles   PIC
RM1          1  (GA)26            50           11  0.86        415           10  0.71        374            8  0.66
RM5          1  (GA)14            50           10  0.84        422            9  0.47        378            8  0.46
RM237        1  (CT)18            50            8  0.71        425           10  0.57        380            8  0.50
RM431        1  (AG)16            50            9  0.71        423            6  0.42        378            4  0.33
RM154        2  (GA)21            50           13  0.90        423           11  0.73        378           10  0.71
RM452        2  (GTC)9            50            5  0.57        421            4  0.21        376            3  0.07
RM338        3  (CTT)6            50            3  0.43        416            3  0.27        373            3  0.27
RM514        3  (AC)12            50            7  0.66        425           10  0.36        380            7  0.29
RM14643      3  (GA)21            50            7  0.59        425            7  0.16        380            5  0.05
RM124        4  (TC)10            50            3  0.55        424            3  0.20        379            3  0.04
RM307        4  Complex           49            9  0.75        424            8  0.50        379            6  0.41
RM538        5  (GA)14            50           10  0.73        418            6  0.47        373            3  0.39
RM510        6  (GA)15            50            7  0.65        425            6  0.46        381            5  0.38
RM11         7  (GA)17            50            9  0.84        425            9  0.56        381            7  0.47
RM25         8  (GA)18            50            7  0.66        419           10  0.44        374            8  0.33
RM44         8  (GA)16            50           10  0.79        425           13  0.78        380           13  0.75
RM447        8  (CTT)8            50            7  0.65        425            6  0.59        381            5  0.52
RM215        9  (CT)16            50            7  0.74        424            6  0.64        379            5  0.59
RM316        9  Complex           50            7  0.73        416            6  0.66        371            6  0.58
RM271       10  (GA)15            49            7  0.71        423            8  0.28        378            6  0.12
RM474       10  (AT)13            47           14  0.84        420           10  0.75        378            8  0.70
RM484       10  (AT)9             50            3  0.40        420            4  0.44        375            3  0.35
RM287       11  (GA)21            50            9  0.71        425            9  0.52        380            8  0.42
RM19        12  (ATC)10           50            9  0.79        425            7  0.38        380            3  0.28
RM1227      12  (AG)15            50            7  0.76        424            8  0.75        379            8  0.70
Total                                         198                           189                           153
Average                                       7.9  0.70                     7.6  0.49                     6.1  0.42

† Chr., chromosome.
‡ No. acc., number of accessions assayed.
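As a rough illustration of the PIC values reported in Table 1, the sketch below computes PIC for one locus from a list of allele calls. It assumes the standard definition of Botstein et al. (1980), PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², which is believed (but not confirmed here) to be the one applied by PowerMarker. The function names and the coding of missing data as None are hypothetical choices made for the example.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(alleles):
    """Return {allele: frequency}, ignoring missing calls coded as None."""
    observed = [a for a in alleles if a is not None]
    if not observed:
        raise ValueError("no non-missing allele calls at this locus")
    counts = Counter(observed)
    total = len(observed)
    return {allele: count / total for allele, count in counts.items()}


def pic(alleles):
    """Polymorphism information content of one locus (Botstein et al., 1980).

    Assumes one allele call per accession (haploid coding), which matches
    the way heterozygotes are replaced by missing data in the text.
    """
    freqs = list(allele_frequencies(alleles).values())
    sum_sq = sum(p * p for p in freqs)
    # Correction term summed over all unordered pairs of distinct alleles.
    pair_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - sum_sq - pair_term
```

For a locus with two equally frequent alleles, for example `pic([120, 120, 124, 124])`, the three terms are 1, 0.5, and 0.125, so the function returns 0.375; a monomorphic locus gives 0.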

Data Analyses

For the different sets of accessions, the allele number, allele frequencies, and polymorphism information content (PIC) of each marker were computed using PowerMarker version 3.25 (Liu and Muse, 2005). We used both a model-based and a distance-based approach to assess the genetic structure.

For the model-based approach, the number of subpopulations, K, in the different accession sets was first assessed using Structure v2.3 software (Pritchard et al., 2000a). For all sets, the program was run with the following parameters: haploid data (which is more appropriate for a highly autogamous species of known phase that seldom reaches Hardy-Weinberg equilibrium), the possibility of admixture, and correlated allelic frequencies. Each run consisted of a burn-in of 150,000 steps followed by 150,000 iterations. We performed 10 runs per K value, with K varying from 1 to 8. An accession was discretely classified into a subpopulation when more than 80% of its composition came from that population. Otherwise, it was classified as admixed. To determine the most probable K value, we plotted the mean estimate across runs of the log posterior probability of the data for a given value of K, designated L(K), and chose the K value at which the distribution of L(K) leveled off or continued to increase but much more slowly than before. We also used ΔK, an ad hoc quantity proposed by Evanno et al. (2005) that is related to the second order rate of change of the likelihood function with respect to K. This quantity has been shown to be a good predictor of the real number of subpopulations based on simulations. Even with this criterion, determination of the number of subpopulations is notoriously difficult. Therefore, we examined the evolution of population structure at increasing values of K, up to the level determined by ΔK, rather than only focusing on the final value.

The hierarchical distribution of the molecular variance within and between subpopulations defined by Structure was assessed via analysis of molecular variance (AMOVA) using Arlequin (Excoffier et al., 2006). The pairwise Wright’s fixation index (FST) values, which evaluate the genetic differentiation between these populations (Wright, 1978), were computed using the same software with 1000 permutations to determine their significance.

To apply a distance-based approach, a dissimilarity matrix was computed using a shared allele index with DARwin software (Perrier and Jacquemoud-Collet, 2006). An unweighted neighbor-joining (NJ) tree was constructed based on this dissimilarity matrix. The O. glaberrima outgroup was grafted onto the tree to root it but did not contribute to the construction of the tree. To assess the robustness of the dissimilarity estimates based on 25 SSR markers, we followed a method that has been used in maize by Pejic et al. (1998) and Van Inghelandt et al. (2010) and in rice by Zhang et al. (2011b). We performed 100 bootstrap replications for the markers, calculated the coefficient of variation (CV) of the dissimilarity for each pair of accessions across the 100 repetitions, and determined the mean CV across all of the pairs.
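The ΔK criterion of Evanno et al. (2005) mentioned above can be illustrated with a short sketch. It assumes that the log probabilities L(K) of the replicate Structure runs have already been collected into a dictionary keyed by K; the function name and the input layout are hypothetical. ΔK is computed here as |L(K+1) − 2L(K) + L(K−1)| on the run means, divided by the standard deviation of L(K) across runs, which is one common reading of the published formula and not necessarily the exact computation used in this study.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    log_likelihoods maps each K to the list of L(K) values from replicate
    runs (at least two runs per K, so that a standard deviation exists).
    """
    ks = sorted(log_likelihoods)
    mean_l = {k: mean(log_likelihoods[k]) for k in ks}
    sd_l = {k: stdev(log_likelihoods[k]) for k in ks}

    result = {}
    for k in ks:
        if k - 1 not in mean_l or k + 1 not in mean_l:
            continue  # delta K is undefined at the ends of the tested range
        second_difference = abs(mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1])
        # Identical replicate runs give a zero SD; report infinity in that case.
        result[k] = second_difference / sd_l[k] if sd_l[k] > 0 else float("inf")
    return result
```

With runs for K = 1 to 8, as in the text, the function returns values for K = 2 to 7 only, and the K with the largest ΔK is the candidate number of subpopulations. Because ΔK cannot be evaluated at the smallest and largest K tested, inspecting the L(K) curve alongside it, as done here, remains useful.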
All of the analyses were successively run for the entire accession set (mini-core plus ERGC) and for the japonica accession subset of the ERGC. We used the classification tree tools (Breiman et al., 1984) in the Excel add-in XLSTAT (Addinsoft, 2011) to determine which markers or marker combinations were the most discriminating within a given set of accessions.

A representative collection of 250 accessions termed the core collection was extracted from the ERGC based on the SSR genotypes and using the maximum length subtree procedure available under DARwin (Perrier and Jacquemoud-Collet, 2006). This method prunes the tree of its most redundant units, therefore minimizing the risk of spurious associations due to structure in association studies while limiting possible reductions of allelic diversity (Perrier and Jacquemoud-Collet, 2006). This procedure enabled us to construct the core collection based on allelic combinations rather than on simple allelic richness. The number 250 was chosen as a reasonable compromise between the need for power in genetic studies and the difficulty involved in accurately phenotyping large populations. The loss of diversity in the core collection compared to the initial collection was estimated by the software in terms of both allele number and allele frequency.

To assess the extent to which the structure of the core collection was determined by the type of marker used, we successively compared the results obtained using the distance-based approach with the results of the model-based approach applied to the SSR and SNP datasets. For the distance-based approach, both datasets were coded as modalities, and the dissimilarity matrices were computed using a Sokal and Michener index with DARwin software. The SSR and SNP dissimilarity matrices were compared using Mantel’s test as proposed in PowerMarker (Liu and Muse, 2005). For the model-based approach, we ran Structure (Pritchard et al., 2000a) on the two datasets. The percentages of admixture obtained for the most probable K values for both datasets were compared by computing the correlation coefficients.
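The comparison of the SSR and SNP dissimilarity matrices can be sketched as follows. This is a minimal pure-Python illustration, not the DARwin or PowerMarker implementation: it assumes that each accession is a list of modalities with missing data coded as None, uses the simple matching (Sokal and Michener) dissimilarity, and evaluates the Mantel correlation by permuting the accession labels of one matrix. All function names, the number of permutations, and the one-sided p-value convention are assumptions made for the example.

```python
import random
from math import sqrt


def simple_matching_dissimilarity(a, b):
    """Share of comparable loci at which two accessions carry different modalities."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("no locus scored in both accessions")
    return sum(x != y for x, y in pairs) / len(pairs)


def dissimilarity_matrix(genotypes):
    """Symmetric matrix of pairwise simple matching dissimilarities."""
    n = len(genotypes)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = simple_matching_dissimilarity(genotypes[i], genotypes[j])
    return d


def upper_triangle(d, order):
    """Off-diagonal values of d after relabelling the accessions with `order`."""
    n = len(order)
    return [d[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(d1, d2, permutations=999, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    identity = list(range(len(d1)))
    reference = upper_triangle(d2, identity)
    r_obs = pearson(upper_triangle(d1, identity), reference)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of d1 together
        # Small tolerance so that ties are not lost to floating-point rounding.
        if pearson(upper_triangle(d1, order), reference) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)
```

In use, `dissimilarity_matrix` would be called once on the SSR modalities and once on the SNP modalities of the same accessions in the same order, and the two matrices passed to `mantel_test`. Rows and columns are permuted jointly because the pairwise distances in a matrix are not independent observations, which is the reason a Mantel test is used instead of an ordinary correlation test.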

RESULTS

Diversity of Accessions in the European Rice Germplasm Collection Versus the Mini-Core

All of the 25 SSRs assayed were polymorphic on the collection composed of the 425 ERGC and the 50 mini-core accessions, with an average of less than 0.7% missing data. The heterozygosity rate was very low, at 0.5% (0.7% in the mini-core). This result is expected for a highly autogamous crop when accessions are multiplied as pure lines and DNA is extracted from a single plant. Heterozygotes were therefore replaced by missing data in subsequent analyses.

The diversity parameters of the ERGC accessions compared with those of the mini-core are presented in Table 1. The total number of alleles revealed in the ERGC was 189 (198 in the mini-core with 148 in common). Among these alleles, 59% could be considered rare (frequency below 5%) vs. 40% in the mini-core. The number of alleles per marker varied from 3 to 13, with an average of 7.6 (7.9 in the mini-core). The PIC values varied from 0.16 to 0.78, with an average of 0.49 (0.70 in the mini-core). The decrease in PIC values in the ERGC compared with the mini-core applied to all of the markers except RM484 whereas the number of alleles remained more stable. These figures reflect the lower genetic diversity in the ERGC accessions.

Structure (Pritchard et al., 2000a) was first run on all of the accessions combined (mini-core and ERGC). The estimated membership fractions of all of the accessions are presented in Fig. 1 for values of K ranging from 2 to 5. The mini-core accessions belonging to known enzymatic groups were used as references to characterize the subpopulations. For K = 2, a split between japonica (372 accessions, shown in blue) and non-japonica (68 accessions, shown in red) accessions was observed with 35 accessions admixed. For K = 3, the non-japonica group remained almost unchanged (66 accessions, shown in red) whereas the japonica group split into two subpopulations.
The first subpopulation (71 accessions, shown in dark blue), hereafter referred to as American, which clustered with the tropical accessions of the mini-core, was composed of non-European accessions. The members of this subpopulation primarily consisted of American accessions from the southern belt of the United States (Texas, Arkansas, or Louisiana) and Argentina and several accessions derived from crosses involving these accessions.


Figure 1. Graph of estimated membership fraction for increasing values of K obtained using the 25 simple sequence repeat markers. For each value of K, the run with the highest likelihood was chosen. The first part of the graph represents the mini-core with a vertical bar separating enzymatic groups. The second part of the graph includes the European Rice Germplasm Collection (ERGC) (E) accessions. The different colors correspond to the different groups identified by Structure. Enzymatic groups: “1” represents indica accessions, “2” represents aus–boro, “5” represents sadri–basmati, and “6” represents japonica based on Glaszmann (1987).

The second subpopulation (249 accessions, shown in light blue), which clustered with the one temperate accession in the mini-core, included most of the European accessions, introductions from other temperate areas, and several U.S. accessions from California. The other 89 accessions were admixed. For K = 4, the aus–boro and sadri–basmati groups (20 accessions, shown in yellow) were separated from the indica group (47 accessions, shown in red). For K = 5, the most probable number of subpopulations based on Evanno’s criteria (Supplemental Fig. S1), two subpopulations, hereafter refereed to European1 (123 accessions, shown in light blue) and European2 (94 accessions, shown in light purple), were identified among the temperate japonica accessions whereas the other groups remained almost unchanged. A total of 131 accessions were classified as admixed. According to Evanno’s criteria, a larger number of subpopulations was unlikely. The final assignment corresponding to K = 5 is given in Supplemental Table S1. A distance-based approach was then applied. The CV of the dissimilarity estimates between all of the pairs of accessions (ERGC plus mini-core) was 12.4%, which 1668

indicates a reasonable robustness of the results obtained with the 25 SSRs. The positions of the accessions on the NJ tree are shown on Fig. 2. For finer analysis, Supplemental Fig. S2 presents the same results in hierarchical form together with the names of the accessions. Thirty accessions (symbol “I”) clustered with the indica accessions of the mini-core (symbol “1” in red), one (symbol “B”) with the aus–boro accessions (symbol “2” in yellow), and nine (symbol “S”) with the sadri–basmati accessions (symbol “5” in green). The non-japonica accessions were mostly introductions from South Asia (e.g., the basmati series) or East Asia (e.g., the Suweon and Myliang series, originating from Korea) that were, with few exceptions, brought to Europe as potential donors. Only three of these accessions were cultivars released from European breeding programs: ‘Artico’ and ‘Lampo’, which clustered with the indica accessions, were both selected in an introduction from Kenya, and ‘Fragrance’, which resulted from a japonica × basmati cross, clustered with the basmati accessions. The great majority of the ERGC accessions (381 out of 425) clustered with the japonica accessions of the mini-core

WWW.CROPS.ORG

CROP SCIENCE, VOL. 52, JULY– AUGUST 2012

Figure 2. Unweighted neighbor-joining tree representing the relative position of the European Rice Germplasm Collection accessions and the mini-core obtained using the 25 simple sequence repeat markers. Mini-core accessions are designated by their enzymatic group: “1” represents indica accessions (in red), “2” represents aus–boro (in yellow), “5” represents sadri–basmati (in green), and “6” represents japonica (in blue). European Rice Germplasm Collection accessions are designated by letters corresponding to the subpopulations defined by Structure (Pritchard et al., 2000a): “I” represents indica, “B” represents aus–boro and sadri–basmati, “A” represents American, “E1” represents European1, “E2” represents European2, and “m” represents admixed.

(symbol “6” in blue). The group of 18 tropical japonica cultivars belonging to the mini-core clustered tightly with 65 ERGC accessions, whereas the only temperate accession in the mini-core, ‘Nipponbare’, was clearly located with a larger group of 316 ERGC accessions on another branch of the tree. Five accessions were intermediate between the groups, including several that were clearly derived from japonica × indica crosses, such as ‘Italpatna’ × ‘Myliang 43’ and ‘Oscar’ × ‘Suweon 285’. The results of the model-based assignment of the ERGC and mini-core accessions into five subpopulations were projected onto the NJ tree (Fig. 2) and are represented by the colors in the dendrogram in Supplemental Fig. S2. The projection revealed good agreement between the clusters obtained with the two classification methods; the group sizes differed because of the admixed accessions, and the two methods had slightly different resolutions. The separation of

aus–boro from the sadri–basmati accessions was only visible on the NJ tree. Among the japonica accessions, the tropical group identified by Structure (Pritchard et al., 2000a) (symbol “A”) was easily distinguished using both methods. However, the two temperate subpopulations (symbols “E1” and “E2”) were not clearly separated on the NJ tree. A few accessions of the European1 subpopulation appeared in clusters of the European2 subpopulation and vice versa. A branch of the European2 subpopulation appeared among the branches of the European1 subpopulation. A hierarchical AMOVA revealed highly significant genetic differentiation among the five subpopulations. Approximately 37% of the variance was due to differences among subpopulations while 63% was due to differences within subpopulations. The pairwise FST values were all greater than 0.30, indicating high differentiation between subpopulations, except between the European1


and European2 subpopulations for which the pairwise FST was only 0.17. This low FST was consistent with the structure revealed by the Structure software (Pritchard et al., 2000a) and by the NJ tree.
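As a rough illustration of how a pairwise differentiation index of this kind is obtained from allele frequencies, the sketch below computes Nei's G_ST between two subpopulations. It is not the AMOVA-based estimator reported above: it assumes equal subpopulation weights and ignores sampling corrections, and the function names and allele frequencies are hypothetical.

```python
def gene_diversity(freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pairwise_gst(pop_a, pop_b):
    """Nei's G_ST between two subpopulations, pooled over loci.

    pop_a and pop_b hold one allele-frequency list per locus, with the
    alleles listed in the same order in both subpopulations.
    Both subpopulations get equal weight (an assumption of this sketch).
    """
    h_within = 0.0
    h_total = 0.0
    for freqs_a, freqs_b in zip(pop_a, pop_b):
        h_within += 0.5 * (gene_diversity(freqs_a) + gene_diversity(freqs_b))
        pooled = [0.5 * (pa + pb) for pa, pb in zip(freqs_a, freqs_b)]
        h_total += gene_diversity(pooled)
    if h_total == 0.0:
        return 0.0  # no variation at all, so nothing to partition
    return (h_total - h_within) / h_total


# Made-up frequencies for two loci, for illustration only.
group_x = [[0.9, 0.1], [0.6, 0.3, 0.1]]
group_y = [[0.1, 0.9], [0.5, 0.4, 0.1]]
print(round(pairwise_gst(group_x, group_y), 3))
```

Values near 0 mean that the two groups share most of their allele frequencies, and values near 1 mean that they are fixed for different alleles, which is the scale on which 0.17 reads as low and values above 0.30 read as high.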

Diversity of Japonica Accessions

To improve the characterization of the japonica accessions, we performed a new round of analyses only on the japonica subgroup after removing the mini-core and ERGC accessions that clustered with the indica, aus, and basmati accessions and their intermediates. We used only the 381 accessions that clustered with the japonica group on the NJ tree, which represented 89% of the ERGC accessions. As expected, the diversity within japonica accessions was lower than that of the mini-core or the ERGC but was still high. The total number of alleles decreased to 153 (81% of the ERGC total), among which 58% could be considered rare (frequency < 5%). The number of alleles per marker did not change considerably, varying from 3 to 13 with an average of 6.1 (Table 1). However, the PIC value varied from 0.04 to 0.75, with an average of 0.42. Several markers for which a low PIC had already been observed, such as RM124, RM14643, RM271, or RM452, exhibited a sharp decrease in their PIC values, whereas the PIC values of other markers remained almost unchanged. No allele was strictly diagnostic of the japonica group, but certain alleles were present in more than 95% of the japonica accessions and in less than 5% of the non-japonica accessions. The alleles that best distinguished japonica and non-japonica accessions were allele 290 of RM124, allele 107 of RM271, and allele 226 of RM452. When run for these 381 japonica accessions, Structure (Pritchard et al., 2000a) produced results very similar to those obtained for the japonica group in the initial set (mini-core plus ERGC). For K = 2, the program distinguished American from European accessions, with 88 admixed accessions. For K = 3, it separated the European1 and European2 subpopulations, with 161 admixed accessions of various types. A larger number of subpopulations was unlikely.
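The per-marker statistics used in this section (allele frequencies, rare alleles below 5%, and PIC) can be sketched as follows. This is a minimal illustration that assumes the commonly used Botstein-type definition of PIC and homozygous accessions scored with one allele call each; the function names and the allele calls are hypothetical, not data from the collection.

```python
from collections import Counter


def allele_frequencies(calls):
    """Relative frequency of each allele; None marks a missing call."""
    counts = Counter(c for c in calls if c is not None)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def pic(frequencies):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    sum_sq = sum(p ** 2 for p in frequencies)
    sum_fourth = sum(p ** 4 for p in frequencies)
    # Identity used: sum_{i<j} 2 p_i^2 p_j^2 = (sum p_i^2)^2 - sum p_i^4
    return 1.0 - sum_sq - (sum_sq ** 2 - sum_fourth)


def rare_alleles(freqs, threshold=0.05):
    """Alleles whose frequency is strictly below the threshold."""
    return sorted(a for a, p in freqs.items() if p < threshold)


# Made-up calls for one SSR marker scored on 100 accessions.
calls = [150] * 60 + [152] * 37 + [154] * 3
freqs = allele_frequencies(calls)
print(rare_alleles(freqs), round(pic(freqs.values()), 3))
```

A marker dominated by a single allele yields a PIC close to 0, which is why markers that are nearly fixed within japonica show a sharp drop in PIC when the non-japonica accessions are removed.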
The assignments to the three japonica subpopulations, shown in Supplemental Table S1, were almost identical to those obtained for the whole collection with K = 5. The main difference resulted from a change in the classification of a small proportion of accessions (14%) that had been previously assigned to a subpopulation and that were now classified as admixed. The distance-based approach was also used for the japonica accessions. In this less diverse set, the CV of the dissimilarity estimates between all of the pairs of accessions reached 16.1%. As with the Structure (Pritchard et al., 2000a) results, the pattern of the NJ tree was very similar to that obtained with the whole set of accessions. In particular, the pattern exhibited a split between a clear

cluster of tropical accessions and two clusters of temperate accessions with less distinct limits; additionally, a large set of accessions was located between these three subpopulations (data not shown). For the American, European1, and European2 subpopulations, the mean numbers of alleles per locus were 4.0, 2.7, and 3.4, with average PIC values of 0.36, 0.28, and 0.27, respectively. A hierarchical AMOVA conducted on these three subpopulations showed that 33 and 67% of the variance was due to differentiation among and within subpopulations, respectively. The pairwise FST confirmed that the American group was more distant from the European2 group (0.46) than from the European1 group (0.35) and that the European1 and European2 groups were close to each other (0.20). In this case, no simple allelic combination accurately discriminated the three subpopulations. The organization of the japonica accessions of the ERGC into three subpopulations could not be strictly associated with parameters such as geographic origin, breeding program, grain type, or maturity class, but several trends were detected, as shown in Fig. 3 and 4. The

Figure 3. Proportion of accessions with different grain types among the three subpopulations defined by Structure (Pritchard et al., 2000a). R, round; M, medium; A, long A; B, long B.

Figure 4. Proportion of accessions with different maturity classes among the three subpopulations defined by Structure (Pritchard et al., 2000a). E, early; M, medium; ML, medium late; L, late.


Establishment of a Japonica Core Collection

Figure 5. Distribution of the similarity between pairs of accessions among the japonica accessions of the European Rice Germplasm Collection.

American subpopulation mostly grouped accessions from the southern belt of the United States, Argentina, and Italy, with the long B grain type from medium-late to late maturity classes. The European1 subpopulation grouped accessions from Italy, France, and Portugal, mostly of the long A grain type from the medium maturity class. The European2 subpopulation corresponded to round, medium or long A grain type from Spain, Portugal, eastern European countries, California, and Australia, primarily from the early and medium maturity classes. The admixed accessions were much more balanced between types. Correcting for genetic relatedness in addition to population structure has been shown to be important in association mapping to control for false positives (Yu et al., 2006). Simple genetic similarity matrices have been shown to effectively account for genetic relatedness while guaranteeing positive semidefiniteness (Kang et al., 2008). We analyzed the similarity matrix derived from the dissimilarity matrix to evaluate the degree of relatedness among ERGC japonica accessions. The similarity was surprisingly low for accessions that were largely derived from breeding programs that supposedly involved a relatively limited number of parents. The average similarity was 0.53 and only 0.4% of the pairs of accessions showed a similarity >0.9 (Fig. 5). Among the pairs of accessions with high similarity, it was observed that several included one member that had been derived from the other member through mutation (e.g., ‘Arborio’ and its mutant ‘Arborio precoce’ or ‘Maratelli’ and its mutant ‘Marathon’) or through selection (e.g., ‘Gange’ from ‘A301’). However, not all of the mutants or selections could be traced back to their original cultivars (e.g., ‘Maratelli’ and its mutant ‘M164’ occupied different positions on the tree with a similarity of 0.63) showing that either the pedigrees were not completely reliable or mislabeling had sometimes occurred.
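The relatedness summary described above (average similarity and share of highly similar pairs) can be sketched as follows. The sketch assumes that similarity is simply 1 minus dissimilarity, which is one common convention and may differ from the exact transformation used in the study; the function names and the small matrix are hypothetical.

```python
from itertools import combinations


def pairwise_similarities(dissimilarity):
    """Return 1 - d for every unordered pair of accessions."""
    n = len(dissimilarity)
    return [1.0 - dissimilarity[i][j] for i, j in combinations(range(n), 2)]


def summarize_similarity(dissimilarity, threshold=0.9):
    """Mean pairwise similarity and share of pairs above the threshold."""
    sims = pairwise_similarities(dissimilarity)
    mean_similarity = sum(sims) / len(sims)
    share_above = sum(s > threshold for s in sims) / len(sims)
    return mean_similarity, share_above


# Made-up symmetric dissimilarity matrix for three accessions;
# the first two mimic a cultivar and a mutant derived from it.
d = [
    [0.00, 0.05, 0.60],
    [0.05, 0.00, 0.50],
    [0.60, 0.50, 0.00],
]
mean_sim, share_high = summarize_similarity(d)
print(round(mean_sim, 3), round(share_high, 3))
```

Pairs flagged by the threshold are candidates for near-duplicates such as a cultivar and its mutant, which is also the kind of redundancy a core collection tries to remove.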


To establish a japonica core collection, we extracted 250 accessions from the 381 ERGC japonica accessions using the maximum length subtree procedure (Supplemental Table S1). This core corresponded to a one-third reduction in the number of accessions, due in part to the retention of only one copy of very similar entries. By comparison, the number of alleles decreased from 153 to 141 (−8%), but the allele frequency proportions remained almost identical (−1%), indicating that the lost alleles were rare ones. The 250 accessions were well distributed on the tree branches, as shown in Supplemental Fig. S3. The subset accurately represented the diversity embodied in the japonica accessions of the ERGC, with a mean of 5.6 alleles per marker and a mean PIC value of 0.42, and with 64, 70, and 55% of the accessions belonging to the American, European1, and European2 groups, respectively. The same set of 250 accessions was genotyped using 90 SNP markers. Among these markers, only 70 were polymorphic, and 22 had a minor allele frequency below 5%. The final number of SNP alleles (140) was in the same range as that of the SSR markers. The PIC values, which varied from 0.02 to 0.38, were strongly influenced by the markers with low polymorphism. The mean PIC value was 0.19 when the markers with low polymorphism were included in the computation vs. 0.26 when they were excluded. This PIC value was lower than that found for the SSR markers, but this result was expected because the PIC of a biallelic marker cannot exceed 0.375 (the corresponding limit for gene diversity is 0.5). The dissimilarity matrices based on the SSR and SNP genotypic data were compared using a Mantel test. The correlation coefficient was 0.67 (p < 0.0001), showing that the matrices were clearly related but not identical. One of the possible uses of the core collection is association mapping. In association studies, the percentages of admixture of each accession are used as covariates to control for false positives.
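The Mantel comparison of the SSR and SNP dissimilarity matrices can be illustrated with a small permutation test. This is a sketch under stated assumptions, not the software used in the study: it uses the Pearson correlation of the off-diagonal entries, a one-sided test, and a default of 999 random relabelings; the function names are hypothetical.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(d1, d2, n_perm=999, seed=0):
    """Correlation between two distance matrices and a permutation p-value.

    Rows and columns of d2 are relabeled jointly, which preserves the
    dependence among the entries that belong to the same accession.
    """
    n = len(d1)
    pairs = list(combinations(range(n), 2))
    x = [d1[i][j] for i, j in pairs]
    r_obs = pearson(x, [d2[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        y = [d2[order[i]][order[j]] for i, j in pairs]
        if pearson(x, y) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)
```

Individual entries of a distance matrix are not independent, so an ordinary significance test for a correlation coefficient does not apply; relabeling whole accessions is what makes the p-value meaningful.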
We ran Structure (Pritchard et al., 2000a) on the core collection to compare these percentages for the two types of markers. The most probable number of subpopulations was two for the SNP markers and three for the SSR markers; therefore, we compared the results for these two solutions. For K = 2, which corresponded to the split between temperate and tropical japonica accessions, we observed a very good correlation between the percentages of admixture for each accession obtained with the two types of markers (0.94; p < 0.001). For K = 3, the strength of the correlation depended on the subpopulation concerned. For the American subpopulation, the correlation between the percentages of admixture was almost as good as with K = 2 (0.91; p < 0.001). For the two European subpopulations, the correlations were significant but much weaker (0.54 and 0.59, respectively; p < 0.001). This result showed that


this structure was less robust and the subgroups were less well defined and, therefore, that the type of marker used influenced the estimates of admixture.
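The comparison of admixture percentages between marker types amounts to correlating, subpopulation by subpopulation, the membership coefficients estimated from the two data sets. The sketch below assumes that the two membership tables list the same accessions in the same order and that the subpopulation columns have already been matched between runs; the names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def per_subpopulation_correlation(q_ssr, q_snp):
    """One correlation per subpopulation column of two membership tables."""
    k = len(q_ssr[0])
    return [
        pearson([row[c] for row in q_ssr], [row[c] for row in q_snp])
        for c in range(k)
    ]


# Made-up membership coefficients for four accessions and K = 2.
q_ssr = [[0.90, 0.10], [0.20, 0.80], [0.50, 0.50], [0.70, 0.30]]
q_snp = [[0.80, 0.20], [0.10, 0.90], [0.60, 0.40], [0.75, 0.25]]
print([round(r, 2) for r in per_subpopulation_correlation(q_ssr, q_snp)])
```

With K = 2 the two columns always give the same correlation because each is 1 minus the other, whereas with K = 3 each subpopulation can behave differently, as observed for the American and European groups.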

DISCUSSION

The diversity of the 425 accessions of the ERGC was evaluated using 25 SSR markers. The results helped to define the position of these accessions in comparison with a reference set, clearly revealing the japonica origin of most European accessions (89%). The influence of indica cultivars in the European breeding programs is very limited, as only two of the accessions derived from European breeding programs were classified as indica. This limited influence primarily reflects the difficulty indica cultivars face when grown in Europe because of their lack of long-day adaptation and cold tolerance. The sterility barrier often observed in hybrids between the two subspecies, which is associated with a return of the progenies toward one of the parents (Oka, 1958; Harushima et al., 2002), may also play a role. These characteristics limit the possibility of using indica accessions to broaden the genetic basis or as sources of new yield component architectures or new resistance to biotic and abiotic stresses. Backcrossing to the recurrent japonica parent is generally necessary to recover suitable adaptation and fertility. Based on the limited pedigree data that are publicly available, the genetic basis of the ERGC was expected to be relatively narrow and current European cultivars were expected to be closely related. Molecular markers helped to demonstrate that high genetic diversity, which can be exploited in rice breeding programs, still exists among European japonica accessions. The mean PIC values reported for broader sets of accessions, genotyped with a greater number of SSRs and sampled using a bulk strategy, ranged from 0.33 to 0.37 and from 0.40 to 0.46 for temperate and tropical japonica accessions, respectively (Garris et al., 2005; Agrama et al., 2010).
These PIC values are on the same order of magnitude as those obtained in the present study (global PIC of 0.42 for the japonica component of the ERGC and PICs of 0.31 and 0.37 for the temperate and tropical japonica components of the ERGC, respectively). By comparison, in Yunnan province in China, which is an area suggested to be a center of diversification for japonica rice cultivars, characterized by high genetic diversity, the mean PIC of japonica subpopulations reached 0.62 to 0.65 (Zhang et al., 2006). A small proportion of highly similar accessions derived through mutation or selection was detected in the ERGC but most of the germplasm showed moderate to high levels of dissimilarity. This study helped clarify the relationships between European germplasm and representatives of japonica subspecies worldwide. Three subpopulations were detected among the ERGC japonica accessions. Within this structure, several patterns were quite prominent (e.g.,

the differentiation between American and European accessions) while others were less so (e.g., the two temperate subpopulations with less distinct boundaries), as shown by the Structure (Pritchard et al., 2000a) patterns and the FST between groups. We were able to biologically characterize these groups to a certain extent, revealing the importance of two agro-morphological criteria (maturity class and grain type) as factors that accompany the structuring of ERGC genetic diversity. A short growth duration is needed to adapt to high latitudes and reduce the risks of cold stress during the sensitive stages of crop growth. This quality is particularly important for the eastern European cultivars, which are the earliest accessions in our collection, but this constraint is present throughout Europe, resulting in a clear-cut growth duration limit that can be circumvented only in exceptionally favorable years. Lateness may explain the limited influence of the tropical japonica group in European breeding programs except in Italy. This aspect is notably different from what is observed in U.S. breeding programs in the southern rice belt of the United States or in Argentina, which commonly used these tropical cultivars (Lu et al., 2005; Giarrocco et al., 2007). In comparison with the narrow suitable range of growth duration, the European market requirements for grain types are more diverse. In addition to the overall classical requirements for long grain types, locally popular dishes such as risotto in Italy or paella in Spain require round and medium grain types. These dishes represent niche markets that may explain the maintenance of a large group of traditional grain types. However, hybridizations that have been conducted in different European breeding programs have partially blurred the identity of each group, as demonstrated by the large proportion of admixed accessions representing intermediates between japonica groups. 
The combination of the composition of the collection and the method we chose to analyze the data certainly played a role in the type of structure detected among japonica accessions. Based on morphological traits and isozymes, temperate and tropical japonica accessions were shown to form a continuum following a latitude-based cline from Indonesia to Japan (Glaszmann and Arraudeau, 1986; Glaszmann, 1987). Historical sampling of the extremes of this distribution generally excluded Chinese accessions, which are difficult to access, leading to an impression of discontinuity. The ERGC collection may suffer from the same limitations, because East Asian cultivars are not well represented. In addition, the method used within Structure (Pritchard et al., 2000a) to analyze the genetic structure favors the differentiation of the population into discrete subpopulations that would perhaps not be as strongly supported by distance-based methods. Although the model-based approach is clearly justified at the species level because of the strong differentiation between indica and japonica, which correspond to two independent


domestications (Kovach et al., 2007), it may be less well suited for analyses within the japonica group if this group is truly a continuum. The nature and choice of the markers also play a role in the structure detected, as shown by groups that appeared clearly using SSR markers but less clearly using SNPs. In the present study, the SSRs were clearly better suited to analyzing the genetic diversity than the SNPs, because the genetic basis of the japonica accessions assayed was narrow and the SSRs used were part of a core set of markers known to be polymorphic and chosen to reveal diversity. In comparison, one fifth of the SNPs, which were chosen at random and, therefore, not affected by the same ascertainment bias, were monomorphic. These SNPs could, nevertheless, be affected by a bias of their own, as the Oryza SNP pool originated from only 20 cultivars and only 100 Mb of the rice genome (Zhao et al., 2010). For the loci that were polymorphic, although the total number of alleles was in the same range in both cases, the greater allelic frequency range of the SSRs conferred better discrimination power, as shown by the three subpopulations identified using SSRs vs. two using SNPs. This observation confirms the need to genotype additional SNPs or to select the most informative ones to obtain the same resolution as when using SSRs, which has previously been pointed out by several authors (Rosenberg et al., 2003; Hamblin et al., 2007). In the present study, the number of SNPs used was low. A larger number of SNPs may aid in obtaining convergence of the results. However, although differences between SNP- and SSR-based patterns were observed at the finest resolution, the high-level patterns obtained were globally similar. The limited number of SSR markers used may partially explain the weak differentiation detected between the two European groups.
It has been shown that the CV of the distance estimates decreases exponentially with an increasing number of markers up to a certain number of markers, above which the estimation precision stabilizes (Pejic et al., 1998; Van Inghelandt et al., 2010; Zhang et al., 2011b). The threshold depends on the composition of the population. The CVs we obtained (12.4% with the whole set and 16.1% with the japonica set alone) were slightly higher than those obtained by Zhang et al. (2011b) for a comparable population of rice and number of markers but a different dissimilarity index. Based on the curve established by Zhang et al. (2011b), this result suggests that we have not reached the precision plateau. However, the issue of marker number is more critical when the data reveal an absence of structure and the question arises of whether this absence is due to poor data quality or to a true lack of structure (Van Hintum, 2007). We used a two-step approach to establish a core collection adapted for association studies, first analyzing the structure of the ERGC and then basing our sampling on the

structure identified. The method permitted elimination of very similar entries (sister lines or half-sibs), which are quite common in breeding programs. This approach was used by Breseghello and Sorrells (2006) to define an association panel and it enabled a successful association study to be performed with a reasonable population size. The core collection we defined is suitable for association analyses for traits for which the phenotypic distribution is balanced between subpopulations such as salinity tolerance, as shown by Ahmadi et al. (2011). However, it will probably be more difficult to use this collection to determine the genetic control of traits correlated with this structure, such as maturity class and grain shape. To determine the genetic bases of such traits, other approaches, such as classical mapping populations derived from crosses between representatives of subpopulations, could be more appropriate. Linkage disequilibrium (LD) in rice has been reported to extend from 50 kb in indica up to 400 kb in temperate japonica (Garris et al., 2003; Mather et al., 2007; McNally et al., 2009; Huang et al., 2010). We are in the process of genotyping our collection using SNP markers to achieve a density matching the average LD decay in temperate japonica and to determine the resolution limit we can expect in association studies with our core collection.
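The link between LD extent and the marker density mentioned above can be made concrete with a back-of-the-envelope calculation. The sketch assumes a genome size of about 389 Mb and evenly spaced markers with one marker per LD block; both are simplifying assumptions, and the function name is hypothetical. Real designs usually need more markers because LD varies along the genome and not every marker is polymorphic in a given panel.

```python
def markers_for_ld_coverage(genome_size_bp, ld_extent_bp):
    """Smallest number of evenly spaced markers with one marker per LD block."""
    return -(-genome_size_bp // ld_extent_bp)  # ceiling division on integers


RICE_GENOME_BP = 389_000_000  # assumed genome size (about 389 Mb)

for label, ld_bp in [
    ("indica, LD of about 50 kb", 50_000),
    ("temperate japonica, LD of about 400 kb", 400_000),
]:
    print(label, markers_for_ld_coverage(RICE_GENOME_BP, ld_bp))
```

Under these assumptions, a temperate japonica panel needs on the order of a thousand well-spaced polymorphic markers, roughly eight times fewer than an indica panel.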

Supplemental Information Available

Supplemental material is available at http://www.crops.org/publications/cs.

Acknowledgments

EURIGEN (“Genotyping for the Conservation and Valorization of European Rice Germplasm”) was a project funded by the European Commission, DG Agriculture and Rural Development, within the AGRI GEN RES program for the conservation, characterization, collection, and utilization of genetic resources in agriculture. The authors thank Monique Deu from Cirad and Agostino Fricano from PTP for their useful comments on the manuscript.

References

Addinsoft. 2011. XLSTAT version 2011.4.02. Addinsoft, Paris, France.
Agrama, H.A., W. Yan, M. Jia, R. Fjellstrom, and A. McClung. 2010. Genetic structure associated with diversity and geographic distribution in the USDA rice world collection. Natural Sci. 2:247–291. doi:10.4236/ns.2010.24036
Agrama, H.A., W.G. Yan, F.N. Lee, R. Fjellstrom, M. Chen, M.H. Jia, and A.M. McClung. 2011. Genetic assessment of a mini-core developed from the USDA rice genebank. Crop Sci. 49:1336–1346.
Ahmadi, N., S. Negrão, D. Katsantonis, J. Frouin, J. Ploux, P. Letourmy, G. Droc, B. Pedro, H. Trindade, G. Bruschi, R. Greco, M.M. Oliveira, P. Piffanelli, and B. Courtois. 2011. Targeted association analysis identified japonica rice varieties achieving Na+/K+ homeostasis without the allelic make-up of the salt tolerant indica variety Nona Bokra. Theor. Appl.


Genet. 123:881–895. doi:10.1007/s00122-011-1634-4
Ali, M.L., A.M. McClung, M.H. Jia, J.A. Kimball, S.R. McCouch, and G.C. Eizenga. 2011. A rice diversity panel evaluated for genetic and agro-morphological diversity between subpopulations and its geographic distribution. Crop Sci. 51:2021–2035.
Association Française de Normalisation (AFNOR). 2011. Qualité des centres de ressources biologiques (CRB). Système de management d’un CRB et qualité des ressources biologiques. NF S96-900. (In French.) AFNOR, Saint Denis, France.
Breiman, L., J.H. Friedman, R. Olshen, and C.J. Stone. 1984. Classification and regression trees. Wadsworth and Brooks, Pacific Grove, CA.
Breseghello, F., and M.E. Sorrells. 2006. Association mapping of kernel size and milling quality in wheat (Triticum aestivum L.) cultivars. Genetics 172:1165–1177. doi:10.1534/genetics.105.044586
Ebana, K., Y. Kojima, S. Fukuoka, T. Nagamine, and M. Kawase. 2008. Development of a mini-core collection of Japanese rice landraces. Breed. Sci. 58:281–291. doi:10.1270/jsbbs.58.281
Evanno, G., S. Regnaut, and J. Goudet. 2005. Detecting the number of clusters of individuals using the software STRUCTURE: A simulation study. Mol. Ecol. 14:2611–2620. doi:10.1111/j.1365-294X.2005.02553.x
Excoffier, L., G. Laval, and S. Schneider. 2006. Arlequin v3.1. An integrated software for population genetic data analysis. University of Bern, Switzerland.
Faivre-Rampant, O., G. Bruschi, P. Abbruscato, S. Cavigiolo, A.M. Picco, L. Borgo, E. Lupotto, and P. Piffanelli. 2011. Assessment of genetic diversity in Italian rice germplasm related to agronomic traits and blast resistance (Magnaporthe grisea). Mol. Breed. 27:233–246. doi:10.1007/s11032-010-9426-0
Feyt, H., G. Clément, M. Aguilar Portero, R. Ballesteros, L. Martins da Silva, D. Ntanos, S. Russo, M. de Mar Catala, F. Mazzini, and E. Gozé. 2001.
Agronomic, morphological and technological traits of 430 genotypes of European rice genetic resource collections: Mean value and genotype × environment interactions. In: Proceedings of the Eurorice Symposium, Krasnodar, Russia. 3–8 Sept. 2001. Centre de Coopération Internationale en Recherche Agronomique pour le Développement, Montpellier, France. p. 40–51.
Garris, A.J., S.R. McCouch, and S. Kresovich. 2003. Population structure and its effect on haplotype diversity and linkage disequilibrium surrounding the xa5 locus of rice (Oryza sativa L.). Genetics 165:759–769.
Garris, A.J., T.H. Tai, J. Coburn, S. Kresovich, and S. McCouch. 2005. Genetic structure and diversity of O. sativa. Genetics 169:1631–1638. doi:10.1534/genetics.104.035642
Giarrocco, L.E., M.A. Marassi, and G.L. Salerno. 2007. Assessment of the genetic diversity in Argentine rice cultivars with SSR markers. Crop Sci. 47:853–860. doi:10.2135/cropsci2005.07.0198
Glaszmann, J.-C. 1987. Isozymes and classification of Asian rice varieties. Theor. Appl. Genet. 74:21–30. doi:10.1007/BF00290078
Glaszmann, J.-C., and M. Arraudeau. 1986. Rice plant type variation: Japonica-Javanica relationships. Rice Genet. Newsl. 3:41–43.
Glaszmann, J.-C., T. Mew, H. Hibino, C.K. Kim, T.I. Mew, C.H. Vera Cruz, J.-L. Notteghem, and J.M. Bonman. 1995. Molecular variation as a diverse source of disease resistance in cultivated rice. In: Rice genetics III. IRRI, Los Baños,


Philippines. p. 460–466.
Hamblin, M.T., M.L. Warburton, and E.S. Buckler. 2007. Empirical comparison of simple sequence repeats and single nucleotide polymorphisms in assessment of maize diversity and relatedness. PLoS ONE 2(12):e1367. doi:10.1371/journal.pone.0001367
Harushima, Y., M. Nakagahra, M. Yano, T. Sasaki, and N. Kurata. 2002. Diverse variation of reproductive barriers in three intraspecific rice crosses. Genetics 160:313–322.
Heffner, E.L., M.E. Sorrells, and J.L. Jannink. 2009. Genomic selection for crop improvement. Crop Sci. 49:1–12. doi:10.2135/cropsci2008.08.0512
Huang, X., X. Wei, T. Sang, Q. Zhao, Q. Feng, Y. Zhao, C. Li, C. Zhu, T. Lu, Z. Zhang, M. Li, D. Fan, Y. Guo, A. Wang, L. Wang, L. Deng, W. Li, Y. Lu, Q. Weng, K. Liu, T. Huang, T. Zhou, Y. Jing, W. Li, Z. Lin, E.S. Buckler, Q. Qian, Q.F. Zhang, J. Li, and B. Han. 2010. Genome-wide association studies of 14 agronomic traits in rice landraces. Nature Genet. 42(11):961–969. doi:10.1038/ng.695
Huang, X., Y. Zhao, X. Wei, C. Li, A. Wang, Q. Zhao, W. Li, Y. Guo, L. Deng, C. Zhu, D. Fan, Y. Lu, Q. Weng, K. Liu, T. Zhou, Y. Jing, L. Si, G. Dong, T. Huang, T. Lu, Q. Feng, Q. Qian, J. Li, and B. Han. 2011. Genome-wide association study of flowering time and grain yield traits in a worldwide collection of rice germplasm. Nature Genet. 44(1):32–39.
Illumina Inc. 2005. Illumina BeadStudio genotyping module. Illumina, San Diego, CA.
International Rice Genome Sequencing Project (IRGSP). 2005. The map-based sequence of the rice genome. Nature 436:793–800. doi:10.1038/nature03895
International Rice Research Institute (IRRI). 2002. Rice almanac. IRRI, Los Baños, Philippines.
Jackson, M.T., J.L. Pham, H.J. Newbury, B.V. Ford-Lloyd, and P.S. Virk. 1999. A core collection for rice: Needs, opportunities and constraints. In: R.C. Johnson and T. Hodgkin, editors, Core collection for today and tomorrow. IPGRI, Rome, Italy.
Jayamani, P., S. Negrão, M. Martins, B. Maçãs, and M.M. Oliveira. 2007.
Genetic relatedness of Portuguese rice accessions from diverse origins as assessed by microsatellite markers. Crop Sci. 47:879–886. doi:10.2135/cropsci2006.04.0236
Jin, L., Y. Lu, P. Xiao, M. Sun, H. Corke, and J. Bao. 2010. Genetic diversity and population structure of a diverse set of rice germplasm for association mapping. Theor. Appl. Genet. 121:475–487. doi:10.1007/s00122-010-1324-7
Kang, H.M., N.A. Zaitlen, C.M. Wade, A. Kirby, D. Heckerman, M.J. Daly, and E. Eskin. 2008. Efficient control of population structure in model organism association mapping. Genetics 178:1709–1723. doi:10.1534/genetics.107.080101
Kovach, M.J., M.T. Sweeney, and S.R. McCouch. 2007. New insights into the history of rice domestication. Trends Genet. 23(11):578–587. doi:10.1016/j.tig.2007.08.012
Liu, K., and S.V. Muse. 2005. PowerMarker: Integrated analysis environment for genetic marker data. Bioinformatics 21(9):2128–2129. doi:10.1093/bioinformatics/bti282
Lu, H., M.A. Redus, J.R. Coburn, J.N. Rutger, S.R. McCouch, and T.H. Tai. 2005. Population structure and breeding pattern of 145 US rice cultivars based on SSR marker analysis. Crop Sci. 45:66–76. doi:10.2135/cropsci2005.0066
Mantegazza, R., M. Biloni, F. Grassi, B. Basso, B.R. Lu, X.X. Cai, F. Sala, and A. Spada. 2008. Temporal trend of variation in Italian rice germplasm over the past two centuries revealed by AFLP and SSR markers. Crop Sci. 48:1832–1840.


doi:10.2135/cropsci2007.09.0532
Mather, K.A., A.L. Caicedo, N.R. Polato, K.M. Olsen, S. McCouch, and M.D. Purugganan. 2007. The extent of linkage disequilibrium in rice (Oryza sativa L.). Genetics 177:2223–2232. doi:10.1534/genetics.107.079616
McCouch, S.R., X. Chen, O. Panaud, S. Temnykh, Y. Xu, Y.G. Cho, N. Huang, T. Ishii, and M.W. Blair. 1997. Microsatellite marker development, mapping and applications in rice genetics and breeding. Plant Mol. Biol. 35:89–99. doi:10.1023/A:1005711431474
McNally, K.L., K.L. Childs, R. Bohnert, R.M. Davidson, K. Zhao, V.J. Ulat, G. Zeller, R.M. Clark, D.R. Hoen, T.E. Bureau, R. Stokowski, D.G. Ballinger, K.A. Frazer, D.R. Cox, B. Padhukasahasram, C.D. Bustamante, D. Weigel, D.J. Mackill, R.M. Bruskiewich, G. Rätsch, C.R. Buell, H. Leung, and J.E. Leach. 2009. Genomewide SNP variation reveals relationships among landraces and modern varieties of rice. Proc. Natl. Acad. Sci. USA 106(30):12273–12278. doi:10.1073/pnas.0900992106
Meuwissen, T.H.E., B. Hayes, and M.E. Goddard. 2001. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157:1819–1829.
Oka, H.I. 1958. Intravarietal variation and classification of cultivated rice. Indian J. Genet. Plant Breed. 18:78–89.
Pejic, I., P. Ajmone-Marsan, M. Morgante, V. Kozumplik, P. Castiglioni, G. Taramino, and M. Motto. 1998. Comparative analysis of genetic similarity among maize inbred lines detected by RFLPs, RAPDs, SSRs and AFLPs. Theor. Appl. Genet. 97:1248–1255. doi:10.1007/s001220051017
Perrier, X., and J.-P. Jacquemoud-Collet. 2006. DARwin software.

Centre de Coopération Internationale en Recherche Agronomique pour le Développement, Montpellier, France.
Pritchard, J.K., M. Stephens, and P. Donnelly. 2000a. Inference of population structure using multilocus genotype data. Genetics 155:945–959.
Pritchard, J.K., M. Stephens, N.A. Rosenberg, and P. Donnelly. 2000b. Association mapping in structured populations. Am. J. Hum. Genet. 67:170–181. doi:10.1086/302959
Rosenberg, N.A., L.M. Li, R. Ward, and J.K. Pritchard. 2003. Informativeness of genetic markers for inference of ancestry. Am. J. Hum. Genet. 73:1402–1422. doi:10.1086/380416
Roy, R., D.L. Steffens, B. Gratside, G.Y. Jang, and J.A. Brumbaugh. 1996. Producing STR locus patterns from bloodstains and other forensic samples using an infrared fluorescent automated DNA sequencer. J. Forensic Sci. 41:418–424.
Spada, A., R. Mantegazza, M. Biloni, E. Caporali, and F. Sala. 2004. Italian rice varieties; historical data, molecular markers and pedigrees to reveal their genetic relationships. Plant Breed. 123:105–111. doi:10.1046/j.1439-0523.2003.00950.x
Van Hintum, T.J.L. 2007. Data resolution: A jackknife procedure for determining the consistency of molecular marker datasets. Theor. Appl. Genet. 115:343–349. doi:10.1007/s00122-007-0566-5
Van Inghelandt, D., A.E. Melchinger, C. Lebreton, and B. Stich. 2010. Population structure and genetic diversity in a commercial maize breeding program assessed with SSR

and SNP markers. Theor. Appl. Genet. 120:1289–1299. doi:10.1007/s00122-009-1256-2
Varshney, R.K., M. Baum, P. Guo, S. Grando, S. Ceccarelli, and A. Graner. 2010. Features of SNP and SSR diversity in a set of ICARDA barley germplasm collection. Molecular Breeding 26:229–242. doi:10.1007/s11032-009-9373-9
Wright, S. 1978. Evolution and the genetics of populations. Variability within and among natural populations. University of Chicago Press, Chicago, IL.
Xu, Y. 2002. Global view of QTL: Rice as a model. In: M.S. Kang, editor, Quantitative genetics, genomics and plant breeding, CAB International, London, UK. p. 109–132.
Yamamoto, T., J. Yonemura, and M. Yano. 2009. Towards the understanding of complex traits in rice: Substantially or superficially? DNA Research 16:141–154. doi:10.1093/dnares/dsp006
Yan, W., J.N. Rutger, R.J. Bryant, H.E. Bockelman, R.G. Fjellstrom, M.H. Chen, T.H. Tai, and A.M. McClung. 2007. Development and evaluation of a core subset of the USDA rice germplasm collection. Crop Sci. 47:869–878. doi:10.2135/cropsci2006.07.0444
Yu, J., G. Pressoir, W.H. Briggs, B.I. Vroh, M. Yamasaki, J.F. Doebley, M.D. McMullen, B.S. Gaut, D.M. Nielsen, J.B. Holland, S. Kresovich, and E.S. Buckler. 2006. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 38:203–208. doi:10.1038/ng1702
Zhang, H., J. Sun, M. Wang, D. Liao, Y. Zeng, S. Shen, P. Yu, P. Mu, X. Wang, and Z. Li. 2006. Genetic structure and phylogeography of rice landraces in Yunnan, China revealed by SSR. Genome 50:72–83.
Zhang, H., D. Zhang, M. Wang, J. Sun, Y. Qi, J. Li, X. Wei, L. Han, Z. Qiu, S. Tang, and Z. Li. 2011a. A core collection and mini core collection of Oryza sativa L. in China. Theor. Appl. Genet. 122:49–61. doi:10.1007/s00122-010-1421-7
Zhang, P., J. Li, X. Li, X. Liu, X. Zhao, and Y. Lu. 2011b. Population structure and genetic diversity in a rice core collection (Oryza sativa L.) investigated with SSR markers. PLoS One 6(12):e27565. doi:10.1371/journal.pone.0027565
Zhao, K., C.W. Tung, G.C. Eizenga, M.H. Wright, M. Liakat Ali, A.H. Price, G.J. Norton, M. Rafiqul Islam, A. Reynolds, J. Mezey, A.M. McClung, C.D. Bustamante, and S.R. McCouch. 2011. Genome-wide association mapping reveals a rich genetic architecture of complex traits in Oryza sativa. Nat. Commun. 2:467. doi:10.1038/ncomms1467
Zhao, K., M. Wright, J. Kimball, G. Eizenga, A. McClung, M. Kovach, W. Tyagi, M. Liakat Ali, C.W. Tung, A. Reynolds, C.D. Bustamante, and S.R. McCouch. 2010. Genomic diversity and introgression in O. sativa reveal the impact of domestication and breeding on the rice genome. PLoS One 5(5):e10780. doi:10.1371/journal.pone.0010780
Zhu, C., M. Gore, E.S. Buckler, and J. Yu. 2008. Status and prospect of association mapping in plants. Plant Genome 1:5–20. doi:10.3835/plantgenome2008.02.0089
