On the Abundance and Distribution of Transposable Elements in the Genome of Drosophila melanogaster

Carolina Bartolomé, Xulio Maside, and Brian Charlesworth
Ashworth Laboratories, Institute of Cell, Animal and Population Biology, University of Edinburgh

The abundance and distribution of transposable elements (TEs) in a representative part of the euchromatic genome of Drosophila melanogaster were studied by analyzing the sizes and locations of TEs of all known families in the genomic sequences of chromosomes 2R, X, and 4. TEs contribute up to 2% of the sequenced DNA, which corresponds roughly to the euchromatin of these chromosomes. This estimate is lower than that previously available from in situ data and suggests that TEs accumulate in the heterochromatin more intensively than was previously thought. We have also found that TEs are not distributed at random in the chromosomes and that their abundance is more strongly associated with local recombination rates than with gene density. The results are compatible with the ectopic exchange model, which proposes that selection against deleterious effects of chromosomal rearrangements is a major force opposing element spread in the genome of this species. Selection against insertional mutations also influences the observed patterns, such as an absence of insertions in coding regions. The results of the analyses are discussed in the light of recent findings on the distribution of TEs in other species.

All authors contributed equally to this work.
Key words: population genetics, transposable elements, Drosophila melanogaster, selection, ectopic exchange.
Address for correspondence and reprints: Xulio Maside, Ashworth Laboratories, Institute of Cell, Animal and Population Biology, Kings Buildings, University of Edinburgh, Edinburgh EH9 3JT, UK. E-mail: [email protected].
Mol. Biol. Evol. 19(6):926–937. 2002
© 2002 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038

Introduction

Transposable elements (TEs) are a major component of the genomes of most species (Berg and Howe 1989; Capy 1997; Adams et al. 2000; Duret, Marais, and Biémont 2000; Lander et al. 2001) and are thought to play an important role in genome evolution (Flavell 1986; Spradling 1994; Steinemann M and Steinemann S 1998; Andolfatto, Wall, and Kreitman 1999; Cáceres et al. 1999; Nekrutenko and Li 2001). However, the nature of the evolutionary forces that control their abundance is still not well understood. Population surveys in Drosophila melanogaster have shown that TEs are usually present at very low frequencies at the sites where they are inserted and that the overall rate of transposition is of the order of 10⁻⁴ transpositions per element copy per generation (Nuzhdin and Mackay 1995; Maside, Assimacopoulos, and Charlesworth 2000; Maside et al. 2001). These results are broadly compatible with theoretical models that propose that selection is a key force in element number containment (Charlesworth, Sniegowski, and Stephan 1994). Under these models, selection counteracts element spread by transposition by acting against (1) deleterious mutations caused by insertions into or near genes (Charlesworth B and Charlesworth D 1983; Kaplan and Brookfield 1983), (2) deleterious chromosome rearrangements induced by ectopic exchange between TEs of the same family inserted in nonhomologous locations (Langley et al. 1988; Montgomery et al. 1991), and (3) direct deleterious effects of transposition on fitness (Brookfield 1991, 1996). However, despite the large body of theoretical and experimental results, the relative importance of these factors is still a matter for debate (Biémont et al. 1994; Hoogland and Biémont 1996; Biémont et al. 1997a; Charlesworth, Langley, and Sniegowski 1997; Duret, Marais, and Biémont 2000; Maside et al. 2001).

A third finding of most population surveys is that TEs are not distributed at random within the genomes of the organisms which have been most intensively studied, such as D. melanogaster (Charlesworth, Lapid, and Canada 1992; Biémont et al. 1994; Vieira and Biémont 1996; Maside et al. 2001), Caenorhabditis elegans (Duret, Marais, and Biémont 2000), humans (Boissinot, Entezam, and Furano 2001; Lander et al. 2001), and plants like Arabidopsis thaliana (S. I. Wright, N. Agrawal, and T. E. Bureau, personal communication). Some interesting patterns have been described, and given the different predictions of the models, these may shed some light on the nature of the selective mechanisms involved in the control of element abundance. Under the ectopic exchange model, elements are expected to be more abundant in regions of reduced recombination, where they are less likely to be involved in unequal recombination events (Langley et al. 1988). Under the insertional mutation model, element abundance is expected to be negatively associated with two main factors: gene density and the rate of recombination. In regions of high gene density, TEs are more likely to insert into genes or regulatory sequences and hence to be targets of selection. In addition, interference between sites under selection in regions with low recombination rates is expected to weaken the efficiency of selection in eliminating slightly deleterious mutations, such as those induced by TEs (the Hill-Robertson effect, Hill and Robertson 1966; Gordo and Charlesworth 2001; see also Duret, Marais, and Biémont 2000 for a discussion in relation to TE distribution).
The transposition-selection model has been developed in relation to DNA-based elements and proposes that the mode of selection is the deleterious effect of transposase activity (Brookfield 1991, 1996). No specific predictions of this model with regard to element distribution have been made. However, given that it assumes that only element insertions in "safe" locations (where they do not cause harmful mutations) will persist in populations (Brookfield 1991) and that selection may be weaker in regions of reduced recombination, it is reasonable to assume that the predictions of this model are similar to those of the insertional mutation model. We therefore only consider the predictions of the insertional mutation model in what follows.

In Drosophila, in situ hybridization data have provided evidence for TE accumulation in regions of reduced recombination, such as chromosomal inversions (Eanes, Wesley, and Charlesworth 1992; Sniegowski and Charlesworth 1994) and the tips and bases of the major autosomes (Langley et al. 1988; Charlesworth, Lapid, and Canada 1992; Maside et al. 2001, but see Hoogland and Biémont 1996 for a different view). Furthermore, recent studies of the genome sequences of two different species have revealed different patterns. In C. elegans, a weak positive correlation between TE abundance and recombination rate has been described (Duret, Marais, and Biémont 2000); in A. thaliana, element abundance has been found to be more strongly associated with gene density than with recombination rate (S. I. Wright, N. Agrawal, and T. E. Bureau, personal communication).

Here, we report a study of the abundance and distribution of all known TE families in the genomic DNA sequence of chromosomes X, 4, and 2R (chosen at random as representative of the major autosomal arms) of D. melanogaster released by the Drosophila Genome Project (Adams et al. 2000). Taking advantage of the high level of resolution of this data set, we have analyzed the associations between TE abundance, gene density, and recombination rate. The implications of the results in relation to the different population genetic models of element dynamics are discussed.

Materials and Methods

Several tools were used to retrieve data from the complete DNA sequence released by Celera Genomics and the Drosophila Genome Project (release 2).
Geneseen (http://flybase.bio.indiana.edu/annot/geneseenlaunch-static.html) is a Java applet that displays the locations and lengths of the genes (including introns plus exons) and transposable elements along the chromosome arms. A systematic search of this database enabled us to gather information about the estimated sizes and locations of insertions of all known families of TEs (DNA-based elements, retrotransposons with LTRs, and retroelements without LTRs) and their relative positions with respect to the genes in the DNA sequences of chromosomes 2R, X, and 4. Gadfly (http://flybase.bio.indiana.edu/annot/bands.html) provided a complete list of genes by cytological map position and an estimate of the mRNA length for each gene. When two or more predicted genes overlapped, we always selected the largest annotation, discarding the rest. The annotated sequence of each chromosome arm was divided into sections of 50 kb where genes and TEs were located. The data available covered regions 1A5–20B1, 41A1–60F3, and 101F–102F8 of chromosomes X, 2R, and 4, respectively.
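The 50 kb sectioning described above can be sketched in a few lines. The following is only an illustration, not the authors' procedure: the function names, the half-open interval convention, and the assumption that annotated TE intervals do not overlap are all hypothetical choices made for the example. It computes the percentage of TE-derived DNA in each 50 kb section and the mean and standard error across sections.

```python
from math import ceil, sqrt

BIN_KB = 50  # section size stated in the text


def te_fraction_per_bin(te_intervals, seq_length_kb, bin_kb=BIN_KB):
    """Percentage of each bin covered by TE insertions.

    te_intervals: list of (start_kb, end_kb) half-open intervals,
    assumed (for this sketch) to be non-overlapping.
    """
    n_bins = ceil(seq_length_kb / bin_kb)
    covered = [0.0] * n_bins
    for start, end in te_intervals:
        b = int(start // bin_kb)
        # Walk over every bin the insertion touches and add the overlap.
        while b < n_bins and b * bin_kb < end:
            lo = max(start, b * bin_kb)
            hi = min(end, (b + 1) * bin_kb)
            covered[b] += hi - lo
            b += 1
    fractions = []
    for b, c in enumerate(covered):
        # The last bin may be shorter than bin_kb.
        bin_len = min((b + 1) * bin_kb, seq_length_kb) - b * bin_kb
        fractions.append(100.0 * c / bin_len)
    return fractions


def mean_and_se(values):
    """Mean and standard error (sample SD divided by sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, sqrt(var) / sqrt(n)


# Toy example: a 100 kb sequence with two made-up insertions.
toy_fractions = te_fraction_per_bin([(10, 15), (45, 55)], 100)
toy_mean, toy_se = mean_and_se(toy_fractions)
```

Counting insertions per section, the second abundance measure used later in the paper, would follow the same binning but increment a counter instead of summing overlap lengths.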

To assess the accuracy of the TE annotation, we reviewed the insertions of all TE families present in one megabase of each chromosome. These evaluations were done by Blasting (http://www.ncbi.nlm.nih.gov/gorf/bl2.html) the Drosophila transposable element sequences (http://www.fruitfly.org/sequence/dlMisc.shtml) against the sequence scaffolds that cover the megabase under analysis (Flybase). All the annotations were consistently accurate in size, location, and family attribution, except for Tart-B1 and roo. A fraction of the annotations for these two families corresponded to tandem repeats of trinucleotide motifs (CAG and CAA) present in a particular region of the elements as well as at several other sites in the genomic sequence. These two motifs are glutamine codons, which have been found to be widespread as similar reiterative arrangements in the genomes of many eukaryotes (Green and Wang 1994). As a precaution, we reanalyzed the insertions attributed to these two families by Blasting the sequences of the two elements (the reference sequences for roo and Tart-B1 are FBgn0000155 and FBgn0004904, respectively) against the corresponding genomic scaffolds. All predicted insertions that were caused exclusively by homology between the polyglutamine motifs and did not involve any other part of the elements were considered spurious and eliminated from our data.

Results

Distribution of Elements in the Genome

A proportion of element sequences in Release 2 correspond to a consensus, rather than the true sequence. This may introduce some uncertainty with regard to length estimates of TE insertions. As a consequence, TE abundance in the genome was estimated in two separate ways: (1) by measuring the proportion of DNA derived from TEs in each 50 kb interval of the sequence and (2) by counting the number of TE insertions along the sequences of the chromosomes.
These two measures complement each other; the proportion of TE-derived DNA gives a description of the composition of the genomic DNA, and the number of insertions provides an insight into where elements are more likely to persist in the long term. In addition, we used a broad criterion for complete copies, considering as such those which corresponded to over 80% of the full element sequence (Duret, Marais, and Biémont 2000). We have identified 613 insertions of elements from 39 different families (table 1). Retrotransposons are the most abundant class and account for 64% of the total number of TE insertions. The most represented family is 1360, with 87 copies. Overall, 17% of them corresponded to complete copies. This fraction is significantly higher among transposons (22%) than retroelements (12%) and among those inserted on the X and 4 than on the 2R. Given the problems associated with the annotation of TE sequences referred to above, these estimates should be viewed with caution. This low proportion of full-length insertions is in agreement with previous observations on a short intergenic sequence of maize (SanMiguel et al. 1998) and the genome of C. elegans (Duret, Marais, and Biémont 2000).

TEs contribute a significant fraction, 2.1 ± 0.18% (mean of the estimates for each 50 kb interval ± standard error [SE]), of the sequenced genome of D. melanogaster. This contribution is not homogeneous because it varies significantly among and within the chromosome arms. The fraction of TE-derived DNA in chromosome 4 (6.3 ± 0.67%) is three times higher than in 2R (2.1 ± 0.29%), its contribution to the X chromosome being the lowest (1.8 ± 0.20%). In addition, the observed density of TE-derived DNA is higher at both ends than in the middle sections of the major chromosomes (fig. 1).

The distribution of the numbers of TE insertions shows a similar pattern (table 2). There is an excess of elements on chromosome 4 and a deficit on the X, considering their relative sizes. These deviations from randomness are not exclusive to any particular TE type and were detected in the distribution of the three TE classes: DNA-based elements (transposons), retroelements with LTRs, and retroelements without LTRs.

One way to quantify the accumulation of TEs on chromosome 4 is to compare copy numbers on this chromosome with those found in other autosomal regions, where the rate of recombination is not reduced, i.e., the highly recombining portion of 2R. Chromosome 4 harbors 49% of the total number of insertions found in these two regions (98/200, table 5), but it only represents 6% of their total DNA sequence (1,165 out of 17,605 kb, table 3). In other words, TE abundance in chromosome 4 is eight times higher than in regions of nonsuppressed recombination of 2R. Similarly, it can be shown that this accumulation affects all TE classes but is especially intense for DNA-based elements (a greater than 12-fold increase on 4) and is least severe for LTR retroelements (a slightly over twofold enrichment).

The accumulation of TEs on 4 may be driving the highly significant values of the chi-square goodness-of-fit tests in table 2, raising the question of whether or not elements are evenly distributed between the X and 2R. In a similar analysis excluding chromosome 4, a significant deficit of elements on the X was detected (χ² = 19.3; P < 0.001). Interestingly, although the deficit was observed for all the TE classes, it was statistically significant only for LTR-retroelements (χ² = 17.3, P < 0.001), deviations among DNA transposons and non-LTR retroelements being nonsignificant (P < 0.10 in both cases).

Table 1
Profile of Element Insertions on Chromosomes 2R, X, and 4

TE FAMILY                        POOLED N   POOLED FL
Transposons
  1360                                 87        0.17
  Bari 1                               13        0.38
  Foldback                             35        0.17
  HB1                                  32        0.16
  Hobo                                 10        0.10
  Hopper                                8        0.50
  Pogo                                 12        0.08
  S element                            24        0.17
Retroelements with LTRs
  17.6                                 14        0.00
  297                                  29        0.07
  412                                  17        0.59
  1731                                  3        0.00
  Aurora                                4        0.00
  BEL                                  13        0.15
  Blastopia                            16        0.13
  Blood                                 3        0.00
  Burdock                              11        0.09
  Copia                                10        0.40
  Gypsy                                11        0.00
  mdg1                                 14        0.29
  mdg3                                  8        0.25
  Micropia                              1        0.00
  Opus                                 14        0.00
  RTLE*                                17        0.06
  Roo                                  27        0.52
  Tirant                                8        0.00
  ZAM                                   6        0.33
Retroelements without LTRs
  BS                                   15        0.00
  D element                             9        0.00
  Doc                                  28        0.29
  FW                                   24        0.13
  G element                            14        0.07
  Helena                                1        0.00
  Het-A                                 1        0.00
  I factor                             22        0.14
  Jockey                               34        0.12
  RI                                   16        0.00
  RII                                   1        0.00
  Tart-B1                               1        0.00
Total                                 613        0.17

Chromosome totals: 2R, N = 297, FL = 0.12; X, N = 218, FL = 0.22; 4, N = 98, FL = 0.21.

Per-family cells of the 2R, X, and 4 columns (N and FL runs in extraction order, empty cells omitted):
N:  34 5 14 8 5 4 5 9 7 8 9 1 3 7 9 2 5 6 5 9 3 1 13 12 10 2 4 8 3 24 17 12
FL: 0.03 0.40 0.00 0.13 0.00 0.50 0.20 0.11 0.00 0.00 0.67 0.00 0.00 0.00 0.00 0.00 0.00 0.33 0.00 0.33 0.00 0.00 0.00 0.08 0.60 0.00 0.25 0.00 0.00 0.21 0.06 0.08
N:  15 2 15 10 5 4 7 9 7 19 8 2 1 6 7 1 4 4 5 4 5
FL: 0.07 0.50 0.40 0.20 0.20 0.50 0.00 0.22 0.00 0.11 0.50 0.00 0.00 0.33 0.29 0.00 0.00 0.50 0.00 0.25 0.40
N:  38 6 6 14
FL: 0.34 0.33 0.00 0.14
N:  6
FL: 0.17
N:  2
FL: 0.00
N:  2
FL: 0.50
N:  1 1
FL: 0.00 0.00
FL: 0.00
N:  1
N:  1
N:  1 5 1
FL: 0.00 0.00 0.00
FL: 0.00 0.14 0.00 0.00
FL: 0.00 0.00 0.53 0.00 0 0.00 0.00 1.00 1.00 0.00 0.00 0.00 0.20 0.14 0.00
N:  2
N:  10 14 8 1
N:  1 5 15 6 1 7 6 3 2 1 1 1 10 14 5
N:  2 6 3
FL: 0.50 0.00 0.00
N:  1
FL: 0.00

NOTE.—N is the number of insertions and FL the fraction of full length copies.
* Retrotransposon-like element (GenBank accession number: AJ010298).

FIG. 1.—Fraction (%) of TE-derived DNA along the genomic sequences of chromosomes 2R, X, and 4 of D. melanogaster.

Table 2
Distribution of TE Insertions Among Chromosomes 2R, X, and 4

                            CHROMOSOME
                        2R        X        4      P
DNA-TEs           O     84       67       70    ***
                  E  103.9    111.2      5.9
RTs, non-LTR      O     98       51       19    ***
                  E   78.9     84.5      4.5
RTs, LTR          O    115      100        9      *
                  E  105.3    112.7      6.0
All families      O    297      218       98    ***
                  E  288.1    308.5     16.5

NOTE.—O: Observed; E: Expected; χ² test: *P < 0.05, **P < 0.01, ***P < 0.001.

Elements Accumulate in Regions of Low Recombination

In D. melanogaster the rate of recombination is strongly reduced near the centromere and at the tips of the major chromosomes as well as along the whole fourth chromosome (Hochman 1976; Lindsley and Sandler 1977; Ashburner 1989, pp. 451–496). In order to determine if there is any association between the TE distribution and the local recombination rate, we compared TE abundances in regions with different rates of recombination. On the basis of the gradient of recombination frequency with respect to physical position along the chromosomes described by Charlesworth (1996), we divided the annotated sequence from chromosomes X, 2R, and 4 into three different categories. They are (1) high recombination rate regions: where recombination map distances and physical separation are related by a linear function with a constant of proportionality greater than or equal to 0.6 (chromosomal regions 3 and 4 from X and 8 from 2R in the notation of Charlesworth 1996), (2) null recombination regions: virtually nonrecombining regions, usually at the extreme tip and bases of the major chromosomes (chromosomal regions 1 and 6 from X and 6 from 2R according to Charlesworth 1996, and the whole fourth chromosome), and (3) reduced recombination rate regions: chromosomal regions where the recombination rate is reduced compared with the high recombination rate regions but is still detectable (regions 2 and 5 from X and 6, 7, and 9 from 2R, Charlesworth 1996). The cytological and genetic boundaries, sequence intervals, and length (in kb) of the different regions are given in table 3. The first genes mapped to each of the chromosomal bands used as cytological landmarks were arbitrarily designated as the boundaries between regions in the annotated sequence. The null, reduced, and high regions represented 5%, 16%, and 79% of the DNA sequences analyzed, respectively.

Our data suggest a strong negative association between element abundance and the local recombination rate. In regions of null recombination, TEs make up 8.8 ± 1.2% of the DNA, a fraction significantly higher than in regions where recombination is reduced (3.3 ± 0.6%) or is high (1.4 ± 0.1%) (t-test, P < 0.05 in all comparisons) (fig. 2a). Table 4 shows the relative contribution of TEs to genomic DNA in the three regions of each major chromosome. The fraction of TE-derived DNA is significantly higher in the null recombination regions (pooled data), although this effect seems to be weaker on the X. However, it should be noted that the null recombination region for this chromosome is only 200 kb long (table 3), which allowed for only four estimates to be taken (one in each 50 kb interval, see Materials and Methods), causing the high variance in their mean.

A similar pattern was detected in the analysis of the number of TE insertions (fig. 2b). Element insertions are significantly more abundant than random expectation on chromosome 4 (table 2) and are underrepresented in the highly recombining regions of 2R and X (table 5). Furthermore, this effect is not exclusively caused by the high element numbers in the null regions. When data from the null regions are eliminated, a significant excess of TE insertions in regions of reduced versus high recombination is still detected for both chromosomes. This deviation was observed in the distributions of elements from the three different classes (χ² goodness-of-fit test, P < 0.001 in all cases).

Table 3
Division of the Annotated Sequence of Chromosomes 2R, X, and 4 into Regions of Null, Reduced, and High Recombination Rates

2R
  Recombination rate            Null        Reduced     High            Reduced
  Regions included              40A1–41E1   41E1–42F3   42F3–59F8       59F8–60F5
  Sequence available            41A1–41E1   41E1–42F3   42F3–59F8       59F8–60F3
  Gene used as boundary         CG17482     CG11665     pk              CG9850
  Boundaries in kb              1–950       951–2250    2,251–18,700    18,701–20,234
  Fraction of arm represented   0.047       0.064       0.813           0.076
X
  Recombination rate            Null        Reduced     High            Reduced          Null
  Regions included              1A1–1B4     1B4–3C2     3C2–19D3        19D3–20C1        20C1–20F
  Sequence available            1A5–1B4     1B4–3C2     3C2–19D3        19D3–20B1
  Gene used as boundary         CG3038      l(1)1Bb     CG2716          mal
  Boundaries in kb              1–200       201–2500    2,501–20,050    20,051–21,668
  Fraction of arm represented   0.009       0.106       0.810           0.075
4
  Recombination rate            Null
  Regions included              101C–102F8
  Sequence available            101F–102F8
  Gene used as boundary         plexB–CG11231
  Boundaries in kb              1–1156
  Fraction of arm represented   1.00

FIG. 2.—TE abundance with respect to recombination rate in chromosomes 2R, X, and 4 of D. melanogaster: a, percentage of TE-derived DNA; b, average number of element insertions, measured in 50 kb intervals. Error bars indicate the standard errors.

Gene Density and Recombination Rate

One possible explanation for the observed accumulation of TEs in regions of reduced recombination could be that there are fewer genes in these regions, and elements are therefore less likely to cause deleterious effects by inserting into genes or regulatory sequences. To test this hypothesis, we studied the fractions of coding, intronic, and intergenic DNA in the different chromosomal regions. On chromosomes 2R and X, gene density is positively correlated with recombination rate. As shown in table 6, in regions where recombination is not suppressed (reduced and high recombination regions), the fraction of coding and intronic DNA is much larger than that in the null regions. This increase is caused by a greater number of genes per unit length of genomic DNA, rather than by changes in the average size of genes (coding plus intronic DNA), which does not vary significantly among regions. This implies that the fraction of intergenic DNA in these regions is significantly smaller than in those where recombination is suppressed, and so there are fewer places for elements to insert without causing a detectable deleterious effect on the host. However, the distribution of coding and noncoding sequences on chromosome 4 does not fit this pattern. Despite the lack of recombination, coding DNA density in this chromosome is similar to that in the high recombination regions of 2R and X (table 6). In this case, the effect is not caused by a greater number of genes but by longer ones. Both introns and coding sequences from this chromosome are, on average, significantly longer than those from genes located on 2R or X. In fact, the average length of a chromosome 4 gene is 7.0 ± 0.85 kb, almost twice the size of genes from the 2R and X (3.6 ± 0.13 and 3.4 ± 0.12 kb, respectively) (t-test, P < 0.001 in both cases). Therefore, the buildup in TE numbers observed on this chromosome is not related to a greater fraction of intergenic DNA.
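The gene-length comparison above reports only means and standard errors, not the sample sizes or the t-test variant used. As a rough plausibility check, and not a reconstruction of the authors' test, the sketch below computes an unequal-variance (Welch-type) statistic directly from the reported summary values. The function name is hypothetical, and the threshold of 3.29 is the two-sided 0.001 cutoff of the standard normal distribution, which is only an approximation when the number of genes is small.

```python
from math import sqrt


def welch_statistic(mean_a, se_a, mean_b, se_b):
    """Difference of two means divided by the combined standard error."""
    return (mean_a - mean_b) / sqrt(se_a ** 2 + se_b ** 2)


# Reported average gene lengths in kb (mean, SE) from the text.
CHR4 = (7.0, 0.85)
CHR2R = (3.6, 0.13)
CHRX = (3.4, 0.12)

t_4_vs_2r = welch_statistic(*CHR4, *CHR2R)
t_4_vs_x = welch_statistic(*CHR4, *CHRX)

# Large-sample (normal) two-sided cutoff for P = 0.001; an assumption here,
# since the degrees of freedom are not given in the text.
Z_CUTOFF_P001 = 3.29

both_exceed_cutoff = t_4_vs_2r > Z_CUTOFF_P001 and t_4_vs_x > Z_CUTOFF_P001
```

Both statistics come out close to 4, which is in line with the reported significance level under the stated approximation.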

Table 4
Fraction (%) of TE-Derived DNA in Genomic Regions with Different Rates of Recombination

                                        2R                        X                 4            POOLED
                                Null  Reduced  High      Null  Reduced  High      Null     Null  Reduced  High
Percentage of TE-derived DNAa   11.8     3.7    1.3       9.0     2.9    1.4       6.3      8.8     3.3    1.4
SE                              2.28    1.21   0.20      5.61    0.66   0.19      1.15     1.21    0.65   0.14

a Average of the values obtained from each 50 kb interval in which the different regions were divided; SE: Standard error.

Table 5
Distribution of TE Insertions with Respect to Recombination Rate

                             Null     Reduced     High      P
2R
  DNA-TEs           O          40        17         27     ***
                    E         3.9      11.8       68.3
  RTs, non-LTR      O          53        17         28     ***
                    E         4.6      13.7       79.7
  RTs, LTR          O          47        21         47     ***
                    E         5.4      16.1       93.5
  All families      O         140        55        102     ***
                    E        13.9      41.6      241.5
X
  DNA-TEs           O           1        33         33     ***
                    E         0.6      12.1       54.3
  RTs, non-LTR      O           0        14         37      NS
                    E         0.5       9.2       41.3
  RTs, LTR          O           3        30         67     ***
                    E         0.9      18.1       81.0
  All families      O           4        77        137     ***
                    E         2.0      39.4      176.6
4
  DNA-TEs           O          70
  RTs, non-LTR      O          19
  RTs, LTR          O           9
  All families      O          98
POOLED
  DNA-TEs           O         111        50         60     ***
                    E        11.8      34.7      174.5
  RTs, non-LTR      O          72        31         65     ***
                    E         9.0      26.3      132.7
  RTs, LTR          O          59        51        114     ***
                    E        12.0      35.1      176.9
  All families      O         242       132        239     ***
                    E        32.8      96.1      484.0

NOTE.—O: Observed; E: Expected; χ² test: *P < 0.05, **P < 0.01, ***P < 0.001; NS: not significant.
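The expected counts in tables 2 and 5 are consistent with a null hypothesis in which insertions are distributed in proportion to sequence length. The sketch below illustrates this for the "All families" row of 2R in table 5, taking region lengths from the kb boundaries in table 3. It is an assumption that this is exactly how the expected values were obtained, although the computed values round to the tabulated 13.9, 41.6, and 241.5. The helper names are hypothetical; 13.816 is the standard χ² critical value for 2 degrees of freedom at P = 0.001.

```python
def expected_counts(observed, lengths_kb):
    """Expected counts if insertions fall in proportion to region length."""
    total = sum(observed)
    total_len = sum(lengths_kb)
    return [total * length / total_len for length in lengths_kb]


def chi_square(observed, expected):
    """Pearson goodness-of-fit statistic."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# 2R, all families (table 5): null, reduced, high.
observed_2r = [140, 55, 102]

# Region lengths in kb from the table 3 boundaries:
# null 1-950; reduced 951-2250 plus 18,701-20,234; high 2,251-18,700.
lengths_2r = [950, 1300 + 1534, 16450]

expected_2r = expected_counts(observed_2r, lengths_2r)
chi2_2r = chi_square(observed_2r, expected_2r)

# Chi-square critical value, 2 degrees of freedom, P = 0.001.
CRIT_DF2_P001 = 13.816
significant_2r = chi2_2r > CRIT_DF2_P001
```

The same two functions can be applied to any other row of tables 2 and 5 by substituting the observed counts and the corresponding region or chromosome lengths.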

TE Contribution to Coding and Noncoding DNA

The vast majority of TEs are inserted into noncoding sequences. As expected, given the known strong deleterious effects of insertions into coding sequences, the TE contribution to this fraction of the genomic DNA is negligible. In fact, only one insertion out of the 613 detected occurred into a coding sequence: a D element in the 3′ end of BcDNA:GH06193 (Flybase ID: FBgn0027581), a gene of unknown function located on band 50B of 2R. This implies that the observed accumulation of elements takes place exclusively in the noncoding DNA. But noncoding DNA can be divided into two classes: (1) introns, which are transcribed but spliced out before translation, and (2) intergenic DNA, much of which may be devoid of any functional sequences other than regulatory motifs related to gene expression. To check if they are subject to different functional constraints that may affect the accumulation of TEs, we have analyzed separately the relative contributions of element insertions. Our results suggest that the overall fraction of TE-derived DNA in intergenic sequences is about twice as large as in introns (table 7) and that this difference is smaller in the null recombination regions (in the null recombination region of 2R, elements are as abundant in introns as in the intergenic DNA).

Table 6
Gene Density with Respect to Recombination Rate

                                              2R                        X                 4
                                      Null  Reduced  High      Null  Reduced  High      Null
Coding DNA (%)a                        6.5    23.0   21.9      12.0    18.1   18.1      18.1
  SE                                  1.72    2.14   0.81      6.00    1.73   0.81      2.57
Intronic DNA (%)a                     11.4    22.8   24.6       8.7    13.9   18.0      30.9
  SE                                  4.09    2.46   1.56      6.24    1.92   1.23       4.8
Intergenic DNA (%)a                   82.1    54.2   53.5      79.3    68.0   63.9      51.0
  SE                                  5.49    3.77   1.86     12.24    3.06   1.67      6.83
Average number of genes per 50 kb      2.2     6.8    6.3       3.8     5.1    5.2       3.5
  SE                                  0.39    0.64   0.37      1.18    0.45   0.45      0.36
Average coding DNA per gene (kb)       1.5     1.7    1.7       1.6     1.8    1.7       2.6
  SE                                  0.00    0.06   0.03      0.35    0.08   0.04      0.33
Average intronic DNA per gene (kb)     2.6     1.7    1.9       1.2     1.4    1.7       4.4
  SE                                  0.83    0.21   0.13      0.53    0.19   0.12      0.66

a Average of the values obtained from each 50 kb interval in which the different regions were divided; SE: Standard error.

Table 7
Fraction (%) of TE-Derived DNA in Noncoding Sequences

                           2R                        X                 4            POOLED
                   Null  Reduced  High      Null  Reduced  High      Null     Null  Reduced  High
Noncoding DNA      12.1     3.9    1.7      10.2     3.7    1.8       8.6     10.3     3.8    1.7
Introns            11.1     3.2    0.7       0.0     0.8    1.1       4.9      6.1     2.1    0.8
Intergenic DNA     12.3     5.5    2.2      11.3     4.2    1.9      10.9     11.7     4.7    2.0

Discussion

The analysis of the complete genomic DNA sequence of D. melanogaster introduces a new perspective to the study of the parameters that describe the dynamics of TEs in natural populations. The complete sequence represents an instant picture of a reference genome from which one can extract information at an unmatched level of resolution. However, this source of information has its own limitations, which must be discussed in order to correctly interpret the data. The two aspects that most strongly influence the utility of the released sequence with regard to the study reported here concern the assembly of the genomic sequence. First, as explained at length by Myers et al. (2000), repetitive DNA increases the difficulty of obtaining a consensus sequence. Thus, genomic regions rich in this type of sequence, such as the pericentric heterochromatin, have not been resolved yet. In fact, the two releases made available up to now essentially cover only the euchromatin (Adams et al. 2000; see Materials and Methods for a detailed description of the genomic regions included in this analysis). In addition, the presence of some islands of complex DNA, which are difficult to assemble, has left some gaps in the euchromatic sequence. This question has been addressed by Benos et al. (2001), who found that some of these gaps correspond to TE insertions. At least 2 out of the 10 gaps detected in the 2.6 Mb surveyed by Benos et al. (2001) were indeed TE insertions (the results for four others were ambiguous, and the remaining four gaps did not involve TEs). In order to determine the consequences of not including the gaps in our study, we performed additional analyses treating them as a separate TE family and compared the results with those obtained from the true TE insertions only (see earlier). This is a very conservative procedure because at least 40% of the gaps studied by Benos et al. (2001) are not TEs and could distort the overall distribution pattern. However, our comparisons revealed that the gaps are distributed over different genomic regions in a pattern similar to that of the rest of the TE families, so that their inclusion in the data set would not cause any significant change in the results reported here (table 8).

Table 8
Comparison of the Contributions (% of DNA) from TEs and Sequence Gaps to the Genomic Sequence of D. melanogaster

                         2R                           X                     4                TOTAL
              Null  Reduced  High  Whole   Null  Reduced  High  Whole     Null    Null  Reduced  High  Whole
TEs           11.8     3.7    1.3    1.8    9.0     2.9    1.4    2.1      6.1     8.6     3.3    1.4    2.1
Gaps          13.9     1.0    1.0    1.6    0.0     1.7    1.6    1.6      2.3     6.9     1.4    1.3    1.6
TEs + Gaps    25.6     4.6    2.3    3.4    9.0     4.7    3.0    3.7      8.4    15.5     4.7    2.7    3.7

TE Abundance

TEs account for up to 2% of the euchromatic genome of D. melanogaster. This estimate is significantly lower than those from in situ or Southern blotting studies, which are between 6% and 8% (Charlesworth, Jarne, and Assimacopoulos 1994; Maside et al. 2001). This difference is probably because the latter methods can only provide indirect estimates of the TE content of a particular sequence, usually by considering each insertion to represent one full element. This practice is very likely to inflate the estimates of the TE content in terms of amount of DNA, given that, as suggested in the present study, a reduced proportion of the insertions (less than 20%) correspond to complete copies. This new lower estimate suggests that TEs are even more abundant in the heterochromatin than previously suspected (Charlesworth, Jarne, and Assimacopoulos 1994). Assuming that the overall content of TE-derived DNA in the fly genome is close to 7.5% (Manning, Schmid, and Davidson 1975; Young 1979) and that two-thirds of the genome correspond to the euchromatin (Adams et al. 2000), it follows that over 18% of the heterochromatin may be composed of TEs (solving 7.5% = (2/3)(2.1%) + (1/3)h for the heterochromatic TE fraction h gives h ≈ 18.3%). This reinforces the view that TEs accumulate intensively in this fraction of the genome (Charlesworth, Jarne, and Assimacopoulos 1994; Pimpinelli et al. 1995; Junakovic et al. 1998; Maside et al. 2001). If the sequence gaps were considered as TEs, and included in this calculation, the fraction of TE-derived DNA would be 3.7% and 15.1% for the euchromatin and heterochromatin, respectively.

TE Distributions and Selection

We have found a strong negative correlation between the local rate of recombination and TE abundance (fig. 2). Elements accumulate in genomic regions where recombination is rare or absent: chromosome 4 and the null and low recombination regions of the X and 2R (tables 4 and 5 and fig. 2). This is consistent with other evidence that natural selection against the deleterious effects of TE insertions is the main force checking element spread (Charlesworth, Lapid, and Canada 1992; Vieira and Biémont 1996; Biémont et al. 1997b; Nuzhdin, Pasyukova, and Mackay 1997; Maside, Assimacopoulos, and Charlesworth 2000; Boissinot, Entezam, and Furano 2001; Guerreiro and Fontdevila 2001; Maside et al. 2001). This relationship between TE abundance and recombination rate can be explained either by a lower efficiency of selection against deleterious mutational effects of TE insertion in genomic regions where recombination is infrequent (the Hill-Robertson effect, see references above) or by a reduction in the rate of elimination of elements by ectopic exchange in such regions (Langley et al. 1988; Charlesworth, Lapid, and Canada 1992). In addition, most of the elements are inserted into intergenic sequences; of those found in genes, the vast majority are in the introns, where they are expected to have only weak deleterious mutational effects (table 7). This must reflect removal of elements because of mutational effects and is consistent with the finding from restriction fragment polymorphism studies that TEs in natural populations are rarely found to be inserted into or near coding sequences (Charlesworth and Langley 1991). There is less discrimination against intronic insertions in the null recombination regions (table 7), suggesting a reduction in the pressure of selection against TE insertions in these regions (see Results). This could, of course, reflect either the effects of ectopic exchange or of Hill-Robertson effects on insertional mutations.

An alternative way to test the hypothesis of selection against TE insertions is to compare the relative abundance of TEs on the X chromosome versus the autosomes under the assumption that there will be stronger selection against X-linked than autosomal insertions (Montgomery, Charlesworth, and Langley 1987; Langley et al. 1988) and hence a deficit of elements on the X with respect to random expectations. This is expected to occur under both the deleterious insertional mutation and ectopic exchange models, although the quantitative predictions of relative equilibrium abundances are model dependent (Langley et al. 1988; Charlesworth, Lapid, and Canada 1992). We found a significant deficit of TEs on the X as compared with 2R: the X represents 52% of the DNA in the two chromosomes, but it only contains 42% of the elements (tables 2 and 3; χ² = 19.3, P < 0.001). This effect was evident for all three TE classes, although it only reached significant levels for non-LTR retroelements (χ² = 18.9, P < 0.001). The observed proportion of X-linked insertions was also compared with the theoretical expectations generated by four alternative models (Langley et al.
1988; table 7, these estimates were corrected by considering the relative size of the X with respect to 2R). Interestingly, the observed 42% of X-linked insertions fits the expectations under the two versions of the ectopic exchange model, particularly that from the less restrictive one (expected value: 41.3%), which assume unequal exchange between TEs located anywhere in the genome. In addition, it is significantly different from the predictions under the insertional mutation and neutral models (33.6% and 51.7%, respectively; P , 0.001 in a chisquare of goodness of fit, in both cases). Additional evidence for stronger selection on Xlinked insertions comes from the observation that the proportion of full-length copies on the X is twice that of those on 2R (table 1). Because the long-term persistence of an element insertion means that it is subject to processes such as DNA loss (Petrov et al. 2000), one might expect that the sizes of insertions would be negatively correlated with their age. The longer mean length of X-linked insertions is thus consistent with a higher rate of element turnover on the X because of stronger selection against insertions on this chromosome. However, given the uncertainty concerning the length estimates of some elements, this conclusion should be treated with caution.
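The comparison of the observed X-linked share with the model expectations can be sketched as a chi-square goodness-of-fit test with one degree of freedom. The sketch below is illustrative only: the 218 X-linked insertions and the expected percentages are taken from the text, but the combined X + 2R total of 519 is an assumption back-calculated from the rounded 42% figure (218/0.42), not a count read from tables 2 and 3, and the function and variable names are hypothetical.

```python
import math


def chi_square_two_class(obs_x, n_total, expected_fraction_x):
    """Goodness-of-fit chi-square (1 df) for X-linked vs. 2R insertion counts."""
    exp_x = n_total * expected_fraction_x
    exp_2r = n_total - exp_x
    obs_2r = n_total - obs_x
    chi2 = (obs_x - exp_x) ** 2 / exp_x + (obs_2r - exp_2r) ** 2 / exp_2r
    # With 1 df the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


OBS_X = 218    # X-linked insertions quoted in the text
N_TOTAL = 519  # ASSUMPTION: 218 / 0.42 rounded; not the paper's actual total

# Expected X-linked fractions quoted in the text for three of the models.
MODELS = {
    "ectopic exchange (less restrictive)": 0.413,
    "insertional mutation": 0.336,
    "neutral": 0.517,
}


def compare_models():
    """Return {model name: (chi-square, P value)} for the assumed counts."""
    return {
        name: chi_square_two_class(OBS_X, N_TOTAL, fraction)
        for name, fraction in MODELS.items()
    }


for name, (chi2, p_value) in compare_models().items():
    print(f"{name}: chi2 = {chi2:.2f}, P = {p_value:.3g}")
```

Under this assumed total, only the ectopic exchange expectation stays below the 5% critical value of 3.84, which agrees qualitatively with the conclusions stated above; the exact statistics depend on the true counts in tables 2 and 3.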


TE Distribution and Gene Density

Under the insertional mutation model, TEs are expected to be more abundant in regions with lower gene density because this is where they are less likely to insert into or near a gene. Assuming a positive association between gene density and recombination rate, this would explain the observed accumulation of elements in regions of low recombination. In contrast to this prediction, we have found that TEs accumulate on the fourth chromosome as well as in the reduced recombination regions of 2R and X, but gene density in these regions is not reduced as compared with that for the highly recombining segments (table 6). This observation suggests that, although gene density is associated with recombination rate, the buildup in TE numbers in regions of low recombination is not necessarily related to gene density. This is in good agreement with the ectopic exchange model, which proposes that TEs accumulate where recombination is greatly reduced but does not predict any effect of gene density.

However, there is one piece of evidence from the in situ data (Charlesworth, Lapid, and Canada 1992) that does not fit the predictions of the ectopic exchange model: the observed lack of accumulation of elements at the tip of the X, a region of greatly reduced crossing-over (Lefevre 1971; Padilla and Nash 1977; Kliman and Hey 1993; Sniegowski, Pringle, and Hughes 1994; see also Charlesworth 1996 for a summary of recombination rate data in this region). To investigate whether this pattern was also present in our data set, we analyzed separately the distribution of elements in the reduced recombination regions at the bases and tips of 2R and X.
As shown in table 9, the observed accumulation of elements in the reduced recombination region of the X is caused exclusively by a significant buildup in TE numbers in the basal portion, whereas the reduced recombination region at the tip contains 22 of the 218 X-linked insertions, precisely the number expected under the hypothesis of a random distribution, considering the fraction of the chromosome it represents (0.12, table 3). This lack of accumulation at the tip also affects the short null recombination region between bands 1A5–1B4 (table 5). Interestingly, data from the tip of 2R suggest that this effect cannot be directly attributed to the high gene density detected in this region, given that the tip of 2R displays a comparatively high gene density and that this is not an obstacle to the significant accumulation of TEs. The most plausible explanation is that the rate of ectopic exchange is not reduced at the tip of the X, as has been found to be the case in yeast subtelomeric regions (Haber et al. 1991). At present, there is no independent support for this interpretation as far as Drosophila is concerned. The lack of accumulation of TEs at the tip of the X is, of course, inconsistent with the interpretation of element accumulation in low recombination regions in terms of Hill-Robertson effects.

We further note that the larger size of coding sequences and introns on chromosome 4 cannot be explained by enrichment in TE-derived DNA. There are no insertions into coding sequences, most TEs on chromosome 4 are inserted into intergenic regions, and the contribution of TEs to chromosome 4 introns is low compared with their contribution to the much smaller introns in the null recombination region of 2R (4.9% vs. 11.1%, table 7). A negative correlation between intron size and recombination rate has been pointed out before (Comeron and Kreitman 2000; Clark 2001); our results suggest that this mainly reflects the large size of introns on chromosome 4. Large introns and coding sequences also seem to be characteristic of Y chromosomal genes in Drosophila (Reugels et al. 2000), another region of null recombination. The reasons for this are obscure. Comeron and Kreitman (2000) suggest that it may reflect selection for an increase in recombination frequencies, achieved by increasing the size of genes, in regions where the rate of recombination per nucleotide is low. This seems unlikely to apply to chromosome 4 and the Y chromosome, where crossing-over is thought to be absent under normal conditions, although recent data on DNA sequence polymorphism suggest the occurrence of some recombinational exchange on chromosome 4 (Jensen, Charlesworth, and Kreitman 2002; Wang et al. 2002). Another possibility is that forces which lead to the accumulation of repetitive sequences are more effective in regions of low recombination (Charlesworth, Jarne, and Assimacopoulos 1994), so that these sequences can accumulate within introns more easily. Similarly, the Hill-Robertson effect may mean that selection is less effective in opposing increases in gene length in regions of low recombination. Consistent with this, Akashi (1996) has shown that mean protein lengths in D. melanogaster are larger than in D. simulans, possibly reflecting the reduced effective population size of D. melanogaster.

Table 9
TE Distributions and Gene Density in the Regions of Reduced Recombination at the Bases and Tips of Chromosomes 2R and X

                                        2R                      X
                                Base    Tip     P       Base    Tip     P
Fraction of region (%)          45.9    54.1            41.3    58.7
TEs, observed (O)               25      30      NS      55      22      ***
TEs, expected (E)               25.2    29.8            31.8    45.2
Fraction of coding DNA (%)a     18.0    27.2            11.3    22.9
SE                              2.3     3.3             2.2     2.3

NOTE: SE: standard error; NS: not significant; *** P < 0.001 for χ² test.
a Average of the values obtained from each 50-kb interval into which the two regions were divided.
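The base-versus-tip comparison reported in table 9 can be reproduced in a few lines: the insertions found in the reduced recombination regions of each chromosome are distributed between base and tip in proportion to the fraction of the region that each represents, and the observed counts are compared with these expectations. The counts and fractions below are those of table 9; the function and variable names are hypothetical, and the P value relies on the one-degree-of-freedom identity for the chi-square tail, which may differ from the software the authors used.

```python
import math

# Observed insertions and region fractions as (base, tip), from table 9.
TABLE_9 = {
    "2R": {"observed": (25, 30), "fraction": (0.459, 0.541)},
    "X": {"observed": (55, 22), "fraction": (0.413, 0.587)},
}


def base_tip_test(observed, fraction):
    """Return expected counts, chi-square (1 df) and P for base vs. tip."""
    total = sum(observed)
    # Expected counts are proportional to the share of the region's length.
    expected = tuple(total * f for f in fraction)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail, 1 df
    return expected, chi2, p_value


for arm, row in TABLE_9.items():
    expected, chi2, p_value = base_tip_test(row["observed"], row["fraction"])
    rounded = [round(e, 1) for e in expected]
    print(f"{arm}: expected = {rounded}, chi2 = {chi2:.2f}, P = {p_value:.2g}")
```

The rounded expectations match the E row of the table, and only the X shows a departure from proportionality, driven by the excess of insertions at its base.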
In addition, although TEs form a significant proportion of the intergenic DNA (tables 4 and 7), and the greater proportion of intergenic DNA in regions with low recombination in part reflects the greater abundance of TEs in these regions, the contribution of TEs to intergenic sequences is never a major one. In particular, the buildup of TEs on chromosome 4 contrasts with the relatively large genic contribution to this chromosome. The picture is very different from that for maize or humans, where TE-derived sequences form a major part of the intergenic regions (SanMiguel et al. 1998; Lander et al. 2001), and suggests that selection is much less powerful in removing TE insertions in these species. Again, the reasons for these differences are obscure.

TE Distributions in Other Species

The patterns described previously contrast with the distribution of TEs in the C. elegans genome, where transposons are preferentially located in regions of high recombination (which correspond to regions of relatively low gene density), whereas the distribution of retroelements is independent of the local recombination rate (Duret, Marais, and Biémont 2000). This has been interpreted as evidence against the ectopic exchange model (Duret, Marais, and Biémont 2000). However, some population genetic considerations should be taken into account before definite conclusions can be drawn. C. elegans is a hermaphroditic species, which is likely to be highly self-fertilizing in nature, so that its evolutionarily effective recombination rate is nearly zero (Nordborg 2000). This means that genomic regions of low recombination will not be more subject to Hill-Robertson effects than the rest of the genome, although the genome as a whole may experience a reduced efficacy of selection because of its reduced effective recombination rate, compared with an otherwise similar outbreeding species (Charlesworth and Wright 2001). In addition, if ectopic exchange occurs primarily among heterozygous elements, as indicated by some experimental evidence (Montgomery et al. 1991), there will be no differences in its rate among regions with different recombination rates in a selfing species. Theoretical studies also show that the breeding system of the host plays an important role in TE dynamics (Charlesworth D and Charlesworth B 1995; Wright and Schoen 1999; Morgan 2001).
In selfing populations, the higher levels of homozygosity are expected to change the selective landscape substantially: ectopic exchange is less likely to be a significant source of selection against insertions, so that elements would be relatively free to accumulate in regions where they do not cause strongly deleterious insertional mutations. At the same time, deleterious mutations caused by TE insertions would be more effectively screened by selection because of their expression in the homozygous state, but this would not be affected by differences in the local recombination rate. This effect would be, at least partly, counteracted by the increased Hill-Robertson effects referred to previously. A population survey of two plant species of the genus Arabidopsis is consistent with these theoretical predictions, suggesting a reduction in the efficacy of natural selection against element insertions in the highly selfing A. thaliana compared with its outbreeding relative A. petraea (Wright et al. 2001). In addition, a genomic analysis of A. thaliana has shown that TEs are more abundant in regions of low gene density, whereas the local rate of recombination plays little role in controlling their distribution (S. I. Wright, N. Agrawal, and T. E. Bureau, personal communication). In the C. elegans genome, gene density is lower in regions of high recombination, providing an explanation for the observed tendency of transposons to accumulate in these regions (Duret, Marais, and Biémont 2000).
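The point made above, that a highly selfing species has an effective recombination rate close to zero everywhere in its genome, can be made concrete with a standard approximation that is assumed here for illustration and is not derived in this paper: with selfing rate s, the equilibrium inbreeding coefficient is F = s/(2 − s), and the effective recombination rate is roughly r(1 − F). The selfing rates and the two recombination rates in the sketch are hypothetical values, not estimates for C. elegans or A. thaliana, and the function names are likewise illustrative.

```python
def inbreeding_coefficient(selfing_rate):
    """Equilibrium inbreeding coefficient F = s / (2 - s) under partial selfing."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing_rate must lie in [0, 1]")
    return selfing_rate / (2.0 - selfing_rate)


def effective_recombination(r, selfing_rate):
    """Approximate effective recombination rate, r * (1 - F)."""
    return r * (1.0 - inbreeding_coefficient(selfing_rate))


# HYPOTHETICAL recombination rates for a "low" and a "high" region.
LOW_R, HIGH_R = 0.001, 0.02

for s in (0.0, 0.5, 0.99):
    low = effective_recombination(LOW_R, s)
    high = effective_recombination(HIGH_R, s)
    print(f"s = {s:.2f}: effective r (low, high) = ({low:.6f}, {high:.6f})")
```

As s approaches 1, both regions collapse toward an effective rate near zero, so the absolute difference between them becomes very small even though their ratio is unchanged under this approximation.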


Conclusions

Although our analysis of the genomic data of D. melanogaster is broadly compatible with selection against the deleterious effects of ectopic exchange being a major force in regulating element abundance, we cannot draw firm conclusions as to whether the relation between recombination and TE abundance in D. melanogaster reflects differences in the rate of ectopic exchange or differences in the intensity of Hill-Robertson effects on deleterious insertional mutations. It is worth noting that although the theoretical basis for predicting the effects of ectopic exchange has been well studied (Langley et al. 1988; Charlesworth, Lapid, and Canada 1992), little work has been done on the Hill-Robertson model in relation to TEs, so that we currently have no firm knowledge of what to expect. The simulations of Charlesworth and Charlesworth (1983) showed only very small effects of differences in recombination rate or effective population size (or both) on the abundances of TEs under the insertional mutation model, but these were very limited in scope.

In any case, it seems unlikely that merely examining the distribution of TEs across a single genome can distinguish between Hill-Robertson effects and ectopic exchange. It is likely that only population surveys offer a prospect of doing this. As pointed out by Charlesworth, Lapid, and Canada (1992) in the specific context of the effects of hitch-hiking by favorable mutations on TE distributions, Hill-Robertson effects (which are formally similar to a reduction in effective population size) should result in a reduction in the proportion of chromosomal sites where TEs are found to be segregating. This is compensated by a higher mean frequency of elements at sites where elements are present, including the likelihood of some fixations of elements (see fig. 4 of Charlesworth and Charlesworth 1983). The ectopic exchange model predicts the opposite of the first of these effects, and fixation is unlikely unless the effective population size has been greatly reduced by Hill-Robertson effects as well. The in situ data discussed by Charlesworth, Lapid, and Canada (1992) provided no support for Hill-Robertson effects, but it would clearly be desirable to reexamine this question. One case of fixation of an element (HB) in an intron of chromosome 4 has now been reported (Jensen, Charlesworth, and Kreitman 2002), so that it would also seem worth examining the sites where elements are found in the genome sequence for evidence of fixations or unusually high frequencies in regions of null or reduced recombination.

Acknowledgments

We thank F. Depaulis for discussions throughout the course of this study and two anonymous reviewers for comments that helped to improve the manuscript. C.B. was supported by an NSF grant to B.C. X.M. was supported by a Marie Curie Postdoctoral Fellowship, and B.C. is supported by the Royal Society.

LITERATURE CITED

ADAMS, M. D., S. E. CELNIKER, R. A. HOLT et al. (190 co-authors). 2000. The genome sequence of Drosophila melanogaster. Science 287:2185–2195.
AKASHI, H. 1996. Molecular evolution between Drosophila melanogaster and D. simulans: reduced codon bias, faster rates of amino acid substitution, and larger proteins in D. melanogaster. Genetics 144:1297–1307.
ANDOLFATTO, P., J. D. WALL, and M. KREITMAN. 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297–1311.
ASHBURNER, M., ed. 1989. Drosophila. A laboratory handbook. Cold Spring Harbor Press, Cold Spring Harbor, New York.
BENOS, P. V., M. K. GATT, L. MURPHY et al. (46 co-authors). 2001. From first base: the sequence of the tip of the X chromosome of Drosophila melanogaster, a comparison of two sequencing strategies. Genome Res. 11:710–730.
BERG, D. E., and M. M. HOWE, eds. 1989. Mobile DNA. American Society for Microbiology, Washington D.C.
BIÉMONT, C., F. LEMEUNIER, M. P. GARCIA GUERREIRO et al. 1994. Population dynamics of the copia, mdg1, mdg3, gypsy, and P transposable elements in a natural population of Drosophila melanogaster. Genet. Res. 63:197–212.
BIÉMONT, C., A. TSITRONE, C. VIEIRA, and C. HOOGLAND. 1997a. Transposable element distribution in Drosophila. Genetics 147:1997–1999.
BIÉMONT, C., C. VIEIRA, C. HOOGLAND et al. 1997b. Maintenance of transposable element copy number in natural populations of Drosophila melanogaster and D. simulans. Genetica 100:161–166.
BOISSINOT, S., A. ENTEZAM, and A. V. FURANO. 2001. Selection against deleterious LINE-1–containing loci in the human lineage. Mol. Biol. Evol. 18:926–935.
BROOKFIELD, J. F. 1991. Models of repression of transposition in P-M hybrid dysgenesis by P cytotype and by zygotically encoded repressor proteins. Genetics 128:471–486.
BROOKFIELD, J. F. 1996. Models for the spread of non-autonomous selfish transposable elements when transposition and fitness are coupled. Genet. Res. 67:199–210.
CÁCERES, M., J. M. RANZ, A. BARBADILLA et al. 1999. Generation of a widespread Drosophila inversion by a transposable element. Science 285:415–418.
CAPY, P., ed. 1997. Evolution and impact of transposable elements. Kluwer, Dordrecht.
CHARLESWORTH, B. 1996. Background selection and patterns of genetic diversity in Drosophila melanogaster. Genet. Res. 68:131–149.


CHARLESWORTH, B., and D. CHARLESWORTH. 1983. The population dynamics of transposable elements. Genet. Res. 42:1–27.
CHARLESWORTH, B., P. JARNE, and S. ASSIMACOPOULOS. 1994. The distribution of transposable elements within and between chromosomes in a population of Drosophila melanogaster. III. Element abundances in heterochromatin. Genet. Res. 64:183–197.
CHARLESWORTH, B., and C. H. LANGLEY. 1991. Population genetics of transposable elements in Drosophila. Pp. 150–76 in R. K. SELANDER, A. G. CLARK, and T. S. WHITTAM, eds. Evolution at the molecular level. Sinauer, Sunderland, Mass.
CHARLESWORTH, B., C. H. LANGLEY, and P. D. SNIEGOWSKI. 1997. Transposable element distributions in Drosophila. Genetics 147:1993–1995.
CHARLESWORTH, B., A. LAPID, and D. CANADA. 1992. The distribution of transposable elements within and between chromosomes in a population of Drosophila melanogaster. II. Inferences on the nature of selection against elements. Genet. Res. 60:115–130.
CHARLESWORTH, B., P. D. SNIEGOWSKI, and W. STEPHAN. 1994. The evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371:215–220.
CHARLESWORTH, D., and B. CHARLESWORTH. 1995. Transposable elements in inbreeding and outbreeding populations. Genetics 140:415–417.
CHARLESWORTH, D., and S. I. WRIGHT. 2001. Breeding systems and genome evolution. Curr. Opin. Genet. Dev. 11:685–690.
CLARK, A. G. 2001. The search for meaning in noncoding DNA. Genome Res. 11:1319–1320.
COMERON, J. M., and M. KREITMAN. 2000. The correlation between intron length and recombination in Drosophila. Dynamic equilibrium between mutational and selective forces. Genetics 156:1175–1190.
DURET, L., G. MARAIS, and C. BIÉMONT. 2000. Transposons but not retrotransposons are located preferentially in regions of high recombination rate in Caenorhabditis elegans. Genetics 156:1661–1669.
EANES, W. F., C. WESLEY, and B. CHARLESWORTH. 1992. Accumulation of P elements in minority inversions in natural populations of Drosophila melanogaster. Genet. Res. 59:1–9.
FLAVELL, R. B. 1986. Repetitive DNA and chromosome evolution in plants. Philos. Trans. R. Soc. Lond. B Biol. Sci. 312:227–242.
GORDO, I., and B. CHARLESWORTH. 2001. Genetic linkage and molecular evolution. Curr. Biol. 11:R684–R686.
GREEN, H., and N. WANG. 1994. Codon reiteration and the evolution of proteins. Proc. Natl. Acad. Sci. USA 91:4298–4302.
GUERREIRO, M. P., and A. FONTDEVILA. 2001. Chromosomal distribution of the transposable elements Osvaldo and blanco in original and colonizer populations of Drosophila buzzatii. Genet. Res. 77:227–238.
HABER, J. E., W. Y. LEUNG, R. H. BORTS, and M. LICHTEN. 1991. The frequency of meiotic recombination in yeast is independent of the number and position of homologous donor sequences: implications for chromosome pairing. Proc. Natl. Acad. Sci. USA 88:1120–1124.
HILL, W. G., and A. ROBERTSON. 1966. The effect of linkage on limits to artificial selection. Genet. Res. 8:269–294.
HOCHMAN, B. 1976. The fourth chromosome of Drosophila melanogaster. Pp. 903–928 in M. ASHBURNER and E. NOVITSKI, eds. The genetics and biology of Drosophila. Academic Press, London.

HOOGLAND, C., and C. BIÉMONT. 1996. Chromosomal distribution of transposable elements in Drosophila melanogaster: test of the ectopic recombination model for maintenance of insertion site number. Genetics 144:197–204.
JENSEN, M. A., B. CHARLESWORTH, and M. KREITMAN. 2002. Patterns of genetic variation at a chromosome four locus of Drosophila melanogaster and D. simulans. Genetics 160:493–507.
JUNAKOVIC, N., A. TERRINONI, C. DI FRANCO et al. 1998. Accumulation of transposable elements in the heterochromatin and on the Y chromosome of Drosophila simulans and Drosophila melanogaster. J. Mol. Evol. 46:661–668.
KAPLAN, N. L., and J. F. Y. BROOKFIELD. 1983. The effect on homozygosity of selective differences between sites of transposable elements. Theor. Popul. Biol. 23:273–280.
KLIMAN, R. M., and J. HEY. 1993. Reduced natural selection associated with low recombination in Drosophila melanogaster. Mol. Biol. Evol. 10:1239–1258.
LANDER, E. S., L. M. LINTON, B. BIRREN et al. 2001. Initial sequencing and analysis of the human genome. Nature 409:860–921.
LANGLEY, C. H., E. MONTGOMERY, R. HUDSON et al. 1988. On the role of unequal exchange in the containment of transposable element copy number. Genet. Res. 52:223–235.
LEFEVRE, G. JR. 1971. Salivary chromosome bands and the frequency of crossing over in Drosophila melanogaster. Genetics 67:497–513.
LINDSLEY, D. L., and L. SANDLER. 1977. The genetic analysis of meiosis in female Drosophila melanogaster. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 277:295–312.
MANNING, J. E., C. W. SCHMID, and N. DAVIDSON. 1975. Interspersion of repetitive and nonrepetitive DNA sequences in the Drosophila melanogaster genome. Cell 4:141–155.
MASIDE, X., S. ASSIMACOPOULOS, and B. CHARLESWORTH. 2000. Rates of movement of transposable elements on the second chromosome of Drosophila melanogaster. Genet. Res. 75:275–284.
MASIDE, X., C. BARTOLOMÉ, S. ASSIMACOPOULOS, and B. CHARLESWORTH. 2001. Rates of movement and distribution of transposable elements in Drosophila melanogaster: in situ hybridization versus Southern blotting data. Genet. Res. 78:121–136.
MONTGOMERY, E., B. CHARLESWORTH, and C. H. LANGLEY. 1987. A test for the role of natural selection in the stabilization of transposable element copy number in a population of Drosophila melanogaster. Genet. Res. 49:31–41.
MONTGOMERY, E. A., S. M. HUANG, C. H. LANGLEY, and B. H. JUDD. 1991. Chromosome rearrangement by ectopic recombination in Drosophila melanogaster: genome structure and evolution. Genetics 129:1085–1098.
MORGAN, M. T. 2001. Transposable element number in mixed mating populations. Genet. Res. 77:261–275.
MYERS, E. W., G. G. SUTTON, A. L. DELCHER et al. 2000. A whole-genome assembly of Drosophila. Science 287:2196–2204.
NEKRUTENKO, A., and W. H. LI. 2001. Transposable elements are found in a large number of human protein-coding genes. Trends Genet. 17:619–621.
NORDBORG, M. 2000. Linkage disequilibrium, gene trees and selfing: an ancestral recombination graph with partial self-fertilization. Genetics 154:923–929.
NUZHDIN, S. V., and T. F. MACKAY. 1995. The genomic rate of transposable element movement in Drosophila melanogaster. Mol. Biol. Evol. 12:180–181.
NUZHDIN, S. V., E. G. PASYUKOVA, and T. F. MACKAY. 1997. Accumulation of transposable elements in laboratory lines of Drosophila melanogaster. Genetica 100:167–175.


PADILLA, H. M., and W. G. NASH. 1977. A further characterization of the cinnamon gene in Drosophila melanogaster. Mol. Gen. Genet. 155:171–177.
PETROV, D. A., T. A. SANGSTER, J. S. JOHNSTON et al. 2000. Evidence for DNA loss as a determinant of genome size. Science 287:1060–1062.
PIMPINELLI, S., M. BERLOCO, L. FANTI et al. 1995. Transposable elements are stable structural components of Drosophila melanogaster heterochromatin. Proc. Natl. Acad. Sci. USA 92:3804–3808.
REUGELS, A. M., R. KUREK, U. LAMMERMANN, and H. BUNEMANN. 2000. Mega-introns in the dynein gene DhDhc7(Y) on the heterochromatic Y chromosome give rise to the giant threads loops in primary spermatocytes of Drosophila hydei. Genetics 154:759–769.
SANMIGUEL, P., B. S. GAUT, A. TIKHONOV et al. 1998. The paleontology of intergene retrotransposons of maize. Nat. Genet. 20:43–45.
SNIEGOWSKI, P. D., and B. CHARLESWORTH. 1994. Transposable element numbers in cosmopolitan inversions from a natural population of Drosophila melanogaster. Genetics 137:815–827.
SNIEGOWSKI, P. D., A. PRINGLE, and K. A. HUGHES. 1994. Effects of autosomal inversions on meiotic exchange in distal and proximal regions of the X chromosome in a natural population of Drosophila melanogaster. Genet. Res. 63:57–62.


SPRADLING, A. C. 1994. Transposable elements and the evolution of heterochromatin. Soc. Gen. Physiol. Ser. 49:69–83.
STEINEMANN, M., and S. STEINEMANN. 1998. Enigma of Y chromosome degeneration: neo-Y and neo-X chromosomes of Drosophila miranda a model for sex chromosome evolution. Genetica 102–103:409–420.
VIEIRA, C., and C. BIÉMONT. 1996. Selection against transposable elements in D. simulans and D. melanogaster. Genet. Res. 68:9–15.
WANG, W., K. THORNTON, A. J. BERRY, and M. LONG. 2002. Nucleotide variation along the Drosophila melanogaster fourth chromosome. Science 295:134–137.
WRIGHT, S. I., Q. H. LE, D. J. SCHOEN, and T. E. BUREAU. 2001. Population dynamics of an Ac-like transposable element in self- and cross-pollinating Arabidopsis. Genetics 158:1279–1288.
WRIGHT, S. I., and D. J. SCHOEN. 1999. Transposon dynamics and the breeding system. Genetica 107:139–148.
YOUNG, M. W. 1979. Middle repetitive DNA: a fluid component of the Drosophila genome. Proc. Natl. Acad. Sci. USA 76:6274–6278.

THOMAS EICKBUSH, reviewing editor

Accepted February 5, 2002
