INSECT TRANSMITTED PLANT PATHOGENIC MOLLICUTES, SPIROPLASMA KUNKELII AND ASTER YELLOWS WITCHES' BROOM PHYTOPLASMA: FROM STRUCTURAL GENOMICS TO FUNCTIONAL GENOMICS

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By
Xiaodong Bai, M.S.

The Ohio State University
2004

Dissertation Committee:
Dr. Saskia A. Hogenhout, Adviser
Dr. David L. Denlinger
Dr. David M. Francis
Dr. Parwinder S. Grewal

Approved by
Adviser
Department of Entomology

ABSTRACT

The mollicutes Spiroplasma kunkelii and aster yellows witches' broom (AY-WB) phytoplasma are insect-transmitted plant pathogens. These mollicutes invade and replicate in cells of various insect organs and tissues, and inhabit and replicate in plant phloem tissues. They cause severe symptoms in many plant species worldwide, including economically important crops and ornamental plants. Their fastidious nature and the lack of genetic tools have hampered research on these plant pathogenic mollicutes. I employed various approaches, including genome sequencing, comparative genomics, functional genomics, and conventional molecular techniques, to study the biology and pathogenicity mechanisms of S. kunkelii and AY-WB phytoplasma.

The partial genome of S. kunkelii and the complete genome of AY-WB phytoplasma were sequenced. Genome annotation revealed the presence of multiple spiroplasma phage DNA sequences in S. kunkelii and many repetitive elements in both genomes, suggestive of frequent recombination events. The genome sequence data provide a genetic basis for the study of the biology and pathogenicity mechanisms of these organisms.

Although spiroplasmas and phytoplasmas are only distantly related to each other, they share the plant and insect habitats. Therefore, they may share genes involved in insect transmission and plant pathogenicity that are missing from the animal and human pathogenic mycoplasmas. To test this hypothesis, a comparative genome analysis among mollicutes was conducted. It identified four genes that are present in the genomes of all plant-pathogenic mollicutes sequenced so far but missing from the mycoplasmas. Another gene within both genomes might have been derived by horizontal gene transfer between spiroplasmas and phytoplasmas.

The observation of spiroplasma surface appendages prompted a search for genes involved in fimbriae or pili formation. Four traE gene homologs, predicted to encode membrane-bound ATPases, were identified in S. kunkelii strain M2. Two homologs were located on the S. kunkelii chromosome and two on plasmids. The presence of these homologs varied among S. kunkelii strains from different geographical locations. Expression of the genes was detected in culture medium and during infection of insects and plants. Sequences adjacent to the traE homologs suggest the involvement of TraE in spiroplasma conjugation and subsequent recombination, and in adhesion.

The secreted proteins of AY-WB phytoplasma are likely to interact directly with host cell components. Hence, the AY-WB phytoplasma genome sequence was mined for potentially secreted proteins, which were further characterized by high-throughput functional assays such as virus-based expression in Nicotiana benthamiana (tobacco) and Lycopersicon esculentum (tomato). The in planta assay resulted in the identification of 17 candidate effector proteins. The detailed functional characterization focused on two phytoplasma proteins (A11 and A30) that have a nuclear localization signal (NLS) and may therefore be imported into plant nuclei in an importin α-dependent manner. Localization studies in plants with yellow fluorescent protein fusions of these two proteins revealed their localization in plant nuclei and confirmed their dependence on plant importin α for nuclear transport. Transcripts corresponding to the phytoplasma proteins were detected in AY-WB phytoplasma-infected insects and plants by RT-PCR. Microarrays demonstrated that phytoplasma A11 protein affected the expression profiles of 53 tomato genes, including several transcription factors, indicating that phytoplasma A11 protein directly or indirectly interacts with these proteins. These data support the hypothesis that A11 is a bona fide effector protein involved in plant pathogenicity.

In summary, the research described in this dissertation resulted in the identification of several mollicute genes that are potentially involved in insect transmission and plant pathogenicity. It demonstrated that genome sequencing, comparative genomics, and functional genomics approaches allow efficient identification and characterization of such genes in bacterial genomes. The importance of the research lies in the application of high-throughput bioinformatics, genomics and molecular approaches to the study of agriculturally important organisms for which little information and few molecular and diagnostic/detection tools are available. The described research and approaches might be useful for other pathogenic mollicutes that are recalcitrant to in vitro manipulation, detection and characterization, including the economically important mycoplasmas that impact human health and livestock industries.

Dedicated to my parents, my brother and those I love

ACKNOWLEDGMENTS

I wish to thank my adviser, Dr. Saskia A. Hogenhout, for her intellectual support and encouragement, which made this research and dissertation possible, and for her continuous support of my career development.

I thank my Student Advisory Committee members, Dr. David L. Denlinger, Dr. David M. Francis, and Dr. Parwinder S. Grewal, for their advice and support of my graduate study.

I thank Dr. Sophien Kamoun for his brilliant ideas, his continuous support of my research, and the stimulating discussions.

I am grateful to those who helped me with various experiments and techniques during my research, especially Mr. Ian Holford for computer programming; Dr. El-Desouky Ammar and Dr. Tea Meulia for electron and confocal microscopy; Dr. Michael M. Goodin for protein localization in plants; Dr. David M. Francis and Ms. Jorunn Bos for microarray data analysis; Mr. Valdir Ribeiro Correa, Ms. Diane M. Hartzler, Ms. Angela D. Strock, Ms. Miaoying Tian and Ms. Diane M. Kinney for help with the PVX assays; Dr. Thirumala Kanneganti for assistance with virus-induced gene silencing experiments; Mr. Edgar Huitema, Mr. Mark W. Jones, and Dr. Margaret Redinbaugh for isotope usage; and Ms. Kristen J. Willie, Ms. Janet McCormick, and Dr. Juliette Hanson for mouse antibody production.

This research is supported by The Ohio State University – Ohio Agricultural Research and Development Center (OARDC) Research Enhancement Competitive Grant Program and the Ohio Plant Biotechnology Consortium (OPBC). The AY-WB phytoplasma genome-sequencing project is supported by the United States Department of Agriculture / National Science Foundation (USDA/NSF) Microbial Genome Sequencing Program.

VITA

Oct. 15, 1974 ..... Born - Daqing, P. R. China
1992-1996 ..... B.S., Department of Biological Science and Technology, Zhejiang University, Hangzhou, P. R. China
1996-1999 ..... M.S., Institute of Zoology, Chinese Academy of Sciences, Beijing, P. R. China
1999-2000 ..... Researcher, Qingdao Yongsheng Guangyuan Corporation, Qingdao, P. R. China
2000-present ..... Graduate Research Associate, Department of Entomology, The Ohio State University, OH, USA

HONORS AND AWARDS

• Department Fellowship, Department of Entomology, The Ohio State University – Ohio Agricultural Research and Development Center (OARDC), OH, U.S.A. 2003-2004
• OARDC Director’s Fellowship, The Ohio State University, OH, U.S.A. 2000-2003
• Research Grant from OARDC Research Enhancement Competitive Grant Program, The Ohio State University, OH, U.S.A. 2003-2004
• American Phytopathological Society Foundation, The Raymond G. Grogan Travel Award, Milwaukee, WI, U.S.A. 2002
• Chinese Academy of Sciences Di’Ao scholarship, Beijing, P. R. China. 1999

PUBLICATIONS

Research publications

1. Xiaodong Bai, Tatiana Fazzolari, and Saskia A. Hogenhout. 2004. Identification and Characterization of Spiroplasma kunkelii traE genes. Gene 336(1), 81-91.
2. Xiaodong Bai, Jianhua Zhang, Ian R. Holford, and Saskia A. Hogenhout. 2004. Comparative genomics identifies genes shared by distantly related insect-transmitted plant pathogenic mollicutes. FEMS Microbiology Letters 235, 249-258.
3. Wencai Yang, Xiaodong Bai, Eileen Kabelka, Christina Eaton, Sophien Kamoun, Esther van der Knaap, and David Francis. 2004. Discovery of single nucleotide polymorphisms in Lycopersicon esculentum and mapping of fruit color QTL in elite populations. Molecular Breeding 14, 21-34.
4. El-Desouky Ammar, Dave Fulton, Xiaodong Bai, Tea Meulia and Saskia A. Hogenhout. 2003. An attachment tip and fimbriae-like structures in plant- and insect-pathogenic spiroplasmas of the class Mollicutes. Archives of Microbiology 181(2), 97-105.
5. Xiaodong Bai and Saskia A. Hogenhout. 2002. A genome sequence survey of the mollicute corn stunt spiroplasma Spiroplasma kunkelii. FEMS Microbiology Letters 210(1), 7-17.
6. Qiang Liu, Yan Ye, Xiaodong Bai, and Cui Ding. 2001. Genetic localization of the synergistic factor of Pseudaletia separata granulosis virus. Acta Entomologica Sinica 44(2), 148-154.
7. Xiaodong Bai and Cui Ding. 2000. Primary study of synergistic mechanism of Agrotis segetum nuclear polyhedrosis virus. Chinese Journal of Applied and Environmental Biology 6(1), 52-55.
8. Fumian Cui, Jiaji Shi, and Xiaodong Bai. 1998. Enzymatic synthesis of Cephradine. Wei Sheng Wu Xue Bao 38(4), 300-303.

FIELDS OF STUDY

Major Field: Entomology
Specialties: Microbial Genomics, Plant-Microbe Interactions

TABLE OF CONTENTS

Page

Abstract ..... ii
Dedication ..... v
Acknowledgments ..... vi
Vita ..... viii
List of Tables ..... xiii
List of Figures ..... xv
List of Abbreviations ..... xvii

Chapters:

1. Insect transmitted plant pathogenic mollicutes: A literature review ..... 1
   1.1 Introduction ..... 1
   1.2 Evolution and phylogeny ..... 3
   1.3 Plant symptomology ..... 5
   1.4 Insect transmission ..... 7
   1.5 Pathogenicity mechanisms ..... 11
       1.5.1 Hormonal ..... 11
       1.5.2 Molecular ..... 13
   1.6 Structure ..... 15
   1.7 Movement ..... 18
   1.8 Structural genomics ..... 20
   1.9 Comparative genomics ..... 23
   1.10 Functional genomics ..... 26
   1.11 Research objectives ..... 30
   1.12 Reference ..... 31

2. A genome sequence survey of the mollicute corn stunt spiroplasma Spiroplasma kunkelii ..... 47
   2.1 Abstract ..... 48
   2.2 Introduction ..... 49
   2.3 Materials and Methods ..... 50
   2.4 Results and Discussion ..... 53
   2.5 Acknowledgments ..... 63
   2.6 References ..... 63

3. Complete genome sequences of aster yellows witches' broom (AY-WB) phytoplasma and comparison with onion yellows phytoplasma ..... 76
   3.1 Abstract ..... 77
   3.2 Introduction ..... 78
   3.3 Materials and Methods ..... 80
   3.4 Results ..... 83
   3.5 Discussion ..... 95
   3.6 Acknowledgments ..... 98
   3.7 References ..... 99

4. Identification and characterization of traE genes of Spiroplasma kunkelii ..... 118
   4.1 Abstract ..... 119
   4.2 Introduction ..... 120
   4.3 Materials and Methods ..... 121
   4.4 Results ..... 126
   4.5 Discussion ..... 134
   4.6 Acknowledgments ..... 137
   4.7 References ..... 138

5. Comparative genomics identifies genes shared by distantly related insect-transmitted plant pathogenic mollicutes ..... 150
   5.1 Abstract ..... 151
   5.2 Introduction ..... 152
   5.3 Materials and Methods ..... 154
   5.4 Results ..... 156
   5.5 Discussion ..... 161
   5.6 Acknowledgments ..... 164
   5.7 References ..... 165

6. Functional genomics identifies phytoplasmas effector proteins ..... 175
   6.1 Abstract ..... 176
   6.2 Introduction ..... 177
   6.3 Materials and Methods ..... 179
   6.4 Results ..... 184
   6.5 Discussion ..... 190
   6.6 Acknowledgments ..... 193
   6.7 References ..... 193

Bibliography ..... 206

LIST OF TABLES

Table ..... Page

1.1 Summary of completed mollicute genomes ..... 45
2.1 Sequence tags with significant similarity (E-value ≤ 10^-5) to spiroplasma virus SpV1 and S. citri putative virulence proteins ..... 69
2.2 Sequence tags with significant similarities (E-value ≤ 10^-5) to NCBI nr protein sequences ..... 70
2.3 S. kunkelii sequence tags with similarity to rRNA genes ..... 75
3.1 General features of the chromosomes of the aster yellows witches’ broom (AY-WB) phytoplasma and Onion yellows (OY) phytoplasma genomes ..... 105
3.2 Comparison of the COG categories of AY-WB phytoplasma with those of other mollicutes ..... 106
3.3 Summary of ABC transporter genes in AY-WB and OY phytoplasma genomes ..... 107
3.4 Summary of AY-WB phytoplasmas P-type ATPase ..... 108
3.5 Summary of redundant (either complete or incomplete) genes in AY-WB phytoplasma genome ..... 109
3.6 Summary of the secreted proteins identified in AY-WB phytoplasma genome ..... 111
4.1 Basic features of traE genes in S. kunkelii M2 strain ..... 142
5.1 Four AY-WB and S. kunkelii homologues that were absent from mycoprotdb consisting of the whole genome sequences of M. genitalium, M. pneumoniae, U. urealyticum, M. pulmonis, M. penetrans, and M. gallisepticum ..... 169
5.2 Identities of AY-WB and S. kunkelii proteins that are more similar to each other than to proteins in mycoprotdb ..... 170
6.1 Summary of PVX assay of AY-WB phytoplasma candidate effector proteins ..... 197
6.2 Genes that were up-regulated in PVX:A11 treated tomato plants comparing to PVX only treated tomato plants ..... 199
6.3 Genes that were down-regulated in PVX:A11 treated tomato plants comparing to PVX only treated tomato plants ..... 200

LIST OF FIGURES

Figure ..... Page

1.1 Phylogeny of mollicutes based on 16S rDNA sequences ..... 46
3.1 Maps of the AY-WB phytoplasma plasmids ..... 113
3.2 The summary of AY-WB phytoplasma genome encoded transporters and central metabolic pathways ..... 114
3.3 AY-WB phytoplasma genome contains more repetitive sequences, both tandem repeats and inverted repeats, than OY phytoplasma genome ..... 115
3.4 The AY-WB phytoplasma genome contained many copies of transposases genes and derivatives ..... 116
3.5 The AY-WB phytoplasma genome contained one copy of complete ATP-dependent DNA helicase and multiple copies of pseudogenes ..... 117
4.1 ClustalW alignment of the deduced protein sequences of traE1, traE2, traE3 and traE4 of S. kunkelii strain M2 ..... 144
4.2 Phylogenetic analyses of the TraE protein sequences from S. kunkelii M2 strain and other organisms ..... 145
4.3 Detection of traE sequences by Southern blot hybridization of digested genomic DNA of S. kunkelii strains M2, CS-2B, FL-80 and PU8-17 ..... 146
4.4 Detection of traE sequences on chromosomal and plasmid DNA of S. kunkelii strains M2, CS-2B, and PU8-17 ..... 147
4.5 Detection of S. kunkelii spiralin gene and traE transcripts on Northern blots of size-separated total RNA samples ..... 148
4.6 Genetic contexts of traE genes in the genome of S. kunkelii CR23x ..... 149
5.1 Algorithms employed to extract proteins that are common between the insect-transmitted plant pathogens Aster Yellows Witches’ Broom (AY-WB) and Spiroplasma kunkelii but are absent from five Mycoplasma spp. and Ureaplasma urealyticum ..... 171
5.2 Graphical representation of comparative analysis results ..... 172
5.3 Phylogenetic analyses of proteins that are present in insect-transmitted plant pathogenic AY-WB and S. kunkelii but absent from animal and human pathogenic mycoplasmas ..... 173
5.4 Phylogenetic analysis for AtA (AAA type ATPase) ..... 174
6.1 Mining of AY-WB genome sequences for putative phytoplasma effector proteins ..... 201
6.2 Representative plant symptoms after toothpick inoculation of Nicotiana benthamiana leaves with transformed Agrobacterium tumefaciens GV3101 strain ..... 202
6.3 Laser-scanning confocal microscopy images demonstrating the subcellular localization of YFP fusions of AY-WB proteins with NLS upon agroinfiltration into N. benthamiana leaves ..... 203
6.4 The transportation of YFP:A11 was dependent on N. benthamiana (Nb) importin α ..... 204
6.5 AY-WB candidate effector proteins A11- and A30-encoding genes were expressed during AY-WB infection to aster plants ..... 205
6.6 AY-WB phytoplasma candidate effector proteins A11 and A30 were expressed during the infection to insects ..... 205

ABBREVIATIONS

[α-32P]-dCTP, alpha-phosphorus-32-labeled deoxycytidine triphosphate
ABC, ATP binding cassette
AP, alkaline phosphatase
ATP, adenosine triphosphate
ATPase, adenosine triphosphatase
AYP, aster yellows phytoplasmas
AY-WB, aster yellows witches' broom
BLAST, Basic Local Alignment Search Tool
bp, base pairs
CBPP, contagious bovine pleuropneumonia
CBF, cmp binding factor
CCPP, contagious caprine pleuropneumonia
DNA, deoxyribonucleic acid
EB, ethidium bromide
EBI, European Bioinformatics Institute
EMBOSS, European Molecular Biology Open Software Suite
EST, expressed sequence tag
GAMBIT, genomic analysis and mapping by in vitro transposition
GC, guanine and cytosine
GFP, green fluorescent protein
GGPP, geranylgeranyl pyrophosphate
HMM, hidden Markov model
IAA, indole-3-acetic acid
IgG, immunoglobulin G
IPP, isopentenyl pyrophosphate
IVET, in vivo expression technology
kb, kilobases
MBSP, maize bushy stunt phytoplasma
MLO, mycoplasma-like organism
MRFV, Maize Rayado Fino Virus
MVA, mevalonic acid
NCBI, National Center for Biotechnology Information
NLS, nuclear localization signal
NN, neural network
ORF, open reading frame
OY, onion yellows
PAGE, polyacrylamide gel electrophoresis
PAUP, phylogenetic analysis using parsimony
PBS, phosphate-buffered saline
PCR, polymerase chain reaction
PDB, The Protein Data Bank
PFGE, pulsed field gel electrophoresis
PNPase, polynucleotide phosphorylase
RADAR, Rapid Automatic Detection and Alignment of Repeats
RNA, ribonucleic acid
RT-PCR, reverse transcription polymerase chain reaction
SDS, sodium dodecyl sulfate
STM, signature-tagged mutagenesis
TRV, tobacco rattle virus
UDP, uridine diphosphate
VIGS, virus-induced gene silencing
YFP, yellow fluorescent protein

CHAPTER 1

Insect Transmitted Plant Pathogenic Mollicutes: A Literature Review

1.1 Introduction

The discovery of plant pathogenic mollicutes was associated with the study of plant diseases. Phytoplasmas were first detected in plants showing dwarfing and witches' broom symptoms, including mulberry dwarf, potato witches' broom, aster yellows, and paulownia witches' broom (Doi et al., 1967). Phytoplasmas were initially known as 'mycoplasma-like organisms' (MLOs) because they were similar in morphology and ultrastructure to mycoplasmas (Doi et al., 1967). The first spiroplasma, Spiroplasma citri, was described in 1973 (Saglio et al., 1973). Since then, many more spiroplasma and phytoplasma species have been identified, and the list of plants that they infect has grown continuously. While all phytoplasma species are plant pathogens, only three spiroplasma species have so far been identified as plant pathogens. These are the citrus stubborn agent S. citri (Saglio et al., 1973), the corn stunt agent S. kunkelii (Whitcomb et al., 1986), and the periwinkle pathogen S. phoeniceum (Saillard et al., 1987). All other spiroplasma species are commensals, symbionts or pathogens of arthropods (Gasparich, 2002). From this point on, plant pathogenic spiroplasmas and phytoplasmas are referred to as plant pathogenic mollicutes.

Plant pathogenic mollicutes induce severe diseases in many plant species, resulting in significant economic losses. Much is known about the transmission biology and ecology of plant pathogenic mollicutes. However, more has to be learned about their biology, physiology and pathogenicity mechanisms, especially those of phytoplasmas, which cannot be cultured. Fortunately, genome sequencing of plant pathogenic mollicutes (Bai and Hogenhout, 2002; Bai et al., 2004b; Liefting and Kirkpatrick, 2003; Zhao et al., 2003, 2004a, 2004b; Oshima et al., 2004; Bai et al., in preparation) and the development of genetic tools for spiroplasmas (Foissac et al., 1997b; Gaurivaud et al., 2000a, 2000b, 2001; Lartigue et al., 2002) have greatly facilitated the study of these pathogens.

The purpose of this literature review is to present a comprehensive overview of the biology of plant pathogenic mollicutes. It includes the following sections: 1) Evolution and phylogeny; 2) Plant symptomology; 3) Insect transmission; 4) Pathogenicity mechanisms; 5) Structure; 6) Movement; 7) Structural genomics; 8) Comparative genomics; and 9) Functional genomics. It ends with the research objectives and a summary of the organization of the dissertation.

1.2 Evolution and phylogeny

Spiroplasmas and phytoplasmas belong to the Class Mollicutes that is a group of unique bacterial organisms, characterized by small genome sizes (580 – 2,200 kb), low GC content (23% – 40%), and the lack of cell wall (Razin et al., 1998). Based on 16S rDNA phylogeny, mollicutes were derived from a Gram-positive bacterial ancestor in the Clostridium linkage (Woese, 1987; Weisburg et al., 1989). The mollicutes evolved from the Gram-positive ancestor by degenerative or reductive evolution (Razin et al., 1998; Oshima et al., 2004). Mollicutes are divided into two major branches (Fig. 1.1): The AAA branch with the Asteroleplasma, Anaeroplasma, and Acholeplasma species the phytoplasmas that appear most closely related to the acholeplasmas; and the SEM branch with the Spiroplasma, Entomoplasma, Mesoplasma, Mycoplasma, and Ureaplasma species (Maniloff, 1996; Razin et al., 1998). Unlike members of the AAA branch, all members of the SEM branch treat UGA as a tryptophan rather than a stop codon (Maniloff, 1996; Razin et al., 1998). The use of UGA as a tryptophan codon has complicated in vitro protein expression of spiroplasmas and mycoplasmas. Spiroplasma and phytoplasmas are distantly related within the Class Mollicutes. Spiroplasmas are closer to mycoplasmas that are notorious animal and human pathogens. Spiroplasmas were considered early mollicutes because they generally have larger genomes than the mycoplasmas and phytoplasmas. Phytoplasmas were placed as a distinct monophyletic clade within the Class Mollicutes (Lim and Sears, 1992). However, because they cannot be cultured, they have not been assigned formal species names.

3

Trivia names, usually associated with the diseases they cause, have been used to describe the phytoplasma species. Until recently, the IRPCM (International Research Programme on Comparative Mycoplasmology) Phytoplasma/Spiroplasma Working Team – Phytoplasma Taxonomy Group proposed to accommodate phytoplasmas within the novel genus 'Candidatus (Ca.) Phytoplasmas' (IRPCM Phytoplasma/Spiroplasma Working Team – Phytoplasma Taxonomy Group, 2004). The team also assigned 'Ca. Phytoplasma' names to different phytoplasma groups. For instance, the aster yellows phytoplasma group is now 'Ca. phytoplasma asteris' (Lee et al., 2004). The phytoplasma phylogeny based on 16S rRNA sequences is widely used and mostly accepted as the standard (Razin et al., 1998). Additional sequence information, including the conserved ribosomal protein genes (Gundersen et al., 1996), the elongation factor EF-Tu (tuf) gene (Kamla et al., 1996; Schneider et al., 1997), the heat shock protein gene hsp70 (Falah and Gupta, 1997), and the 16S/23S rRNA intergenic sequences (Smart et al., 1996), have confirmed the phytoplasma phylogeny based on 16S rDNA sequences. It is worth mentioning the unique taxonomic status of the Mycoplasma mycoides cluster within the Class Mollicutes. The M. mycoides cluster consists of six closelyrelated species, two of which are of particular importance, M. mycoides subsp. mycoides SC causing contagious bovine pleuropneumonia (CBPP) and M. capricolum subsp. capripneumoniae causing contagious caprine pleuropneumonia (CCPP) (Thiaucourt et al., 2000). Based on the 16S rRNA phylogeny, the mycoplasma species in the M. mycoides cluster were closer to spiroplasmas than to the mycoplasma species in hominis


and pneumoniae groups (Weisburg et al., 1989). Recently, the M. mycoides cluster was shown to have arisen from spiroplasmas through an intermediate group of non-helical spiroplasmal descendants, the entomoplasmas (Gasparich et al., 2004). This suggests that the genus name Mycoplasma is no longer suitable for the species within the M. mycoides cluster. However, because of the practical importance of the species within this group, reclassification would have immense practical implications in diagnostic human and veterinary medicine (Razin et al., 1998; Gasparich et al., 2004). Therefore, the current classification has been retained. The phylogenetic status of the M. mycoides cluster has to be taken into account in the comparative genomic sequence analysis of the complete genome of M. mycoides subsp. mycoides SC (small colony) (Westberg et al., 2004) with the gapped genomes of S. kunkelii (http://www.genome.ou.edu/spiro.html) and aster yellows witches' broom (AY-WB) phytoplasma described in this dissertation (Chapter 5).

1.3 Plant symptomology

Plant pathogenic mollicutes have distinct plant host ranges. Spiroplasmas have a relatively narrow or restricted plant host range. They naturally infect several plant species including citrus, corn and periwinkle. On the other hand, phytoplasmas have a very wide plant host range, including more than 700 plant species worldwide. The plants susceptible to phytoplasmas include economically important vegetable crops, such as lettuce, carrot, and celery, and ornamental plants, such as China aster and purple coneflower. Interestingly, S. kunkelii and maize bushy stunt phytoplasma (MBSP) share the same


insect and plant host ranges. In fact, these two organisms, along with Maize Rayado Fino Virus (MRFV), were often found together in diseased maize plants and hence were called the corn stunt complex (Henriquez et al., 1996), which is transmitted by leafhopper species of the genus Dalbulus (Henriquez et al., 1996; Hruska and Gomez-Peralta, 1997). Plant pathogenic mollicutes are mainly restricted to the phloem tissues of plants. The rich content of glucose and fructose within the sieve tubes of the phloem provides the energy supply for plant pathogenic mollicutes (André et al., 2003). However, the phloem is also a challenging environment for these organisms, as the high concentrations of sugars in phloem sap might cause osmotic stress to the bacteria (Purcell and Nault, 1991). Studies of S. citri indicated that the pathogens multiply in plant hosts and translocate toward meristems, storage organs, fruits, and other parts of plants involved in photosynthesis (Gussie et al., 1995). Plant pathogenic mollicutes replicate within the phloem tissues to a high titer. Plant pathogenic mollicutes induce severe symptoms in their plant hosts. Plant pathogenic spiroplasmas induce chlorosis, stunting with shortened internodes, proliferation of ears that do not mature, and reddening. In contrast, phytoplasmas cause more varied and severe symptoms. For instance, members of the aster yellows phytoplasma (AYP) group, the largest group in the genus Candidatus Phytoplasma, induce stunting and twisting, reddened or yellowish foliage, and sterility (Kirkpatrick, 1989). Floral parts that are normally brightly colored may remain green, and petals and sepals may become puckered and distorted. Secondary flower heads may emerge from the primary flower head, become leafy, and change in color. Losses of


certain ornamental plantings may range from 10 to 70%. Members of the AYP group cause dramatic yield losses of lettuce, ranging from 60 to 80% in leaf lettuce growing areas of Ohio in a given year (Hoy et al., 1992). Symptoms caused by AYP in lettuce include premature bolting, elongation of internodes and petioles, vein clearing, development of axillary shoots (witches' broom), phyllody (transformation of flowers into leaves), virescence (greening of normally white tissue), chlorosis, necrosis, and rosetting (Murral, 1994; Lee et al., 2000). Symptoms may vary depending on the strain, the time of infection, the plant species, the temperature, and the age and size of the plant host. Plants cannot be cured once infected with plant pathogenic mollicutes.

1.4 Insect transmission

Plant pathogenic mollicutes are all transmitted by insect vectors, mainly phloem-feeding leafhoppers of the family Cicadellidae, planthoppers of the family Cixiidae, and psyllids of the superfamily Psylloidea (Harris, 1979; Tsai, 1979). Most spiroplasma species are associated with insects but are not plant pathogens, leading to the hypothesis that plant pathogenic mollicutes evolved from insect-inhabiting organisms (Seemüller et al., 2002). Feeding of the infected insects on plants resulted in the transmission of the mollicutes to plants, and the mollicutes might have become plant pathogens later in evolution (Hackett and Clark, 1989). This hypothesis seems to be supported by the fact that some insect pathogenic spiroplasmas, such as the honeybee pathogens S. apis and S. melliferum, have a transmission route via plant nectar (Hackett and Clark, 1989).


Plant pathogenic mollicutes are transmitted by insect vectors in a persistent propagative manner (Nault, 1997; Purcell, 1982). The insects acquire the plant pathogenic mollicutes from the phloem tissue when feeding on diseased plants. The mollicutes penetrate the wall of the midgut portion of the insect intestinal tract and thereafter move into and multiply in the insect hemolymph and various insect organs. They can infect the Malpighian tubules, muscle cells, and nerve cells. The mollicutes also invade the salivary gland cells and are introduced, with the saliva, into the phloem tissues of healthy plants during feeding (Seemüller et al., 2002). The infection cycle of plant pathogenic mollicutes involves insects, plants, and bacteria, which provides an interesting model system to study the interaction and co-evolution of these three organisms. The interaction of the mollicutes with the insects seems highly specific (Seemüller et al., 2002). For transmission to occur, the plant pathogenic mollicutes have to penetrate the gut and salivary gland barriers in insects. Failure to penetrate either of the barriers results in failure of the entire transmission process (Fletcher et al., 1998; Foissac et al., 1997b; Yu et al., 2000). Although the molecular mechanism of insect transmission is not yet understood, there is evidence of the involvement of specific spiroplasma attachment structures. S. kunkelii apparently forms fimbriae- and pili-like structures that may be involved in attachment and virulence in insects (Özbek et al., 2003; Ammar et al., 2004). Further, S. kunkelii has tip structures that may be important for penetration of epithelial cells in the midgut of the leafhopper Dalbulus elimatus, as the tip structures are associated with cup-shaped invaginations of midgut cells (Ammar et al., 2004).


Molecular and biochemical studies of S. citri resulted in the identification of a protein, P89, that is associated with attachment of S. citri to insect cells (Fletcher et al., 1998; Yu et al., 2000). It was later renamed SARP1 and shown to have a novel domain designated sarpin (Berg et al., 2001). A homolog of SARP1 in S. kunkelii was also identified by sequence similarity searches (Bai et al., 2004a). Recently, a solute-binding protein of an ABC transporter from S. citri was demonstrated to be involved in insect transmission by Tn4001 transposon mutagenesis and functional complementation (Boutareaud et al., 2004). This putative lipoprotein, Sc76, belongs to the ABC transport family S1_b in the ABCdb database. Deletion of the gene resulted in a 30-fold reduction of the transmission efficiency by the leafhopper vector (Boutareaud et al., 2004). It remains unclear which stage of insect transmission the protein is involved in. Most plant pathogenic mollicutes are not transmissible transovarially (Razin et al., 1998). However, recent PCR and electron microscopy studies revealed the presence of AY-type phytoplasmas in the genital organs or eggs of the leafhoppers Scaphoideus titanus and Hishimonoides sellatiformis (Alma et al., 1997; Kawakita et al., 2000). There has not been any indication of transovarial transmission of plant pathogenic spiroplasmas. However, several other Spiroplasma spp., such as S. poulsonii, are transmitted to the next generation of Drosophila and pea aphid, Acyrthosiphon pisum (Bové, 1997; Fukatsu et al., 2001; Anbutsu and Fukatsu, 2003). Plant pathogenic mollicutes can affect their insect vectors in various ways. X-disease phytoplasmas can reduce the lifespan of the infected leafhopper Colladonus


montanus by half (Jensen, 1959). MBSP-infected leafhopper species produced fewer offspring than healthy ones (Nault et al., 1984). Several Dalbulus species that were poor vectors of the corn stunt agent S. kunkelii were affected adversely by S. kunkelii infection (Madden and Nault, 1983). On the other hand, S. kunkelii is not pathogenic to its primary vector, Dalbulus maidis. In fact, infection with S. kunkelii improved the overwintering ability of D. maidis (Ebbert and Nault, 1994). Another beneficial effect was observed in the interaction of the aster leafhopper, Macrosteles quadrilineatus, and AY phytoplasmas. The leafhoppers exposed to AY phytoplasma-infected plants lived longer and laid more eggs than non-exposed leafhoppers (Beanland et al., 2000). Also, AY phytoplasma infection of D. maidis increased its survival rate on a non-host plant, aster (Purcell, 1988). These mutually beneficial effects can be explained by the prolonged association and co-evolution of the pathogen and the insect vector (Beanland et al., 2000), and such a relationship aids the dispersal and survival of both the plant pathogenic mollicutes and the insect vectors. The nature of the insect vectors directly affects the plant host range of plant pathogenic mollicutes (Seemüller et al., 2002). C. tenellus and D. maidis are effective vectors that can transmit both spiroplasmas and phytoplasmas (Chiykowski and Sinha, 1990). The AY phytoplasmas can be transmitted by more than 30 polyphagous leafhopper vectors to more than 300 plant species belonging to 45 families (Lee et al., 2000). Thus, the AY phytoplasmas have low vector specificity.


1.5 Pathogenicity mechanisms

The lack of genetic tools and the fastidious nature of plant pathogenic mollicutes have hampered research progress. S. citri, the type species of the genus Spiroplasma, has a shorter generation time than S. kunkelii and became the primary focus of research on spiroplasma pathogenicity. Research on the molecular pathogenicity of the uncultivable phytoplasmas became possible only with the recent accumulation of genome sequence data. The attachment, penetration, and multiplication during insect transmission and the possible involvement of mollicute structures in pathogenicity are discussed in the "Insect transmission" and "Structure" sections, respectively. This section focuses on the mechanisms by which plant pathogenic mollicutes cause disease in plant hosts.

1.5.1 Hormonal

In previous studies, certain factors were reported to be associated with the development of diseases caused by plant pathogenic mollicutes (Daniels, 1983; Gabridge et al., 1985). Factors associated with phytoplasma infection include the following: impairment of phloem function, alteration of turnover rates of hormones, low level of indole-3-acetic acid (IAA) oxidase activity, increase or depletion of hormone precursors, presence of inhibitors of hormone synthesis, transport or translocation, and selective uptake and transport of hormones (Daniels, 1979). For S. citri, the factors are: toxins with molecular weights up to 400 (Daniels, 1979), accumulation of lactic acid (Saglio et al., 1973), and proteolysis and arginine aminopeptidase enzymatic activity (Chang, 1998).


Furthermore, imbalances in hormone levels have been proposed to contribute to plant symptoms (Chang and Lee, 1995). Some attempts have been made to evaluate the contribution of mollicute interference with plant hormone levels to mollicute pathogenicity (Chang, 1998). The findings shed light on the pathogenicity of plant pathogenic mollicutes. Phytoplasma infection may alter endogenous auxin levels in plants. The biosynthesis of auxin is centered in young leaves and shoot tips, and the auxin produced is transported in the phloem. Phytoplasma infection in the phloem may block the transport of auxin and reduce the concentration of endogenous auxin (Chang, 1998). Chlorosis in leaves of spiroplasma- and phytoplasma-infected plants is due to pigment alterations. In the infected plants, chlorophyll a, chlorophyll b, and total chlorophyll contents are significantly lower than those in healthy plants. The yellowing of leaves begins with the older leaves, suggesting that the reduction in chlorophyll is due to the destruction of these pigments in mature chloroplasts. Also in leaf tissues, infection by plant pathogenic mollicutes causes dramatic decreases of carotenoid and anthocyanin levels several weeks after infection (Chang, 1998). Mevalonic acid (MVA), isopentenyl pyrophosphate (IPP), geranyl pyrophosphate, and farnesyl pyrophosphate are key intermediates in the biosynthesis of sterols. Spiroplasmas in plants depend on plant sterols for growth (Chang, 1989). Thus, spiroplasma and phytoplasma infection results in greater consumption of these intermediates than in healthy plants. IPP is also a precursor for the biosynthesis of cytokinin. As a result, the competition for IPP would upset the balance of cytokinin in infected plants. MVA is a precursor for


both IAA and IPP. The constant demand for IPP for sterols would also change the balance of IAA (Chang, 1998). MVA, IPP, and geranylgeranyl pyrophosphate (GGPP) are intermediates in the biosynthesis of carotenoids, gibberellins, and chlorophyll. Since IPP is used for the synthesis of sterols for the growth of the plant, spiroplasmas, and phytoplasmas, there could be a shortage of IPP for cytokinin, gibberellin, chlorophyll, and carotenoid biosynthesis in infected plants (Chang, 1998).

1.5.2 Molecular

The development of genetic tools, such as Tn4001 mutagenesis (Foissac et al., 1997b) and pBOT1 plasmid transformation of S. citri (Renaudin et al., 1995), has greatly facilitated the discovery of pathogenicity-related genes (Renaudin, 2002) and has resulted in the discovery of the involvement of the fructose operon in spiroplasma pathogenicity. Random Tn4001 insertion in the S. citri genome produced a mutant, GMT553, showing a delay in symptom appearance (Foissac et al., 1997b). Subsequent analysis localized the Tn4001 insertion at the 5' end of the first gene of the fructose operon, fruR (Gaurivaud et al., 2000b), which encodes an activator protein of the fructose operon (Gaurivaud et al., 2001). Further characterization showed that the first three genes of the fructose operon, fruR, fruA, and fruK, were all disrupted in the mutant GMT553 and that the mutant was unable to utilize fructose as a carbon or energy source (Gaurivaud et al., 2000b). The pathogenicity and the fructose utilization were restored by pBOT-derived plasmids carrying certain combinations of the three genes (Gaurivaud et al., 2000a).


However, the original mutant GMT553 could naturally revert to the wild-type phenotype after several generations (Foissac et al., 1997b). Later, new and more stable mutants were obtained, showing phenotypes similar to GMT553 (Gaurivaud et al., 2000a). A hypothesis was proposed to explain the involvement of fructose utilization in pathogenicity (Bové et al., 2003). Fructose is not abundant in plant phloem tissue, but it is needed, along with UDP-glucose, by the companion cells for loading sucrose, the major soluble carbohydrate in plants. Spiroplasmas utilize fructose, thus competing with the companion cells for it. The depletion of fructose results in less active companion cells, which in turn causes modified distribution of photosynthetic products, accumulation of carbohydrates in "source" leaves, and depletion of carbohydrates in "sink" tissues (Bové et al., 2003). Consequently, the low level of carbohydrates in "sink" tissues leads to growth impairment, whereas the high level of carbohydrates in "source" tissues leads to chlorosis (Geigenberger et al., 1996). This hypothesis was supported by some observations (Braun and Sinclair, 1978; Catlin et al., 1975; Lepka et al., 1999), pointing to a novel mechanism in which the sugar metabolism of pathogens interferes with plant physiology (Bové et al., 2003). However, other mechanisms could also be involved, since fructose operon-disrupted S. citri was still pathogenic (Bové et al., 2003). In light of this pioneering research on spiroplasmas, the aim of elucidating phytoplasma pathogenicity mechanisms and the availability of phytoplasma genome sequences prompted the functional genomics research described in the "Functional genomics" section of this introduction and, in more detail, in Chapter 6, "Functional genomics identify phytoplasma effector proteins".


1.6 Structure

Plant pathogenic mollicutes are unique bacteria that do not have a cell wall, which makes them naturally resistant to antibiotics that inhibit bacterial cell wall synthesis. Spiroplasmas are also not sensitive to another antibiotic, rifampicin (Chastel and Humphery-Smith, 1991). Spiroplasmas have a unique helical cell shape, as implied by the prefix 'spiro-'. Before spiroplasmas could be cultured in artificial media, they were thought to be spirochetes because of their morphological resemblance. In culture media, spiroplasmas assume a helical morphology, while in insect hosts, spiroplasma morphology may vary from oval or spherical to helical forms (Özbek et al., 2003). The mreB gene, which has a demonstrated role in rod shape determination of Escherichia coli (Doi et al., 1988), was also identified in S. citri (Bové et al., 2003) and S. kunkelii (Bai et al., 2004a). The mreB genes are also present in filamentous and helical bacteria, but not in round, spherical bacteria or the pleiomorphic mollicutes (Bové et al., 2003). The encoded MreB proteins are components of prokaryotic forms of actin filaments and form filamentous helical cytoskeleton-like structures lying close to the cell surface, involved in cell-shape determination (Jones et al., 2001; van den Ent et al., 2001). In most, if not all, spiroplasmas, spiralin is the most abundant membrane protein and the major surface antigen (Bové et al., 2003). The nucleotide sequences of the spiralin genes were determined in several spiroplasma species (Chevalier et al., 1990; Foissac et al., 1997a). The deduced spiralin proteins have a general amphiphilic


character and possess a conserved lipoprotein signal peptide (Foissac et al., 1997a; Bové et al., 2003), suggesting the extracellular localization of these proteins. A 'carpet model' has been proposed to explain the spiralin organization at the spiroplasma cell surface. In this model, spiralin exhibits two colinear domains and anchors into the outer side of the lipid bilayer with the N-terminal lipid moiety (Castano et al., 2002). However, the extracellular localization of spiralin has not been demonstrated yet. Because of their surface localization, spiralin proteins might play a role in the interaction with plant hosts and/or insect vectors. Recently, a translational fusion of spiralin and GFP (green fluorescent protein) was successfully introduced into S. citri using an oriC-based targeting vector, pC55 (Lartigue et al., 2002; Duret et al., 2003). The plasmid, carrying the fusion construct, integrated into the S. citri chromosome by a single-crossover recombination at the spiralin gene. Consequently, a spiralin-GFP fusion protein would be produced and fluoresce. One mutant with disrupted spiralin gene expression could still multiply to a high titer in plants and produce the typical symptoms. However, the transmission efficiency of this mutant was 100 times lower than that of the wild type (Duret et al., 2003). Thus, the involvement of spiralin in pathogenicity was excluded. Spiralin is needed for insect transmission of spiroplasmas; however, the precise mechanism remains to be determined (Duret et al., 2003). During the insect transmission process, fimbriae- and pili-like structures (Özbek et al., 2003) and a "tip structure" (Ammar et al., 2004) were observed within insect cells and gut lumen, respectively. In order to identify the proteins responsible for the formation of the structures, a study was performed focusing on homologs of transfer proteins (Bai et


al., 2004a). Four traE homologs potentially involved in bacterial conjugation were identified from the publicly available S. kunkelii CR2-3x gapped genome. The same homologs from the S. kunkelii M2 strain were cloned and sequenced, showing a 100% match to the sequences from the CR2-3x strain. In silico studies revealed multiple features of these traE homologs, including the presence of transmembrane domains, ATPase domains, etc. The presence of these sequences in strains from different geographical locations varied according to the geographical isolation. One of the four homologs was expressed during infection of insects and plants, and two homologs appeared to have shorter transcripts than the predicted open reading frames (ORFs). It was suspected that the shorter transcripts might have regulatory functions. Using pulsed field gel electrophoresis (PFGE) and Southern blot hybridization, two of the homologs were localized on the chromosome and the other two on plasmids. Interestingly, all of the ORFs of the four traE homologs localized in a region of important genes in the genome of the CR2-3x strain. The traE2 ORF (2.5 kb) had a transcript of approximately 10 kb, suggesting that it is part of an operon that includes mreB, the cell shape determination gene (Jones et al., 2001; van den Ent et al., 2001). The traE3 and traE4 genes are adjacent to the p89 gene, which is involved in attachment (Fletcher et al., 1998; Yu et al., 2000; Berg et al., 2001). This work provides the basic knowledge for further research to determine whether the traE genes are involved in adhesion and/or conjugation (Bai et al., 2004a).


1.7 Movement

Spiroplasmas have a unique style of movement. Spiroplasmas do not have flagella, but have internal cytoskeletons and are motile (Trachtenberg, 2004). The spiroplasma internal cytoskeleton is a flat, membrane-bound ribbon composed of parallel fibrils. The fibril ribbon binds to the inner side of the spiroplasma cytoplasmic membrane, follows the shortest helical line, and extends the entire length of the helix (Charbonneau and Ghiorse, 1984; Trachtenberg et al., 2003; Williamson et al., 1984). The structural unit of the contractile cytoskeleton is a filament composed of pairs of a 59-kDa fib (fibril) gene product (Trachtenberg, 2004; Williamson et al., 1991), while the functional unit of the contractile cytoskeletal ribbon is a fibril composed of an aligned pair of filaments (Trachtenberg, 2004). The internal cytoskeleton acts as a linear motor enabling and controlling the dynamic helicity (Trachtenberg et al., 2003). The dynamics of the elastic fibril filaments, coupled with energy-producing biochemical reactions such as ATP hydrolysis, propagate deformations that generate propulsive forces to drive the swimming movement of the helical spiroplasmas (Berg, 2002; Gilad et al., 2003; Wolgemuth et al., 2003). In addition to the fib gene, another gene from S. citri, scm1, is involved in the motility mechanism (Jacob et al., 1997). The scm1-disrupted mutant generated by Tn4001 insertion mutagenesis forms non-diffuse, sharp-edged colonies, in contrast to the fuzzy colonies of the wild-type bacteria, indicating the loss of motility (Jacob et al., 1997). The two spiroplasma cytoskeletal genes, fib and scm1, have no homologs in


eukaryotes and other prokaryotes (Trachtenberg, 2004), which adds more weight to the uniqueness of spiroplasmas. Usually, spiroplasmas exhibit a random walking motility pattern with a relatively constant flexing frequency (Trachtenberg, 1998), which is referred to as "swimming". The swimming velocity increases with medium viscosity (Daniels et al., 1980). In the presence of certain attractive chemicals, spiroplasmas, as active swimmers, exhibit a straight-line pattern with a concomitant reduction in flexing frequency (Trachtenberg, 1998). The attractants include D-fructose, D-glucose, D-maltose, sucrose, L-alanine, L-aspartate, L-arginine, L-cysteine, L-glutamate, glycine, L-methionine, L-serine, etc. Certain chemicals repel spiroplasmas, including L-histidine, L-leucine, L-phenylalanine, L-proline, L-valine, and lactic acid (Daniels et al., 1980; Trachtenberg, 1998).

In contrast to the swimming movement of spiroplasmas, their close relatives, mycoplasmas, have a distinct gliding movement (Miyata et al., 2002; Wolgemuth et al., 2003). Mycoplasmas are polar cells and are near spherical in shape, having a "tip structure" at the leading end (Trachtenberg, 1998). Several mycoplasma species are known to glide in the direction of the "tip-structure". The underlying mechanism was unknown until the recent identification of a Gli349 protein responsible for the cytadherence and glass binding of M. mobile (Uenoyama et al., 2004). A spike structure was observed to protrude from mycoplasma membrane and attach to the glass surface using a rapid-freeze-and-fracture electron microscopy technique during M. mobile gliding (Miyata and Petersen, 2004). However, it is not clear whether Gli349 is related to the formation of the spike structure. The gliding movement of mycoplasmas is usually slow,


but clear attraction to chemicals was observed. Mycoplasmas are attracted to D-fructose, D-glucose, D-lactose, D-maltose, D-sucrose, L-arginine, and L-asparagine (Kirchhoff, 1992).

1.8 Structural genomics

The release of the first complete genome sequence, that of Haemophilus influenzae (Fleischmann et al., 1995), started a brand-new genomics era, in which whole genome sequencing, comparative genomics, and functional genomics gradually overshadowed the traditional research methods focusing on one or several genes. The complete microbial genome sequence provides all genetic information about the microbe, enables high-throughput data-mining and analysis, forms the basis for bacterial phylogeny and taxonomy, and overall brings microbiological research to a new level. Bacterial genome sequence data have been accumulating at a fast pace. The first large-scale genome-sequencing project, initiated in 1990 by the Harvard Genome Lab in collaboration with the Heidelberg European Molecular Biology Laboratory, sequenced only 214 kb of the M. capricolum genome (Bork et al., 1995) during a 5-year period. Nowadays, with the maturity of the whole genome shotgun (WGS) sequencing strategy and large-scale collaboration and data-handling abilities, bacterial genome sequencing is becoming routine. Owing to their small genomes and clinical and agricultural importance, the mollicutes were among the first organisms whose complete genomes were sequenced.


The complete genome sequence of Mycoplasma genitalium was the second released complete genome (Fraser et al., 1995). As of August 2004, 10 mollicute genomes had been completely sequenced (Table 1.1), spanning the genera Mycoplasma, Ureaplasma, Mesoplasma, and Candidatus Phytoplasma. The accumulation of mollicute genome data continues with several ongoing projects. Mycoplasma genome projects include the rodent polyarthritis pathogen M. arthritidis and the contagious caprine pleuropneumonia (CCPP) pathogen M. capricolum. Spiroplasma genome projects include the citrus stubborn spiroplasma S. citri BR3-3x strain and the corn stunt spiroplasma S. kunkelii CR2-3x strain (http://www.genome.ou.edu/spiro.html). Survey sequencing has been published for the S. kunkelii M2 strain, a close relative of the CR2-3x strain (Bai and Hogenhout, 2002), and an 85-kb genome region was reported for the S. kunkelii CR2-3x strain (Zhao et al., 2003). The proposed genome sequencing project for another mollicute, Spiroplasma melliferum, is expected to provide information for comparative genomics of this bee pathogen with plant pathogenic spiroplasmas. The application of pulsed field gel electrophoresis (PFGE) in DNA separation revealed a few interesting phytoplasma genome features. The sizes of phytoplasma genomes vary considerably, ranging from 530 to 1,350 kb (Neimark and Kirkpatrick, 1993; Marcone et al., 1999), which is close to the range for mycoplasma species (580-1,300 kb) but smaller than the genomes of their closest relatives, the acholeplasmas (~ 1,600 kb) (Razin et al., 1998). The Bermuda grass white leaf phytoplasma has a genome size of 530 kb, which is even smaller than the genome size (580 kb) of M. genitalium that was thought to be the


living cell harboring the smallest genome (Mushegian and Koonin, 1996). Phytoplasmas contain one circular double-stranded chromosomal DNA molecule (Neimark and Kirkpatrick, 1993) and one or more short circular extrachromosomal DNAs (Lee et al., 2000). The GC contents of phytoplasma chromosomal DNA were estimated to be between 23 and 29% based on buoyant density centrifugation (Kollar and Seemüller, 1989) and recently obtained genome sequence data (Oshima et al., 2002; Oshima et al., 2004). Physical maps of genomic DNA have been reported for Western X-disease phytoplasma (Firrao et al., 1996), apple proliferation phytoplasma (Lauer and Seemüller, 2000), sweet potato little leaf phytoplasma (Marcone and Seemüller, 2001), and European stone fruit yellows phytoplasma (Padovan et al., 2000). Much progress has been made in phytoplasma genome sequencing efforts. So far, a complete genome sequence has been reported for OY phytoplasma (Oshima et al., 2002; Oshima et al., 2004). Sample genome sequences have been reported for Western X (WX) phytoplasma (Liefting and Kirkpatrick, 2003), and a complete genome-sequencing project of WX phytoplasma is underway. Genome sequencing projects are currently ongoing for three other phytoplasmas: aster yellows witches' broom (AY-WB) phytoplasma (http://www.oardc.ohio-state.edu/phytoplasma), MBSP, and beet leafhopper transmitted virescence agent (BLTVA). The complete AY-WB phytoplasma genome sequence is available and the final annotation is underway. There are also several phytoplasmas whose genome sequencing projects are on the priority list of the American Phytopathological Society, including clover phyllody (CPh) phytoplasma, elm yellows (EY) phytoplasma, and potato witches' broom (PWB) phytoplasma. These sequencing


projects are expected to begin soon. The genome analysis of phytoplasmas should provide more information on genes, which is important in understanding the pathogenicity to plant hosts and reproduction in both insect vector and plant host cells.
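
GC-content estimates such as those quoted above for phytoplasma chromosomes can be recomputed directly once sequence data are available. The following Python sketch shows the calculation under simple assumptions: ambiguous bases such as N are ignored, and the function name and test sequence are hypothetical.

```python
def gc_percent(sequence):
    """Return the GC content of a DNA sequence as a percentage of A/C/G/T bases."""
    sequence = sequence.upper()
    counted = sum(sequence.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sequence.count("G") + sequence.count("C")
    return 100.0 * gc / counted


# Hypothetical AT-rich fragment: 2 of its 8 bases are G or C.
print(gc_percent("ATGCATAT"))
```

For a whole chromosome the same function can be applied to the concatenated sequence read from a FASTA file.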

1.9 Comparative genomics

Comparative genomics provides an opportunity to identify the common functional contents of genomes and to study the evolutionary relationships between two or more organisms. The first whole genome comparison was done between the first two completely sequenced bacteria, M. genitalium (Fraser et al., 1995) and H. influenzae (Fleischmann et al., 1995), which resulted in the identification of the minimal gene complement of a free-living cell (Mushegian and Koonin, 1996). The concept that mycoplasmas are the smallest living cells appeared in an article published in Scientific American in 1962 (Morowitz and Tourtellotte, 1962). The validation of this concept with identified genes is only possible by comparative genomics. H. influenzae is a gram-negative bacterium with a 1.8 Mb genome (Fleischmann et al., 1995) and M. genitalium is a gram-positive bacterium with a genome of 580 kb, the smallest genome sequenced so far (Fraser et al., 1995). The genes conserved in these two very distantly related bacteria are essential for cellular functions. The comparison resulted in 240 ORFs that are conserved in both genomes. Considering the sequence difference of genes having the same functions and the functional redundancy, a final set of 256 ORFs was considered the minimal set of genes essential for a free-living cell (Mushegian and Koonin, 1996).
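
The comparison described above amounts to finding ORFs that have a counterpart in both genomes. A minimal Python sketch of one commonly used pairing criterion, reciprocal best hits, is given below. It is an illustration only and is not claimed to be the procedure used by Mushegian and Koonin (1996). The ORF identifiers and best-hit tables are hypothetical placeholders for the output of a similarity search such as BLAST.

```python
def reciprocal_best_hits(a_to_b, b_to_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    return sorted(
        (a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a
    )


# Hypothetical best-hit tables: ORF in one genome -> best-scoring ORF in the other.
best_in_b = {"a1": "b1", "a2": "b2", "a3": "b2"}
best_in_a = {"b1": "a1", "b2": "a2", "b3": "a1"}

conserved = reciprocal_best_hits(best_in_b, best_in_a)
print(conserved)  # a3 is excluded because b2's best hit is a2, not a3
```

A reciprocal criterion is stricter than a one-way best hit, so it reduces false pairings between paralogs but can miss true counterparts whose sequences have diverged strongly.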

However, the living environment of the cell has to be taken into account, as mycoplasmas live a parasitic life, absorbing nutrients from their hosts. Comparative analysis of the M. pneumoniae and M. genitalium genomes (Himmelreich et al., 1997) revealed several interesting features of these two organisms and of mollicutes in general. First, all ORFs in M. genitalium are also present in the M. pneumoniae genome. Second, each of the two genomes has 6 segments whose order is not conserved because of translocation via homologous recombination, although the orthologous genes within each segment are well conserved. Third, an additional 236 kb in the M. pneumoniae genome encodes ORFs in three categories: 1) 110 ORFs that are unique to M. pneumoniae; 2) 76 ORFs that are repeated in the M. pneumoniae genome but present in single copies in the M. genitalium genome; and 3) 23 ORFs encoding repetitive sequences that were not annotated in M. genitalium. This study demonstrated, for the first time, the usefulness of comparative genomics of closely related organisms.

Genome sequence data have steadily accumulated for plant pathogenic mollicutes over the past several years. The complete genome of OY phytoplasma has been reported (Oshima et al., 2004). The S. kunkelii and AY-WB phytoplasma genomes are being sequenced, and the gapped genome data are available from websites. S. kunkelii and AY-WB phytoplasma are distantly related, belonging to two different branches of the mollicutes (see 'Phylogeny and evolution' above). On the other hand, both are insect-transmitted plant pathogens. They invade and replicate in the same tissues and cells of insect vectors and plant hosts, and both cause physiological changes in their plant hosts. This led to the hypothesis that S. kunkelii and AY-WB phytoplasma share some common genes related to insect transmission and/or plant pathogenicity. Further, since mycoplasmas are animal and human pathogens and have no insect vectors, the genes shared by S. kunkelii and AY-WB phytoplasma may not have homologs in mycoplasmas. Based on this hypothesis, a comparative genomics study was conducted on the gapped genomes of S. kunkelii and AY-WB phytoplasma and the complete genomes of M. genitalium, M. pneumoniae, M. pulmonis, M. gallisepticum, M. penetrans, U. urealyticum, OY phytoplasma, and M. mycoides subsp. mycoides SC (Bai et al., 2004b). Four deduced proteins fitting this pattern were identified using a BLAST-based strategy (Altschul et al., 1997) and custom-designed programs: polynucleotide phosphorylase (PNPase), cmp-binding factor (CBF), cytosine deaminase, and YlxR protein. PNPase is widely distributed among eukaryotic and prokaryotic organisms and was shown to be a global regulator of virulence factors in Salmonella enterica (Clements et al., 2002). It was speculated that PNPase could have a similar function in plant pathogenic mollicutes (Bai et al., 2004b). Another deduced protein, CBF, could be involved in plasmid replication, as in Staphylococcus aureus (Zhang et al., 1997). Plant pathogenic mollicutes harbor plasmids containing virulence factors (Melcher et al., 1999; Oshima et al., 2002), which suggests a possible involvement of CBF in pathogenicity (Bai et al., 2004b).

In addition to the functional information, some evolutionary data were also obtained from the study (Bai et al., 2004b). Four deduced proteins were shared among all organisms included in the study: ppGpp synthetase, HAD hydrolase, AAA type ATPase, and P-type magnesium transport ATPase. Interestingly, the S. kunkelii and AY-WB phytoplasma versions of these proteins are more closely related to each other than to those of mycoplasmas. Phylogenetic analysis suggested that AAA type ATPase was obtained by phytoplasmas from spiroplasmas via horizontal gene transfer. In both genomes, the gene is located in insertion sequence regions (Bai et al., 2004b).

The application of comparative genomics is becoming more common as more genome sequence data become available. Nowadays, comparative genomics is practiced together with the genome sequencing of almost all organisms, both eukaryotes and prokaryotes. Comparative genome analysis provides useful information about the functional implications of interesting genes and clues about evolution.
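
The selection logic behind the hypothesis in this section, genes present in both plant pathogenic mollicutes but without homologs in mycoplasmas, amounts to a set filter over a presence/absence table. The sketch below illustrates that logic only. It is not the program used in the study (Bai et al., 2004b): the function name shared_only_in, the family labels, and the toy table are hypothetical, and in practice the table would be derived from BLAST hits passing a chosen significance threshold.

```python
from typing import Dict, Set


def shared_only_in(presence: Dict[str, Set[str]],
                   required: Set[str],
                   excluded: Set[str]) -> Set[str]:
    """Return protein families found in every `required` genome and in no `excluded` genome.

    `presence` maps a protein family label to the set of genomes in which a
    significant homolog was detected.
    """
    return {
        family
        for family, genomes in presence.items()
        if required <= genomes and genomes.isdisjoint(excluded)
    }


# Hypothetical genome groups and toy presence table, for illustration only.
PLANT_PATHOGENS = {"S. kunkelii", "AY-WB"}
MYCOPLASMAS = {"M. genitalium", "M. pneumoniae"}

toy_presence = {
    "family_1": {"S. kunkelii", "AY-WB"},                                    # shared, absent from mycoplasmas
    "family_2": {"S. kunkelii", "AY-WB", "M. genitalium", "M. pneumoniae"},  # shared by all genomes
    "family_3": {"AY-WB"},                                                   # found in one pathogen only
}

candidates = shared_only_in(toy_presence, PLANT_PATHOGENS, MYCOPLASMAS)
print(sorted(candidates))
```

Families found in all genomes, such as family_2 in the toy table, are excluded by this filter; they correspond to the second group discussed above, whose value is phylogenetic rather than functional.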

1.10 Functional genomics

The computer algorithm-assisted annotation assigns functions to genome sequences. However, the findings are not conclusive until supported by experimental data. Functional genomics aims to study the functions of genome sequences at the genome level using high-throughput strategies. Several high-throughput analysis techniques have been developed and successfully applied to determine gene functions, including in vivo expression technology (IVET) (Mahan et al., 1993), signature-tagged mutagenesis (STM) (Walsh and Cepko, 1992), and genomic analysis and mapping by in vitro transposition (GAMBIT) (Akerley et al., 1998). These techniques depend on the generation of noticeable phenotypes to identify genes (Chiang et al., 1999). However, the application of these techniques to plant pathogenic mollicutes is limited, if possible at all, because of the lack of efficient transformation tools. So far, such approaches have been possible only for S. citri (Foissac et al., 1997b). Because phytoplasmas cannot be cultured, alternative functional analysis tools were used: data mining to identify candidate proteins, in planta functional screens to identify protein effectors, and focused studies of several proteins to elucidate their functions. These approaches revealed several candidate virulence factors of phytoplasmas (Bai et al., in preparation).

At the start of the project, the AY-WB phytoplasma genome sequence was not yet complete. However, gapped genome sequences have been shown to give analysis results comparable to those of a complete sequence (Selkove et al., 2000). The gapped AY-WB phytoplasma genome was therefore mined for effector proteins. The underlying hypothesis was that secreted and membrane-bound proteins of AY-WB phytoplasma are the most likely to be involved in pathogenicity. Because phytoplasmas are intracellular pathogens of insects and plants (Oshima et al., 2002), these proteins have a better chance of contacting host cells. Proteins can be secreted via several bacterial transport systems, including the type II Sec-dependent secretion pathway (Lai and Kado, 2000). There are no indications of secretion pathways other than type I and type II in the annotated genome of AY-WB phytoplasma (Bai et al., in preparation). Proteins transported by the Sec-dependent secretion pathway have a common feature, the presence of a signal peptide (Fekkes and Driessen, 1999). The N-terminal signal sequence comprises three distinct regions: the charged N-terminus (n-region), the hydrophobic core (h-region), and the C-terminal cleavage domain (c-region) (von Heijne, 1985). The SignalP program was developed to predict the presence of signal peptides using neural network (NN) and hidden Markov model (HMM) algorithms (Nielsen et al., 1997a, 1997b). Combining SignalP with an ORF prediction program, ORF Extractor (Bai et al., 2004b), and several Perl scripts identified 144 candidate effector proteins, which were subjected to in planta functional analysis.

In planta assays were adapted for the functional studies of AY-WB phytoplasma. The Potato Virus X (PVX)-based binary plant transformation vector is an effective molecular tool for transient expression of foreign genes in a plant system. It has also been used for virus-induced gene silencing (VIGS) in plants (Ruiz et al., 1998). Recently, the system has been exploited as a high-throughput functional screening tool (Qutob et al., 2002; Kamoun et al., 2002; Kamoun et al., 2003; Torto et al., 2002; Torto et al., 2003). Using this high-throughput strategy on Nicotiana benthamiana plants, 16 phytoplasma proteins were found either to induce severe necrosis symptoms by themselves or to manipulate the plant defense system, resulting in increased PVX symptoms.
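
The three-region organization of Sec signal peptides described above (n-region, h-region, c-region) can be illustrated with a crude rule-based check. The sketch below is emphatically not SignalP, which relies on trained neural network and hidden Markov models; it is a toy heuristic written for illustration. The function name, the residue classes, and all length thresholds are assumptions chosen for this example, and the c-region test simply looks for small residues at two positions separated by one residue, in the spirit of the (-3, -1) rule.

```python
import re

# Assumed residue class for the hydrophobic core; real predictors use trained models.
H_REGION = re.compile(r"[AVLIFMWC]{8,}")   # run of at least 8 hydrophobic residues
C_REGION = re.compile(r"[AGS].[AGS]")      # small residues at positions -3 and -1


def has_signal_like_n_terminus(protein: str, max_len: int = 35) -> bool:
    """Toy check for an n-region, h-region and c-region in the first `max_len` residues."""
    head = protein.upper()[:max_len]
    if not head.startswith("M"):
        return False
    # n-region: at least one positively charged residue right after the start Met
    if not any(residue in "KR" for residue in head[1:8]):
        return False
    # h-region: a continuous hydrophobic stretch
    h_match = H_REGION.search(head)
    if h_match is None:
        return False
    # c-region: a small-X-small motif within 12 residues after the hydrophobic core
    tail = head[h_match.end():h_match.end() + 12]
    return C_REGION.search(tail) is not None


if __name__ == "__main__":
    # Both sequences are invented for illustration.
    print(has_signal_like_n_terminus("MKKTLLIALLAVVLSFSASAQDTPEKGG"))
    print(has_signal_like_n_terminus("MSDEEQTPHNGEDLKRYSTE"))
```

A heuristic of this kind would produce many false positives and negatives on real data, which is why a trained predictor was used for the actual screen.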

In-depth functional analysis was conducted on several proteins. These proteins were selected based on the computer-predicted presence of a nuclear localization signal (NLS), which is specific to eukaryotic proteins. Since AY-WB phytoplasma is a prokaryote without a nucleus, phytoplasma proteins with an NLS are putative virulence factors because they may target the nuclei of insect and plant host cells and affect host gene transcription. Indeed, two of the proteins (A11 and A30) were shown to target N. benthamiana nuclei by transient expression of yellow fluorescent protein (YFP) fusion proteins in plant leaves. Transport of the YFP fusion proteins into plant nuclei could occur via an importin-dependent pathway. Two N. benthamiana importin α homologs were identified by data mining (Kanneganti et al., in preparation). These importin α homologs were silenced in N. benthamiana plants by VIGS using the tobacco rattle virus (TRV) system (Ratcliff et al., 2001; Dinesh-Kumar et al., 2003). Transport of the fusion proteins was disrupted in importin α-silenced N. benthamiana plants, which suggested dependence on the importin pathway. The yeast two-hybrid system is employed to determine whether the proteins interact directly with importin α of N. benthamiana.

The genes encoding the two proteins were expressed by AY-WB phytoplasma during infection of insects and plants: transcripts of the expected sizes were detectable in total RNA isolated from infected insects and plants. One protein, A11, was produced in FLAG-tagged form in Escherichia coli strain XL1-Blue and purified on affinity columns. Antibody against FLAG-tagged A11 was raised in mice and used for immunolabeling studies. Confocal microscopy images of immunofluorescence-labeled AY-WB phytoplasma-infected plant and insect tissues revealed that A11 protein is present in plant phloem tissues and various insect tissues, including those important for insect transmission, such as the midgut and salivary glands.
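
The text does not name the program used to predict nuclear localization signals. As a rough illustration of what such a prediction involves, the sketch below scans a protein for the widely cited classical monopartite consensus K-(K/R)-X-(K/R). The choice of this motif, the function name, and the example sequences are assumptions made for this example and do not describe the analysis actually performed.

```python
import re
from typing import List, Tuple

# Classical monopartite NLS consensus K-(K/R)-X-(K/R); the lookahead allows
# overlapping matches to be reported.
MONOPARTITE_NLS = re.compile(r"(?=(K[KR].[KR]))")


def find_monopartite_nls(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based position, matched residues) for each consensus match."""
    return [
        (match.start() + 1, match.group(1))
        for match in MONOPARTITE_NLS.finditer(protein.upper())
    ]


if __name__ == "__main__":
    # Short peptide containing the SV40 large T antigen NLS (PKKKRKV).
    print(find_monopartite_nls("MAPKKKRKVE"))
    # Invented peptide without basic residues.
    print(find_monopartite_nls("MSTAGE"))
```

A motif match of this kind is only a starting point; as described above, nuclear targeting of A11 and A30 was confirmed experimentally with YFP fusions.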

1.11 Research objectives

The objectives of my Ph.D. research are to identify and characterize spiroplasma and phytoplasma genes involved in insect transmission and plant pathogenicity using various genomic tools, including genome sequencing, sequence annotation, comparative genomics, and functional genomics. This dissertation consists of six chapters, including this introduction. Each chapter corresponds to one project contributing to the research objectives.

Chapter 1 is this introduction, conveying the background knowledge and the research summary of the plant pathogenic mollicutes.

Chapter 2 is titled "A genome sequence survey of the mollicute corn stunt spiroplasma Spiroplasma kunkelii", which was published in FEMS Microbiology Letters (Bai and Hogenhout, 2002). It reports the survey sequencing of the S. kunkelii M2 strain and some features of the genome that it revealed.

Chapter 3 is titled "Complete genome sequences of aster yellows witches' broom (AY-WB) phytoplasma and comparison with onion yellows (OY) phytoplasma". It summarizes the results of the AY-WB phytoplasma genome sequencing project, including genome data, genome annotation, metabolic pathway reconstruction, and comparative genomics.

Chapter 4 is titled "Identification and characterization of traE genes of Spiroplasma kunkelii", which was published in Gene (Bai et al., 2004a). It covers gene identification from S. kunkelii gapped genome sequences, gene sequence analysis, and characterization of the traE genes in S. kunkelii.

Chapter 5 is titled "Comparative genomics identifies genes shared by distantly related insect-transmitted plant pathogenic mollicutes", which was published in FEMS Microbiology Letters (Bai et al., 2004b). It reports the in silico comparative genomics study used to identify potential pathogenicity-related genes in plant pathogenic mollicutes. The complete genomes of animal and human pathogenic mycoplasmas were used in the study.

Chapter 6 is titled "Functional genomics identifies phytoplasma effector proteins". It includes sections on data mining for effector proteins, high-throughput functional screening, effector protein localization and transport in plants, effector gene expression during infection of insects and plants, and the function of effector proteins in plants.

1.12 References

Akerley, B.J., Rubin, E.J., Camilli, A., Lampe, D.J., Robertson, H.M. and Mekalanos, J.J. (1998) Systematic identification of essential genes by in vitro mariner mutagenesis. Proc. Natl. Acad. Sci. USA 95, 8927-8932. Alma, A., Bosco, D., Danielli, A., Bertaccini, A., Vibio, M. and Arzone, A. (1997) Identification of phytoplasmas in eggs, nymphs and adults of Scaphoideus titanus Ball reared on healthy plants. Insect Mol. Biol. 6, 115-121. Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402. Ammar, El-D., Fulton, D., Bai, X., Meulia, T. and Hogenhout, S.A. (2004) An attachment tip and pili-like structures in insect- and plant-pathogenic spiroplasmas of the class Mollicutes. Arch. Microbiol. 181, 97-105. Anbutsu, H. and Fukatsu, T. (2003) Population dynamics of male-killing and non-male-killing spiroplasmas in Drosophila melanogaster. Appl. Environ. Microbiol. 69, 1428-1434.

André, A., Maccheroni, W., Doignon, F., Garnier, M. and Renaudin, J. (2003) Glucose and trehalose PTS permeases of Spiroplasma citri probably share a single IIA domain, enabling the spiroplasma to adapt quickly to carbohydrate changes in its environment. Microbiology 149, 2687-2696. Bai, X., Fazzolari, T. and Hogenhout, S.A. (2004a) Identification and characterization of traE genes of Spiroplasma kunkelii. Gene 336, 81-91. Bai, X. and Hogenhout, S.A. (2002) A genome sequence survey of the mollicute corn stunt spiroplasma Spiroplasma kunkelii. FEMS Microbiol. Lett. 210, 7-17. Bai, X., Zhang, J., Holford, I.R. and Hogenhout, S.A. (2004b) Comparative genomics identifies genes shared by distantly related insect-transmitted plant pathogenic mollicutes. FEMS Microbiol. Lett. 235, 249-258. Beanland, L., Hoy, C.W., Miller, S.A. and Nault, L.R. (2000) Influence of aster yellows phytoplasma on the fitness of aster leafhopper (Homoptera: Cicadellidae). Ann. Entomol. Soc. Am. 93, 271-276. Berg, H.C. (2002) How Spiroplasma might swim. J. Bacteriol. 184, 2063-2064. Berg, M., Melcher, U. and Fletcher, J. (2001) Characterization of Spiroplasma citri adhesion related protein SARP1, which contains a domain of a novel family designated sarpin. Gene 275, 57-64. Bork, P., Ouzounis, C., Casari, G., Schneider, R., Sander, C., Dolan, M., Gilbert, W. and Gillevet, P.M. (1995) Exploring the Mycoplasma capricolum genome: a minimal cell reveals its physiology. Mol. Microbiol. 16, 955-967. Boutareaud, A., Danet, J.L., Garnier, M. and Saillard, C. (2004) Disruption of a gene predicted to encode a solute binding protein of an ABC transporter reduces transmission of Spiroplasma citri by the leafhopper Circulifer haematoceps. Appl. Environ. Microbiol. 70, 3960-3967. Bové, J.M. (1997) Spiroplasmas: infectious agents of plants, arthropods and vertebrates. Wien. Klin. Wochenschr. 109, 604-612. Bové, J.M., Renaudin, J., Saillard, C., Foissac, X. and Garnier, M.
(2003) Spiroplasma citri, a plant pathogenic mollicute: relationships with its two hosts, the plant and the leafhopper vector. Annu. Rev. Phytopathol. 41, 482-500. Braun, E.J. and Sinclair, W.A. (1978) Translocation in phloem necrosis-diseased American elm seedlings. Phytopathology 68, 1733-1737.

Castano, S., Blaudez, D., Desbat, B., Dufourcq, J. and Wroblewski, H. (2002) Secondary structure of spiralin in solution, at the air/water interface, and in interaction with lipid monolayers. Biochim. Biophys. Acta 1562, 45-56. Catlin, P.B., Olson, E.A. and Beutel, J.A. (1975) Reduced translocation of carbon and nitrogen from leaves with symptoms of pear curl. J. Am. Soc. Hortic. Sci. 100, 184-187. Chambaud, I., Heilig, R., Ferris, S., Barbe, V., Samson, D., Galisson, F., Moszer, I., Dybvig, K., Wroblewski, H., Viari, A., Rocha, E.P.C. and Blanchard, A. (2001) The complete genome sequence of the murine respiratory pathogen Mycoplasma pulmonis. Nucleic Acids Res. 29, 2145-2153. Chang, C.J. (1989) Nutrition and cultivation of spiroplasmas. In: The Mycoplasmas (Whitcomb, R.F. and Tully, J.G. Ed.). Vol. 5, pp. 201-241. Academic Press, New York, NY. Chang, C.-J. (1998) Pathogenicity of aster yellows phytoplasma and Spiroplasma citri on periwinkle. Phytopathology 88, 1347-1350. Chang, C.-J. and Lee, I.-M. (1995) Pathogenesis of diseases associated with mycoplasma-like organisms. In: Pathogenesis and Host Specificity in Plant Diseases (Singh, U.S., Singh, R.P. and Kohmoto, K. Ed.). Vol. 1, pp. 237-246. Elsevier Science Publishing Co., New York, NY. Charbonneau, D.L. and Ghiorse, W.C. (1984) Ultrastructure and location of cytoplasmic fibrils in Spiroplasma floricola. Curr. Microbiol. 10, 65-72. Chastel, C. and Humphery-Smith, I. (1991) Mosquito spiroplasma. In: Advances in disease vector research (Harris, K.F. Ed.). Vol. 7, pp. 149-206. Springer-Verlag Inc., New York, NY. Chevalier, C., Saillard, C. and Bové, J.M. (1990) Organization and nucleotide sequences of the Spiroplasma citri genes of ribosomal protein S2, elongation factor Ts, spiralin, phosphofructokinase, pyruvate kinase, and an unidentified protein. J. Bacteriol. 172, 2693-2703. Chiang, S.L., Mekalanos, J.J. and Holden, D.W. (1999) In vivo genetic analysis of bacterial virulence. Annu. Rev. Microbiol.
53, 129-154. Chiykowski, L.N. and Sinha, R.C. (1990) Differentiation of MLO diseases by means of symptomology and vector transmission. Zbl. Bakt. Suppl. 20, 280-287.

Clements, M.O., Eriksson, S., Thompson, A., Lucchini, S., Hinton, J.C., Normark, S. and Rhen, M. (2002) Polynucleotide phosphorylase is a global regulator of virulence and persistency in Salmonella enterica. Proc. Natl. Acad. Sci. USA 99, 8784-8789. Daniels, M.J., Longland, J.M. and Gilbart, J. (1980) Aspects of motility and chemotaxis in spiroplasmas. J. Gen. Microbiol. 118, 429-436. Daniels, M.J. (1979) A simple technique for assaying certain microbial phytotoxins and its application to the study of toxins by Spiroplasma citri. J. Gen. Microbiol. 114, 323-328. Daniels, M.J. (1983) Mechanisms of spiroplasma pathogenicity. Annu. Rev. Phytopathol. 21, 29-43. Dinesh-Kumar, S.P., Anandalakshmi, R., Marathe, R., Schiff, M. and Liu, Y. (2003) Virus-induced gene silencing. Methods Mol. Biol. 236, 287-294. Doi, Y., Teranaka, M., Yora, K. and Asuyama, H. (1967) Mycoplasma- or PLT group-like microorganisms found in the phloem elements of plants infected with mulberry dwarf, potato witches' broom, aster yellows, or paulownia witches' broom. Ann. Phytopathol. Soc. Jpn. 33, 259-266. Doi, M., Wachi, M., Ishino, F., Tomioka, S., Ito, M., Sakagami, Y., Suzuki, A. and Matsuhashi, M. (1988) Determination of the DNA sequence of the mreB gene and of the gene products of the mre region that function in formation of the rod shape of Escherichia coli cells. J. Bacteriol. 170, 4619-4624. Duret, S., Berho, N., Danet, J.L., Garnier, M. and Renaudin, J. (2003) Spiralin is not essential for helicity, motility, or pathogenicity but is required for efficient transmission of Spiroplasma citri by its leafhopper vector Circulifer haematoceps. Appl. Environ. Microbiol. 69, 6225-6234. Ebbert, M.A. and Nault, L.R. (1994) Improved overwintering ability in Dalbulus maidis (Homoptera: Cicadellidae) vectors infected with Spiroplasma kunkelii (Mycoplasmatales: Spiroplasmataceae). Environ. Entomol. 23, 634-644. Falah, M. and Gupta, R.S.
(1997) Phylogenetic analysis of mycoplasmas based on Hsp70 sequences: cloning of the dnaK (hsp70) gene region of Mycoplasma capricolum. Int. J. Syst. Bacteriol. 47, 38-45. Fekkes, P. and Driessen, A.J. (1999) Protein targeting to the bacterial cytoplasmic membrane. Microbiol. Mol. Biol. Rev. 63, 161-173.

Firrao, G., Smart, C.D. and Kirkpatrick, B.C. (1996) Physical map of the Western X-disease phytoplasma chromosome. J. Bacteriol. 178, 3985-3988. Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, A.R., Bult, C.J., Tomb, J.-F., Dougherty, B.A., Merrick, J.M., McKenney, K., Sutton, G., FitzHugh, W., Fields, C., Gocayne, J.D., Scott, J., Shirley, R., Liu, L.-I., Glodek, A., Kelley, J.M., Weidman, J.F., Phillips, C.A., Spriggs, T., Hedblom, E., Cotton, M.D., Utterback, T.R., Hanna, M.C., Nguyen, D.T., Saudek, D.M., Brandon, R.C., Fine, L.D., Fritchman, J.L., Fuhrmann, J.L., Geoghagen, N.S.M., Gnehm, C.L., McDonald, L.A., Small, K.V., Fraser, C.M., Smith, H.O. and Venter, J.C. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269, 496-512. Fletcher, J., Wayadande, A.C., Melcher, U. and Ye, F. (1998) The phytopathogenic mollicute-insect vector interface: A closer look. Phytopathology 88, 1351-1358. Foissac, X., Bové, J.M. and Saillard, C. (1997a) Sequence analysis of Spiroplasma phoeniceum and Spiroplasma kunkelii spiralin genes and comparison with other spiralin genes. Curr. Microbiol. 35, 240-243. Foissac, X., Danet, J.L., Saillard, C., Gaurivaud, P., Laigret, F., Pare, C. and Bové, J.M. (1997b) Mutagenesis by insertion of Tn4001 into the genome of Spiroplasma citri: Characterization of mutants affected in plant pathogenicity and transmission to the plant by the leafhopper vector Circulifer haematoceps. Mol. Plant-Microbe Interact. 10, 454-461. Fraser, C.M., Gocayne, J.D., White, O., Adams, M.D., Clayton, R.A., Fleischmann, R.D., Bult, C.J., Kerlavage, A.R., Sutton, G.G., Kelley, J.M., Fritchman, J.L., Weidman, J.F., Small, K.V., Sandusky, M., Fuhrmann, J.L., Nguyen, D.T., Utterback, T., Saudek, D.M., Phillips, C.A., Merrick, J.M., Tomb, J., Dougherty, B.A., Bott, K.F., Hu, P.C., Lucier, T.S., Peterson, S.N., Smith, H.O. and Venter, J.C. (1995) The minimal gene complement of Mycoplasma genitalium.
Science 270, 397-403. Fukatsu, T., Tsuchida, T., Nikoh, N. and Koga, R. (2001) Spiroplasma symbiont of the pea aphid, Acyrthosiphon pisum (Insecta: Homoptera). Appl. Environ. Microbiol. 67, 1284-1291. Gabridge, M.G., Chandler, K.F. and Daniels, M.J. (1985) Pathogenicity factors in mycoplasmas and spiroplasmas. In: The Mycoplasmas (Razin, S. and Barile, M.F. Ed.). Vol. 4, pp. 313-351. Academic Press, New York, NY. Gasparich, G.E. (2002) Spiroplasmas: evolution, adaptation and diversity. Front Biosci. 7, 619-640.

Gasparich, G.E., Whitcomb, R.F., Dodge, D., French, F.E., Glass, J. and Williamson, D.L. (2004) The genus Spiroplasma and its non-helical descendants: phylogenetic classification, correlation with phenotype and roots of the Mycoplasma mycoides clade. Int. J. Syst. Evol. Microbiol. 54, 893-918. Gaurivaud, P., Danet, J.L., Laigret, F., Garnier, M. and Bové, J.M. (2000a) Fructose utilization and phytopathogenicity of Spiroplasma citri. Mol. Plant-Microbe Interact. 13, 1145-1155. Gaurivaud, P., Laigret, F., Garnier, M. and Bové, J.M. (2000b) Fructose utilization and pathogenicity of Spiroplasma citri: characterization of the fructose operon. Gene 252, 61-69. Gaurivaud, P., Laigret, F., Garnier, M. and Bové, J.M. (2001) Characterization of FruR as a putative activator of the fructose operon of Spiroplasma citri. FEMS Microbiol. Lett. 198, 73-78. Geigenberger, P., Lerchl, J., Stitt, M. and Sonnewald, U. (1996) Phloem-specific expression of pyrophosphatase inhibits long-distance transport of carbohydrate and amino acids in tobacco plants. Plant Cell Environ. 19, 43-55. Gilad, R., Porat, A. and Trachtenberg, S. (2003) Motility modes of Spiroplasma melliferum BC3: a helical, wall-less bacterium driven by a linear motor. Mol. Microbiol. 47, 657-669. Glass, J.I., Lefkowitz, E.J., Glass, J.S., Heiner, C.R., Chen, E.Y. and Cassell, G.H. (2000) The complete sequence of the mucosal pathogen Ureaplasma urealyticum. Nature 407, 757-762. Gundersen, D.E., Lee, I.-M., Schaff, D.A., Harrison, N.A., Chang, C.J., Davis, R.E. and Kingsbury, D.T. (1996) Genomic diversity and differentiation among phytoplasma strains in 16S rRNA groups I (aster yellows and related phytoplasmas) and III (X-disease and related phytoplasmas). Int. J. Syst. Bacteriol. 46, 64-75. Gussie, J.S., Fletcher, J. and Claypool, P.L. (1995) Movement and multiplication of Spiroplasma kunkelii in corn. Phytopathology 85, 1093-1098. Hackett, K.J. and Clark, T.B. (1989) Ecology of spiroplasmas.
In: The Mycoplasmas (Whitcomb, R.F. and Tully, J.G. Ed.), Vol. V, Spiroplasmas, Acholeplasmas, and Mycoplasmas of Plants and Arthropods. pp. 113-200. Academic Press, San Diego, CA.

Harris, K.F. (1979) Leafhoppers and aphids as biological vectors: Vector-virus relationships. In: Leafhopper Vectors and Plant Disease Agents (Maramorosch, K. and Harris, K.F. Ed.). pp. 217-308. Academic Press, New York, NY. Henriquez, P., Jeffers, D. and Seal, S. (1996) Detection of corn stunt mixed infections in Central America using ELISA and PCR techniques. Phytopathology 86, S58. Himmelreich, R., Hilbert, H., Plagens, H., Pirkl, E., Li, B.C. and Herrmann, R. (1996) Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res. 24, 4420-4449. Himmelreich, R., Plagens, H., Hilbert, H., Reiner, B. and Herrmann, R. (1997) Comparative analysis of the genomes of the bacteria Mycoplasma pneumoniae and Mycoplasma genitalium. Nucleic Acids Res. 25, 701-712. Hoy, C.W., Heady, S.E. and Koch, T.A. (1992) Species composition, phenology, and possible origins of leafhoppers (Cicadellidae) in Ohio vegetable crops. J. Econ. Entomol. 85, 2336-2343. Hruska, A.J. and Gomez-Peralta, M. (1997) Maize response to corn leafhopper (Homoptera: Cicadellidae) infestation and achaparramiento disease. J. Econ. Entomol. 90, 604-610. IRPCM Phytoplasma/Spiroplasma Working Team – Phytoplasma Taxonomy Group. (2004) 'Candidatus Phytoplasma', a taxon for the wall-less, non-helical prokaryotes that colonize plant phloem and insects. Int. J. Syst. Evol. Microbiol. 54, 1243-1255. Jacob, C., Nouzieres, F., Duret, S., Bové, J.M. and Renaudin, J. (1997) Isolation, characterization, and complementation of a motility mutant of Spiroplasma citri. J. Bacteriol. 179, 4802-4810. Jaffe, J.D., Stange-Thomann, N., Smith, C., DeCaprio, D., Fisher, S., Butler, J., Calvo, S., Elkins, T., FitzGerald, M.G., Hafez, N., Kodira, C.D., Major, J., Wang, S., Wilkinson, J., Nicol, R., Nusbaum, C., Birren, B., Berg, H.C. and Church, G.M. (2004) The complete genome and proteome of Mycoplasma mobile. Genome Res. 14, 1447-1461. Jensen, D.D. (1959) A plant virus lethal to its vector.
Virology 8, 164-175. Jones, L.J., Carballido-Lopez, R. and Errington, J. (2001) Control of cell shape in bacteria: helical, actin-like filaments in Bacillus subtilis. Cell 104, 913-922.

Kamla, V., Henrich, B. and Hadding, U. (1996) Phylogeny based on elongation factor Tu reflects the phenotypic features of mycoplasmas better than that based on 16S rRNA. Gene 171, 83-87. Kamoun, S., Dong, S., Hamada, W., Huitema, E., Kinney, D., Morgan, W.R., Styer, A., Testa, A. and Torto, T.A. (2002) From sequence to phenotype: functional genomics of Phytophthora. Can. J. Plant Pathol. 24, 6-9. Kamoun, S., Hamada, W. and Huitema, E. (2003) Agrosuppression: a bioassay for the hypersensitive response suited to high-throughput screening. Mol. Plant-Microbe Interact. 16, 7-13. Kawakita, H., Saiki, T., Wei, W., Mitsuhashi, W., Watanabe, K. and Sato, M. (2000) Identification of mulberry dwarf phytoplasmas in the genital organs and eggs of leafhopper Hishimonoides sellatiformis. Phytopathology 90, 909-914. Kirchhoff, H. (1992) Motility. In: Mycoplasmas: molecular biology and pathogenesis (Maniloff, J., McElhaney, R.N., Finch, L.R. and Baseman, J.B. Ed.). pp. 289-306. Am. Soc. Microbiol. Washington, DC. Kirkpatrick, B.C. (1989) In Plant-Microbe Interactions: Molecular and Genetics Perspectives (Nester, E.W. Ed.), vol. 3, pp. 241-293. McGraw-Hill, New York, NY. Kollar, A. and Seemüller, E. (1989) Base composition of the DNA of Mycoplasma-like organisms associated with various plant diseases. Phytopathology 127, 177-186. Lai, C.-M. and Kado, C.I. (2000) The T-pilus of Agrobacterium tumefaciens. Trends Microbiol. 8, 361-369. Lartigue, C., Duret, S., Garnier, M. and Renaudin, J. (2002) New plasmid vectors for specific gene targeting in Spiroplasma citri. Plasmid 48, 149-159. Lauer, U. and Seemüller, E. (2000) Physical map of the chromosome of the apple proliferation phytoplasma. J. Bacteriol. 182, 1415-1418. Lee, I.-M., Davis, R.E. and Gundersen-Rindal, D.E. (2000) Phytoplasma: phytopathogenic mollicutes. Annu. Rev. Microbiol. 54, 221-255. Lee, I.-M., Gundersen-Rindal, D.E., Davis, R.E., Bottner, K.D., Marcone, C. and Seemüller, E.
(2004) 'Candidatus Phytoplasma asteris', a novel phytoplasma taxon associated with aster yellows and related diseases. Int. J. Syst. Evol. Microbiol. 54, 1037-1048.

Lepka, P., Stitt, M., Moll, E. and Seemüller, E. (1999) Effect of phytoplasmal infection on concentration and translocation of carbohydrates and amino acids in periwinkle and tobacco. Physiol. Mol. Plant Pathol. 55, 59-68. Liefting, L.W. and Kirkpatrick, B.C. (2003) Cosmid cloning and sample sequencing of the genome of the uncultivable mollicute, Western X-disease phytoplasma, using DNA purified by pulsed-field gel electrophoresis. FEMS Microbiol. Lett. 221, 203-211. Lim, P.-O. and Sears, B.B. (1992) Evolutionary relationships of plant-pathogenic mycoplasmalike organism and Acholeplasma laidlawii deduced from two ribosomal protein gene sequences. J. Bacteriol. 174, 2606-2611. Madden, L.V. and Nault, L.R. (1983) Differential pathogenicity of corn stunting mollicutes to leafhopper vectors in Dalbulus and Baldulus species. Phytopathology 73, 1608-1614. Mahan, M.J., Slauch, J.M. and Mekalanos, J.J. (1993) Selection of bacterial virulence genes that are specifically induced in host tissues. Science 259, 686-688. Maniloff, J. (1996) The minimal cell genome: "on being the right size." Proc. Natl. Acad. Sci. USA 93, 10004-10006. Marcone, C., Neimark, A., Ragozzino, A., Lauer, U. and Seemüller, E. (1999) Chromosome sizes of phytoplasmas composing major phylogenetic groups and subgroups. Phytopathology 89, 805-810. Marcone, C. and Seemüller, E. (2001) A chromosome map of the European stone fruit yellows phytoplasma. Microbiology 147, 1213-1221. Melcher, U., Sha, Y., Ye, F. and Fletcher, J. (1999) Mechanisms of spiroplasma genome variation associated with SpV1-like viral DNA inferred from sequence comparisons. Microb. Comp. Genomics 4, 29-46. Miyata, M. and Petersen, J.D. (2004) Spike structure at the interface between gliding Mycoplasma mobile cells and glass surfaces visualized by rapid-freeze-and-fracture electron microscopy. J. Bacteriol. 186, 4382-4386. Miyata, M., Ryu, W. and Berg, H.C. (2002) Force and velocity of Mycoplasma mobile. J. Bacteriol. 184, 1827-1832.
Morowitz, H.J. and Tourtellotte, M.E. (1962) The smallest living cells. Sci. Am. 206, 117-126.

Murral, D.J. (1994) M. S. thesis, The Ohio State University. Mushegian, A.R. and Koonin, E.V. (1996) A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Proc. Natl. Acad. Sci. USA 93, 10268-10273. Nault, L.R. (1997) Arthropod transmission of plant viruses: A new synthesis. Ann. Entomol. Soc. Am. 90, 521-541. Nault, L.R., Madden, L.V., Styer, W.E., Triplehorn, B.W., Shambaugh, G.F. and Heady, S.E. (1984) Pathogenicity of corn stunt spiroplasma and maize bushy stunt mycoplasma to their vector, Dalbulus longulus. Phytopathology 74, 977-979. Neimark, H. and Kirkpatrick, B.C. (1993) Isolation and characterization of full-length chromosomes from non-culturable plant-pathogenic Mycoplasma-like organisms. Mol. Microbiol. 7, 21-28. Nielsen, H., Engelbrecht, J., Brunak, S. and von Heijne, G. (1997a) Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10, 1-6. Nielsen, H., Engelbrecht, J., Brunak, S. and von Heijne, G. (1997b) A neural network method for identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Int. J. Neural. Sys. 8, 581-599. Oshima, K., Miyata, S., Sawayanagi, T., Kakizawa, S., Nishigawa, H., Jung, H.-Y., Furuki, K., Yanazaki, M., Suzuki, S., Wei, W., Kuboyama, T., Ugaki, M. and Namba, S. (2002) Minimal set of metabolic pathways suggested from the genome of Onion Yellows phytoplasma. J. Gen. Plant Pathol. 68, 225-236. Oshima, K., Kakizawa, S., Nishigawa, H., Jung, H.Y., Wei, W., Suzuki, S., Arashida, R., Nakata, D., Miyata, S., Ugaki, M. and Namba, S. (2004) Reductive evolution suggested from the complete genome sequence of a plant-pathogenic phytoplasma. Nat. Genet. 36, 27-29. Özbek, E., Miller, S.A., Meulia, T. and Hogenhout, S.A. (2003) Infection and replication sites of Spiroplasma kunkelii (Class: Mollicutes) in midgut and Malpighian tubules of the leafhopper Dalbulus maidis. J. Invertebr. Pathol. 82, 167-175.
Padovan, A.C., Firrao, G., Schneider, B. and Gibb, K.S. (2000) Chromosome mapping of the sweet potato little leaf phytoplasma reveals genome heterogeneity within the phytoplasmas. Microbiology 146, 893-902.


Papazisi, L., Gorton, T.S., Kutish, G., Markham, P.F., Browning, G.F., Nguyen, D.K., Swartzell, S., Madan, A., Mahairas, G. and Geary, S.J. (2003) The complete genome sequence of the avian pathogen Mycoplasma gallisepticum strain R(low). Microbiology (Reading, Engl.) 149, 2307-2316.
Purcell, A.H. (1982) Insect vector relationship with procaryotic plant pathogens. Annu. Rev. Phytopathol. 20, 397-417.
Purcell, A.H. (1988) Increased survival of Dalbulus maidis Delong & Wolcott, a specialist on maize, on non-host plants infected with mollicute plant pathogens. Entomol. Exp. Appl. 46, 187-196.
Purcell, A.H. and Nault, L.R. (1991) Interactions among plant pathogenic prokaryotes, plants, and insect vectors. In: Microbial Mediation of Plant-Herbivore Interactions (Barbosa, P., Krischik, V.A. and Jones, C.G. Ed.) pp. 383-405. John Wiley & Sons, Inc. Indianapolis, IN.
Qutob, D., Kamoun, S. and Gijzen, M. (2002) Expression of a Phytophthora sojae necrosis-inducing protein occurs during transition from biotrophy to necrotrophy. Plant J. 32, 361-373.
Ratcliff, F., Martin-Hernandez, A.M. and Baulcombe, D.C. (2001) Tobacco rattle virus as a vector for analysis of gene function by silencing. Plant J. 25, 237-245.
Razin, S., Yogev, D. and Naot, Y. (1998) Molecular biology and pathogenicity of mycoplasmas. Microbiol. Mol. Biol. Rev. 62, 1094-1156.
Renaudin, J., Marais, A., Verdin, E., Duret, S., Foissac, X., Laigret, F. and Bové, J.M. (1995) Integrative and free Spiroplasma citri oriC plasmids: expression of the Spiroplasma phoeniceum spiralin in Spiroplasma citri. J. Bacteriol. 177, 2800-2877.
Renaudin, J. (2002) Extrachromosomal elements and gene transfer. In: Molecular Biology and Pathogenicity of Mycoplasmas (Razin, S. and Herrmann, R. Ed.). pp. 347-370. Kluwer Academic/Plenum, New York, NY.
Ruiz, M.T., Voinnet, O. and Baulcombe, D.C. (1998) Initiation and maintenance of virus-induced gene silencing. Plant Cell 10, 937-946.
Saglio, P., L'hospital, M., Lafleche, D., Dupont, G., Bové, J.M., Tully, J.G. and Freundt, E.A. (1973) Spiroplasma citri gen. and sp. n.: a mycoplasma-like organism associated with 'stubborn' disease of citrus. Int. J. Syst. Bacteriol. 23, 191-204.


Saillard, C., Vignault, J.C., Bové, J.M., Raie, A., Tully, J.G., Williamson, D.L., Fos, A., Garnier, M., Gadeau, A., Carle, P. and Whitcomb, R.F. (1987) Spiroplasma phoeniceum sp. nov., a new plant-pathogenic species from Syria. Int. J. Syst. Bacteriol. 37, 106-115.
Sasaki, Y., Ishikawa, J., Yamashita, A., Oshima, K., Kenri, T., Furuya, K., Yoshino, C., Horino, A., Shiba, T., Sasaki, T. and Hattori, M. (2002) The complete genomic sequence of Mycoplasma penetrans, an intracellular bacterial pathogen in humans. Nucleic Acids Res. 30, 5293-5300.
Schneider, B., Gibb, K.S. and Seemüller, E. (1997) Sequence and RFLP analysis of the elongation factor Tu gene used in differentiation and classification of phytoplasmas. Microbiology 143, 3381-3389.
Seemüller, E., Garnier, M. and Schneider, B. (2002) Mycoplasmas of plants and insects. In: Molecular biology and pathogenicity of mycoplasmas (Razin, S. and Herrmann, R. Ed.). vol. 1, pp. 91-115. Kluwer Academic/Plenum Publishers, New York, NY.
Selkov, E., Overbeek, R., Kogen, Y., Chu, L., Vonstein, V., Holmes, D., Silver, S., Haselkorn, R. and Fonstein, M. (2000) Functional analysis of gapped microbial genomes: Amino acid metabolism of Thiobacillus ferrooxidans. Proc. Natl. Acad. Sci. USA 97, 3509-3514.
Smart, C.D., Schneider, B., Blomquist, C.L., Guerra, L.J., Harrison, N.A., Ahrens, U., Lorenz, K.-H., Seemüller, E. and Kirkpatrick, B.C. (1996) Phytoplasma-specific PCR primers based on sequences of the 16S-23S rRNA spacer region. Appl. Environ. Microbiol. 62, 2988-2993.
Thiaucourt, F., Lorenzon, S., David, A. and Breard, A. (2000) Phylogeny of the Mycoplasma mycoides cluster as shown by sequencing of a putative membrane protein gene. Vet. Microbiol. 72, 251-268.
Torto, T.A., Rauser, L. and Kamoun, S. (2002) The pipg1 gene of the oomycete Phytophthora infestans encodes a fungal-like endopolygalacturonase. Curr. Genet. 40, 385-390.
Torto, T.A., Li, S., Styer, A., Huitema, E., Testa, A., Gow, N.A., van West, P. and Kamoun, S. (2003) EST mining and functional expression assays identify extracellular effector proteins from the plant pathogen Phytophthora. Genome Res. 13, 1675-1685.
Trachtenberg, S. (1998) Mollicutes – wall-less bacteria with internal cytoskeletons. J. Struct. Biol. 12, 244-256.


Trachtenberg, S. (2004) Shaping and moving a spiroplasma. J. Mol. Microbiol. Biotechnol. 7, 78-87.
Trachtenberg, S., Gilad, R. and Geffen, N. (2003) The bacterial linear motor of Spiroplasma melliferum BC3: from single molecules to swimming cells. Mol. Microbiol. 47, 671-697.
Tsai, J.H. (1979) Vector transmission of mycoplasmal agents of plant diseases. In: The Mycoplasmas (Whitcomb, R.F. and Tully, J.G. Ed.), Vol. III, Plant and Insect Mycoplasmas. pp. 265-307. Academic Press, New York, NY.
Uenoyama, A., Kusumoto, A. and Miyata, M. (2004) Identification of a 349-kilodalton protein (Gli349) responsible for cytadherence and glass binding during gliding of Mycoplasma mobile. J. Bacteriol. 186, 1537-1545.
van den Ent, F., Amos, L.A. and Lowe, J. (2001) Prokaryotic origin of the actin cytoskeleton. Nature 413, 39-44.
von Heijne, G. (1985) Signal sequences: The limits of variation. J. Mol. Biol. 184, 99-105.
Walsh, C. and Cepko, C.L. (1992) Widespread dispersion of neuronal clones across functional regions of the cerebral cortex. Science 255, 434-440.
Weisburg, W.G., Tully, J.G., Rose, D.L., Petzel, J.P., Oyaizu, H., Yang, D., Mandelco, L., Sechrest, J., Lawrence, T.G., van Etten, J., Maniloff, J. and Woese, C.R. (1989) A phylogenetic analysis of the mycoplasmas: basis for their classification. J. Bacteriol. 171, 6455-6467.
Westberg, J., Persson, A., Holmberg, A., Goesmann, A., Lundeberg, J., Johansson, K.E., Pettersson, B. and Uhlen, M. (2004) The genome sequence of Mycoplasma mycoides subsp. mycoides SC type strain PG1T, the causative agent of contagious bovine pleuropneumonia (CBPP). Genome Res. 14, 221-227.
Whitcomb, R.F., Chen, T.A., Williamson, D.L., Liao, C., Tully, J.G., Bové, J.M., Mouches, C., Rose, D.L., Coan, M.E. and Clark, T.B. (1986) Spiroplasma kunkelii sp. nov.: characterization of the etiological agent of corn stunt disease. Int. J. Syst. Bacteriol. 36, 170-178.
Williamson, D.L., Brink, P.R. and Zieve, G.W. (1984) Spiroplasma fibrils. Isr. J. Med. Sci. 20, 830-835.
Williamson, D.L., Renaudin, J. and Bové, J.M. (1991) Nucleotide sequence of the Spiroplasma citri fibril protein gene. J. Bacteriol. 173, 4353-4362.

Woese, C.R. (1987) Bacterial evolution. Microbiol. Rev. 51, 221-271.
Wolf, M., Muller, T., Dandekar, T. and Pollack, J.D. (2004) Phylogeny of Firmicutes with special reference to Mycoplasma (Mollicutes) as inferred from phosphoglycerate kinase amino acid sequence data. Int. J. Syst. Evol. Microbiol. 54, 871-875.
Wolgemuth, C.W., Igoshin, O. and Oster, G. (2003) The motility of mollicutes. Biophys. J. 85, 828-842.
Yu, J., Wayadande, A.C. and Fletcher, J. (2000) Spiroplasma citri surface protein P89 implicated in adhesion to cells of the vector Circulifer tenellus. Phytopathology 90, 716-722.
Zhang, Q., Soares de Oliveira, S., Colangeli, R. and Gennaro, M.L. (1997) Binding of a novel host factor to the pT181 replication enhancer. J. Bacteriol. 23, 191-204.
Zhao, Y., Hammond, R.W., Jomantiene, R., Dally, E.L., Lee, I.M., Jia, H., Wu, H., Lin, S., Zhang, P., Kenton, S., Najar, F.Z., Hua, A., Roe, B.A., Fletcher, J. and Davis, R.E. (2003) Gene content and organization of an 85-kb DNA segment from the genome of the phytopathogenic mollicute Spiroplasma kunkelii. Mol. Genet. Genomics 269, 592-602.
Zhao, Y., Hammond, R.W., Lee, I.M., Roe, B.A., Lin, S. and Davis, R.E. (2004a) Cell division gene cluster in Spiroplasma kunkelii: functional characterization of ftsZ and the first report of ftsA in mollicutes. DNA Cell Biol. 23, 127-134.
Zhao, Y., Wang, H., Hammond, R.W., Jomantiene, R., Liu, Q., Lin, S., Roe, B.A. and Davis, R.E. (2004b) Predicted ATP-binding cassette systems in the phytopathogenic mollicute Spiroplasma kunkelii. Mol. Genet. Genomics 271, 325-338.


Organism (a) | Strain | Genome size (bp) | GC content (mol%) | CDs, total | CDs with assigned functions | CDs as hypothetical proteins | Unique hypothetical proteins | tRNAs | rRNA operon | GenBank accession number | Reference
M. genitalium | G-37 | 580,074 | 32 | 470 | n/a | n/a | n/a | 36 | 1 | NC_000908 | Fraser et al., 1995
M. pneumoniae | M129 | 816,394 | 40 | 677 | 333 | 181 | 163 | 37 | 1 | NC_000912 | Himmelreich et al., 1996
U. urealyticum (U. parvum serovar 3) | ATCC 700970 | 751,719 | 25.5 | 613 | 325 | 116 | 172 | 30 | 2 | NC_002162 | Glass et al., 2000
M. pulmonis | UAB CTIP | 963,879 | 26.6 | 784 | 486 | 92 | 204 | 29 | 1 | NC_002771 | Chambaud et al., 2001
M. penetrans | HF-2 | 1,358,633 | 25.7 | 1,038 | n/a | n/a | n/a | 29 | 1 | NC_004432 | Sasaki et al., 2002
M. gallisepticum | R | 996,422 | 31 | 742 | 469 | 150 | 123 | 33 | 2 | NC_004829 | Papazisi et al., 2003
Onion yellows phytoplasma (Ca. Phytoplasma asteris) | OY-M | 860,631 | 28 | 754 | 446 | 51 | 257 | 32 | 2 | NC_005303 | Oshima et al., 2004
M. mycoides subsp. mycoides SC | PG1 | 1,211,703 | 24 | 985 | 59% | 14% | 27% | 30 | 2 | NC_005364 | Westberg et al., 2004
M. mobile | 163K | 777,079 | 24.9 | 635 | n/a | n/a | n/a | 28 | 1 | NC_006908 | Jaffe et al., 2004
Mesoplasma florum | L1 | 793,224 | 27 | 683 | n/a | n/a | n/a | 29 | n/a | NC_006055 | Pending
AY-WB phytoplasma (Ca. Phytoplasma asteris) | AYWB | 706,569 | 27 | 673 | 345 | 112 | 216 | 31 | 1 | Pending | Pending

Table 1.1 Summary of completed mollicute genomes

(a) Mycoplasma was abbreviated as M. and Ureaplasma was abbreviated as U. The species names in parentheses are the new species names. n/a, no information is available.

[Figure 1.1: phylogenetic tree. Taxa shown: Entomoplasma, Mesoplasma, Mycoplasma mycoides, Spiroplasma, Ureaplasma, Mycoplasma pneumoniae, Mycoplasma hominis, Mycoplasma sualvi, Acholeplasma, Ca. Phytoplasma, Anaeroplasma, Asteroleplasma, Clostridium, Bacillus. Group labels: Mollicutes, Gram-positive bacteria, SEM, AAA. Branch annotations: UGA = Trp, WL, GL.]

Fig. 1.1 Phylogeny of mollicutes based on 16S rDNA sequences. Bacillus serves as the outgroup. All mollicutes were derived from Gram-positive bacteria of the Clostridium group and underwent reductive evolution by loss of the cell wall (WL) and some genes (GL) (Woese, 1987; Weisburg et al., 1989; Oshima et al., 2004). Phytoplasmas were assigned the genus name Candidatus (Ca.) Phytoplasma (Lee et al., 2004). Members of the Mycoplasma mycoides cluster group together with Spiroplasma, Entomoplasma, and Mesoplasma, but not with other mycoplasma species (Razin et al., 1998).


CHAPTER 2

A genome sequence survey of the mollicute corn stunt spiroplasma Spiroplasma kunkelii

Xiaodong Bai, Saskia A. Hogenhout

Department of Entomology, The Ohio State University – Ohio Agricultural Research and Development Center (OARDC), Wooster, OH 44691


2.1 Abstract

The mollicute corn stunt spiroplasma (Spiroplasma kunkelii) is a leafhopper-transmitted pathogen of maize. Sequencing of the ~1.6-Mb genome of S. kunkelii was initiated to aid understanding of the genetic basis of spiroplasma interactions with their plant and leafhopper hosts. In total, 144,712 nucleotides of non-redundant, high-quality S. kunkelii genome sequence were obtained. Sequence tags were searched against the Mycoplasmataceae and Bacillus/Clostridium databases. Results showed that, in addition to spiroplasma phage SpV1 DNA insertions, spiroplasma genomes harbor more purine and amino acid biosynthesis, transcription regulation, cell envelope and DNA transport/binding genes than Mycoplasmataceae genomes. This investigation demonstrates that survey sequencing is an efficient procedure for gene discovery and genome characterization. The results of the S. kunkelii sequencing project are available at the Spiroplasma Web Page at http://www.oardc.ohio-state.edu/spiroplasma/genome.htm.


2.2 Introduction

The mollicute Spiroplasma kunkelii is a member of the family Spiroplasmataceae within the order Mycoplasmatales. Spiroplasmas are primarily associated with insects and plants in epiphytic, symbiotic or pathogenic interactions. Three spiroplasma species evolved as plant pathogens: the citrus stubborn spiroplasma Spiroplasma citri, the corn stunt spiroplasma (CSS) S. kunkelii, and the periwinkle yellowing spiroplasma Spiroplasma phoeniceum. Plant-pathogenic spiroplasmas are restricted to the sieve tubes of their plant hosts and are transmitted from plant to plant by phloem-feeding leafhoppers in a persistent propagative manner (Nault, 1980; Purcell, 1982; Markham, 1983). CSS is one of the most important threats to maize. Typical symptoms of CSS infection include chlorosis, stunted plants with reduced internode length and proliferation of ears that do not mature (Nault, 1980).

Mollicutes are thought to have diverged from a Gram-positive Clostridium-like ancestor and differ phenotypically from other bacteria in their minute size (0.3-0.5 µm) and lack of cell wall (Bové and Garnier, 1997; Razin et al., 1998). Genomes of Mollicutes are smaller in size than those of most other prokaryotes as a result of degenerative or reductive evolution. However, gene loss in spiroplasmas was not as extensive as in other members of the class Mollicutes (Bové and Garnier, 1997; Bové, 1997).

Interestingly, the spiroplasma morphology differs from that of other Mollicutes. All members within the genus Spiroplasma have pleomorphic shapes varying from spherical or slightly ovoid, 100-250 nm, to helical fragments that are about 120 nm in diameter and 2-4 µm long during active growth and up to 15 µm in later stages of growth, whereas members of other mollicute genera typically have a spheroidal to ovoid shape and commonly do not have a helical elongated stage in their life cycle.

The genomes of Mycoplasma genitalium, M. pneumoniae, Ureaplasma urealyticum and M. pulmonis of the family Mycoplasmataceae within the order Mycoplasmatales have been sequenced to completion (Fraser et al., 1995; Himmelreich et al., 1996; Glass et al., 2000; Chambaud et al., 2001), and close to seven full genome sequences from the Bacillus/Clostridium group, the walled bacteria most closely related to the class Mollicutes, are available as well. The sample-sequencing project of the S. kunkelii genome and the subsequent comparison of S. kunkelii sequence data with Mycoplasmataceae and Bacillus/Clostridium sequence databases, as described herein, revealed interesting differences in gene content between S. kunkelii and members of the Mycoplasmataceae.

2.3 Materials and methods

2.3.1 Selection of the S. kunkelii strain

The S. kunkelii strain CSS-M was selected for genome sequencing. The strain was originally isolated from infected corn plants in Tlaltizapan, Mexico in 1992 (Ebbert and Nault, 2001). It has been maintained at the Ohio Agricultural Research and Development Center (OARDC) by serial transfers with the CSS leafhopper vector (Dalbulus maidis) as described by Ebbert and Nault (1994). CSS-M was isolated from infected corn stems, propagated in liquid LD8A3 medium, and plated onto LD8A3 agar plates (Lee and Davis, 1989). A culture derived from a single colony was used for genomic DNA isolation. The S. kunkelii clone was transmitted to maize seedlings (Zea mays L. ‘Early Sunglow’) by D. maidis (Nault, 1980), indicating that the clone retained its capacity for leafhopper transmission and plant pathogenesis.

2.3.2 Construction of S. kunkelii genomic DNA libraries

Genomic DNA was isolated from S. kunkelii using the Qiagen (Valencia, CA, USA) Whole Genomic DNA isolation kit following the manufacturer’s procedures. The isolated genomic DNA was used for library construction. The DNA was digested to completion with EcoRI or HindIII, ligated into appropriately digested, phosphatase-treated pUC18, and transformed into electro-competent XL-blue Escherichia coli (Stratagene, La Jolla, CA, USA) cells. For construction of a random sheared DNA library, DNA was fragmented with the Hydroshear (GeneMachines) into pieces with a size distribution centered on 1.5 kb. Blunt-ended DNA was then ligated into pPCR-Script Amp Sk(+) plasmid (Stratagene), and plasmid DNA was introduced into chemically competent XL10-Gold Kan E. coli following the manufacturer’s procedures (Stratagene). Insert-carrying plasmids were identified in transformants by detecting white colonies after growth on X-Gal/IPTG (Sambrook et al., 1989).

2.3.3 Sequencing and sequence analysis

Colonies were grown overnight at 37°C in single wells of 96-well microtiter plates containing 150 µl LB freeze (4 mM MgSO4, 360 mM K2HPO4, 132 mM KH2PO4, 17 mM Na-citrate, 68 mM (NH4)2SO4, 4.4% glycerol in LB, pH 7.0) and 100 µg/ml ampicillin and transferred to LB agar plates containing 100 µg/ml ampicillin after 18 h using a 96-well plate replicator. The inoculated agar plates were sent to MWG-Biotech (High Point, NC, USA) for one-pass sequencing of the inserts using the M13 forward and reverse primers for pUC18, and T7 and T3 primers for pPCR-Script Amp Sk(+) plasmids on an ABI377 automatic sequencer. Trace files were analyzed with the PHRED and CROSS_MATCH algorithms of MacPhred/Phrap (Ewing et al., 1998; Ewing and Green, 1998) to assign a quality score to each base call in the ABI377 chromatogram data and to detect plasmid sequences, respectively. Plasmid sequences were removed from each sequence tag, and high-quality sequence data (phred score ≥ 20) were collected into a database and searched against the non-redundant (nr) database at the National Center for Biotechnology Information (NCBI) using the nucleotide-nucleotide BLAST (blastn) or the translating BLAST (blastx) algorithms (Altschul et al., 1990). To screen for redundant sequence tags, the S. kunkelii sequence database was also searched against itself with the blastn algorithm. Nucleotide sequences with significant similarities (E-value ≤ 10^-5) to sequences in the NCBI database were collected, translated into proteins and searched against the full non-redundant protein database and the non-redundant databases of Mycoplasmataceae and Bacillus/Clostridium at NCBI with the protein-protein BLAST (blastp) algorithm. All sequence analyses were performed on local Linux workstations.
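The phred score threshold used above can be made concrete with a short sketch. By definition, a phred score Q corresponds to a base-calling error probability P = 10^(-Q/10), so a score of 20 means an estimated 1 error in 100 base calls. The Python below is only an illustration of that relationship and of a simple threshold-based trimming step; the function names and the toy read are hypothetical, and this is not the trimming procedure implemented in MacPhred/Phrap.

```python
def phred_to_error_prob(q):
    """Phred definition: Q = -10 * log10(P), hence P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def longest_high_quality_run(bases, quals, min_q=20):
    """Return the longest contiguous stretch of bases with quality >= min_q.

    Hypothetical helper for illustration; real trimming tools use more
    elaborate windowed or cumulative-error criteria.
    """
    best_start, best_len = 0, 0
    start = None  # start index of the current high-quality run, if any
    for i, q in enumerate(quals):
        if q >= min_q:
            if start is None:
                start = i
            if i - start + 1 > best_len:
                best_start, best_len = start, i - start + 1
        else:
            start = None
    return bases[best_start:best_start + best_len]


# Toy read with made-up quality values (not data from this study).
read = "ACGTTGCAAGGCT"
quals = [8, 12, 25, 30, 34, 19, 40, 41, 38, 35, 22, 15, 9]

print(phred_to_error_prob(20))                 # error probability at Q20
print(longest_high_quality_run(read, quals))   # longest Q>=20 stretch
```

A cutoff of 20 is a common compromise: it discards the low-confidence ends of single-pass reads while retaining most of the usable sequence.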


2.4 Results and discussion

2.4.1 Library construction, sequencing and sequence analysis

To confirm the identity of the isolated DNA, the spiralin gene was amplified using primers described by Foissac et al. (1997). The nucleotide sequence of the spiralin gene amplification product was identical to that reported earlier (Foissac et al., 1997), thus confirming the identity of the CSS-M clones of S. kunkelii (data not shown). Insert sizes of clones from the shotgun library ranged from 0.5 to 4 kb, and clones from the EcoRI or HindIII libraries contained fragment sizes ranging from 150 bp to 10 kb. In total, 94 inserts from the EcoRI and HindIII libraries and 188 inserts from the sheared DNA library were sequenced from flanking primer sites, after which the sequences were collected into a database. Low-quality and cloning vector sequences were removed from the database. The database was then searched against itself with the blastn algorithm to analyze redundancy.

Mollicute genome sequences show the presence of highly repeated regions, and spiroplasma genomes have many copies of spiroplasma phage SpV1 DNA (Bébéar et al., 1996). Therefore, redundant sequence tags were not assembled into contigs. Instead, within each set of redundant clones, the forward and the reverse sequence tags with the best phred quality scores were kept in the database, whereas the others were removed. This resulted in a database of 144,712 nucleotides (396 sequence tags) of non-redundant high-quality (phred score ≥ 20) S. kunkelii genome sequences representing 9% of the S. kunkelii genome, based on an estimated genome size of 1,600 kb (Bové, 1997). All sequences were deposited in the random single pass read genome survey sequence database (dbGSS) of GenBank (Accession Nos. BH234783 to BH235178).

The 396 sequence tags were searched against the complete NCBI nr database with the blastn and blastx algorithms. The overall percentage of sequence tags with significant similarity (E-value ≤ 10^-5) to open reading frames (ORFs) in the NCBI nr database was ~38% (150/396), which is broadly in agreement with previous findings that biological functions can be assigned to ~50% of the ORFs in completed genome sequencing projects (Simpson et al., 2000).
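The two summary figures in this section (genome coverage and the fraction of tags with a significant database hit) follow from simple arithmetic once the significant hits have been identified. The sketch below shows one way such a tally could be made. It assumes, purely for illustration, that BLAST results are available as tab-delimited text with the query identifier in the first column and the E-value in the eleventh; the tag and subject names are invented, and this is not a record of how the original analysis was scripted.

```python
E_VALUE_CUTOFF = 1e-5  # significance threshold stated in the text


def significant_queries(tabular_lines, cutoff=E_VALUE_CUTOFF):
    """Return the set of query IDs with at least one hit at E-value <= cutoff.

    Assumes 12 tab-separated columns per line, query ID in column 1 and
    E-value in column 11 (an assumed layout, see lead-in).
    """
    hits = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        query_id, e_value = fields[0], float(fields[10])
        if e_value <= cutoff:
            hits.add(query_id)
    return hits


def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


# Invented example rows: tag_002 has only a non-significant hit.
toy_hits = [
    "tag_001\tsubj_A\t78.5\t120\t20\t2\t1\t120\t300\t419\t2e-35\t150",
    "tag_001\tsubj_B\t70.1\t110\t30\t1\t5\t114\t10\t119\t3e-12\t72",
    "tag_002\tsubj_C\t55.0\t40\t18\t0\t1\t40\t7\t46\t0.5\t25",
    "tag_003\tsubj_D\t90.0\t200\t20\t0\t1\t200\t1\t200\t1e-5\t95",
]
print(sorted(significant_queries(toy_hits)))

# Figures reported in the text.
print(round(percent(150, 396), 1))            # tags with significant hits
print(round(percent(144_712, 1_600_000), 1))  # share of the ~1,600-kb genome
```

Counting each tag once, regardless of how many subjects it matches, is what makes the numerator comparable to the total of 396 tags.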

2.4.2 DNA phage sequences

Unlike the Mycoplasmataceae, spiroplasma genomes harbor many spiroplasma phage SpV1 DNA insertions (Ye et al., 1994; Ye et al., 1995; Fraser et al., 1995; Bébéar et al., 1996; Himmelreich et al., 1996; Glass et al., 2000; Chambaud et al., 2001). In this survey, ~4% (17/396) of the tags in the S. kunkelii sequence tag database had significant similarity to spiroplasma virus SpV1 DNA (Table 2.1). The percentage of phage sequences in the S. kunkelii sequence tag database is comparable to the 7% DNA phage sequences found in the genome of the Gram-negative leafhopper-transmitted vascular plant pathogen, Xylella fastidiosa (Simpson et al., 2000).

2.4.3 Spiroplasma-specific sequences

In total, 133 sequence tags had significant similarities to prokaryotic and/or eukaryotic protein sequences in the NCBI nr database. Included were four sequences unique to spiroplasmas with similarity to putative S. citri virulence genes encoding P123, P58, P54, or P18 (Table 2.1) (Ye et al., 1996; Fletcher et al., 1998). These genes are part of a 9.5-kb S. citri genome segment that is deleted from a non-transmissible line of S. citri.

2.4.4 Comparative genome analysis

As a preliminary assessment of the extent to which the S. kunkelii genome content differs from those of Mycoplasmataceae species, sequence tags were translated into proteins to ensure that the sequences were part of ORFs and, subsequently, the protein sequences were searched against the Mycoplasmataceae, Bacillus/Clostridium and complete nr protein databases of GenBank (Table 2.2). The Mycoplasmataceae database was selected because it contains the full genome sequences of three mycoplasma species and one ureaplasma species (Fraser et al., 1995; Himmelreich et al., 1996; Glass et al., 2000; Chambaud et al., 2001), whereas the Bacillus/Clostridium database was selected because it contains many completed genome sequences and Bacillus/Clostridium species are thought to be the closest walled relatives of Mollicutes (Razin, 1994; Bové and Garnier, 1997). Interesting gene content differences among S. kunkelii, Mycoplasmataceae and Bacillus/Clostridium species are discussed below.
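Translating a sequence tag to check that it lies within an ORF can be sketched as a six-frame translation followed by a search for the longest stop-free stretch. One detail matters for spiroplasmas: like mycoplasmas, they read UGA as tryptophan rather than as a stop codon (NCBI translation table 4). The Python below is a minimal illustration under that assumption; the source does not state which translation table or software was used, and the function names and test sequence are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# NCBI translation table 4: identical to the standard code except TGA = Trp (W).
AMINO_ACIDS_TABLE4 = (
    "FFLLSSSSYY**CCWW"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS_TABLE4)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from position 0; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to translations."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


def longest_open_stretch(seq):
    """Return (frame, peptide) for the longest stop-free stretch in any frame."""
    best = ("", "")
    for frame, protein in six_frame_translations(seq).items():
        for piece in protein.split("*"):
            if len(piece) > len(best[1]):
                best = (frame, piece)
    return best


# Made-up 12-nt example: under the standard code the TGA codon would
# terminate frame +1 after the first residue; under table 4 it encodes Trp.
print(translate("ATGTGAAAATAA"))
print(longest_open_stretch("ATGTGAAAATAA"))
```

Using the standard genetic code instead would split many genuine spiroplasma ORFs at internal UGA codons, which is why the choice of translation table is worth stating explicitly in any such analysis.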

2.4.5 Amino acid, purine, pyrimidine, nucleoside and nucleotide metabolism

Mycoplasmataceae species lack most genes involved in de novo biosynthesis of pyrimidines, purines and amino acids (Fraser et al., 1995; Himmelreich et al., 1996; Glass et al., 2000; Chambaud et al., 2001). However, in contrast to mycoplasmas and U. urealyticum, the S. kunkelii genome seems to harbor the nucleotide and/or amino acid biosynthesis genes encoding adenylosuccinate lyase, adenylosuccinate synthase, GMP synthase, deoxyguanosine kinase, and folylpolyglutamate synthase/dihydrofolate synthetase (folC) (Table 2.2). Adenylosuccinate lyase is a tetrameric enzyme involved in de novo synthesis of inosine monophosphate (IMP) and adenosine monophosphate (AMP) (Mantsala and Zalkin, 1992), adenylosuccinate synthase catalyzes the first step in de novo biosynthesis of AMP (Honzatko and Fromm, 1999), and guanosine monophosphate (GMP) synthase catalyzes the last step in the conversion of IMP into GMP (Mantsala and Zalkin, 1992). Deoxyadenosine/deoxyguanosine kinase and deoxyadenosine/deoxycytidine kinase are required, together with thymidine kinase, for deoxynucleotide synthesis in Lactobacillus acidophilus (Ma et al., 1995). Interestingly, the deoxyguanosine kinase gene is present in the mollicute Mycoplasma mycoides. Within the order Mycoplasmatales, M. mycoides belongs to the Entomoplasmataceae, a family more closely related to the Spiroplasmataceae than to the Mycoplasmataceae (Bové and Garnier, 1997). The folC gene product is essential for production of glycine, methionine, purine and thymidine (Singer et al., 1985). These data suggest that S. kunkelii can synthesize more amino acids and nucleotides de novo than Mycoplasmataceae species can, which is in agreement with experimental evidence that spiroplasma culturing media are less complex than those of the culturable mycoplasmas (Lee and Davis, 1989; Razin, 1994).


2.4.6 Cell envelope

The sequence data indicate that S. kunkelii harbors at least two cell envelope biosynthesis genes that are absent from members of the Mycoplasmataceae. The gcpE gene is involved in the acetylation of peptidoglycans and isoprenoid biosynthesis and is broadly distributed in eubacteria and plants (Rather et al., 1997; Campos et al., 2001). MreB is a cytoskeletal protein and forms a filamentous helical structure close to the cell surface of eubacteria, and has an actin-like role in bacterial cell morphogenesis (Jones et al., 2001). The clear morphological differences between spiroplasmas and Mycoplasmataceae and our finding that the mreB gene is absent from Mycoplasmataceae genomes but present in S. kunkelii suggest that MreB may have a critical role in the unique helical cell structure of spiroplasmas.

2.4.7 Regulatory functions

Our sequence data show that the S. kunkelii regulatory mechanisms are more complex than those of the Mycoplasmataceae. Three genes encoding the regulatory proteins NifR3, SinR and PNPase were identified; these genes are absent from the three sequenced mycoplasmas and U. urealyticum but present in Firmicutes. NifR3 is important for the regulation of the dormant and vegetative cell stages of the ciliate Sterkiella histriomuscorum (Tourancheau et al., 1999). The function of NifR3 in bacteria is not known. SinR is involved in the transition from a vegetative stage to sporulation in Bacillus subtilis in response to nutrient depletion (Gaur et al., 1991). Spiroplasmas do not make spores, but are extremely pleomorphic. It is tempting to speculate that NifR3 and SinR may be involved in S. kunkelii cell shape regulation as a response to nutrient availability. A third regulatory protein, polynucleotide phosphorylase (PNPase), is responsible for mRNA decay, translation activation and transcript stabilization in B. subtilis (Wang and Bechhofer, 1996; Oussenko and Bechhofer, 2000). The loss of PNPase is lethal for E. coli, but affects only competence development in B. subtilis (Donovan and Kushner, 1986; Luttinger et al., 1996) and may affect competence of S. kunkelii as well. The discovery of these regulatory factors in S. kunkelii is surprising as, thus far, members of the Mycoplasmataceae are known to lack major regulators of gene expression (Fraser et al., 1995; Himmelreich et al., 1996; Himmelreich et al., 1997; Weiner et al., 2000).

2.4.8 Replication

One surprising finding was that the DNA polymerase I protein of S. kunkelii did not match the DNA polymerases of Mycoplasmataceae, whereas it had significant similarity to the DNA polymerase I proteins of Streptococcus species (E-values: 2e-35 and 2e-32, sequence tag MEAA_B05.y, Table 2.2). Closer analysis revealed that the 193-amino-acid sequence tag of S. kunkelii was similar to the C-terminal polymerase domain of DNA polymerase I. In contrast, the putative DNA polymerase I proteins of M. genitalium (GenBank Accession No. I64228), M. pneumoniae (S73784), U. urealyticum (C82895) and M. pulmonis (CAC13893) are ~300 amino acids in size and consist of the N-terminal 5’-3’ exonuclease (proofreading) part but lack the C-terminal 3’-5’ exonuclease and polymerase domains (Klenow fragment) of the enzyme (Chambaud et al., 2001). This finding suggests that, unlike mycoplasmas and U. urealyticum, the S. kunkelii polA gene may encode the full-length DNA polymerase I protein, including the proofreading and Klenow domains, similar to that of Streptococcus pneumoniae (Lopez et al., 1989).

2.4.9 Transport and binding proteins

In contrast to Mycoplasmataceae, the S. kunkelii genome harbors at least one copy of a traK homologue. S. kunkelii traK has the highest similarity to traK of the B. anthracis virulence plasmid pX02.09 (Table 2.2) (Okinaka et al., 1999). Members of this conserved protein family bind DNA and couple plasmids to membrane proteins for transport to the mating cell, and/or are pathogenicity factors involved in transport of virulence factors to the extracellular environment of bacteria (Errington et al., 2001; Christie and Vogel, 2000). The function of the S. kunkelii TraK protein remains to be investigated.

Two S. kunkelii sequence tags (MSAC_C02.x and MSAD_C02.y, Table 2.2) harbor sequences similar to fructose permease of the phosphoenolpyruvate:fructose phosphotransferase system (fructose PTS). Mutagenesis of the operon encoding fructose PTS proteins in another leafhopper-transmitted plant-pathogenic spiroplasma, S. citri, significantly decreases plant pathogenicity (Gaurivaud et al., 2000). The most likely explanation is that utilization of fructose in the plant sieve tubes by S. citri may interfere with the normal physiology of the plant, causing chlorosis, stunting and wilting (Gaurivaud et al., 2000). This may be true for S. kunkelii in sieve tubes of corn plants as well. Homologues of fructose PTS proteins were also identified in Mycoplasmataceae and other Firmicutes (Table 2.2).


2.4.10 Genes in other categories

The S. kunkelii genome harbors at least one copy of a spoIIIE homologue that is not found in the Mycoplasmataceae genomes sequenced so far (sequence tag MEAA_A06.y, Table 2.2). In B. subtilis, the spoIIIE gene product is involved in the coordination of chromosome segregation and clearing DNA from the site of division during septum formation (Bath et al., 2000) and, therefore, is likely to be involved in S. kunkelii cell division.

A nifU-like gene of 228 nucleotides in length was identified in this sequencing project and harbors solely the C-terminal conserved domain containing two conserved cysteines, whereas functional iron-sulfur cluster-binding NifU proteins contain additional middle domains with four conserved cysteines (sequence tag MEAA_D11.x, Table 2.2) (Ouzounis et al., 1994; Nishio and Nakai, 2000; Agar et al., 2000). Several smaller nifU-like genes are also found in the nitrogen-fixing Rhodobacter and Azotobacter species, and single-gene mutagenesis studies show that they are not essential for survival or nitrogen fixation of bacteria (Masepohl et al., 1993). The functions of these shorter nifU-like genes are not known.

A sequence similar to the oxygen-insensitive NAD(P)H nitroreductase was found in the S. kunkelii database (sequence tag MSAD_E03.x, Table 2.2). This enzyme catalyzes the reduction of a variety of nitroaromatic compounds to highly toxic metabolites (Bryant and DeLuca, 1991). Although absent from the mycoplasmas and U. urealyticum genomes, it is found in the small (~650 kb) genome of the insect-vectored apple proliferation phytoplasma (gi405516) (Jarausch et al., 2000). It is noteworthy that phytoplasmas are the only other group of Mollicutes that infect plants, causing characteristic chlorosis and stunting symptoms.

Two sequence tags have identity to the 20 kDa PsaD thiol peroxidase proteins of Streptococcus species (Kolenbrander et al., 1994; Novak et al., 1998). Tag MHAA_A09.x contains the N-terminal part of this protein, whereas MHAA_D09.x harbors the C-terminal end. In S. pneumoniae, the psaD gene is located downstream from the psa locus with the psaA, psaB and psaC genes encoding an ABC-type Mn permease complex (Novak et al., 1998). Mutagenesis of each of the four psa genes resulted in penicillin tolerance, defective adhesion and reduced transformation efficiency of S. pneumoniae (Novak et al., 1998). The psaA gene encodes an adhesin-like surface protein, and psaA- and psaD-related genes were identified in Streptococcus sanguis, Streptococcus parasanguis and Streptococcus gordonii (Kolenbrander et al., 1994).

Several sequence tags have identity to conserved hypothetical proteins that are lacking from the mycoplasmas and U. urealyticum genomes sequenced thus far (Other categories, Table 2.2). We found only one sequence tag with identity to Mollicute sequences but not to those of the Bacillus/Clostridium group (sequence tag PH_05.y, Table 2.2). The deduced protein sequence of this tag is a homologue of a hypothetical protein encoded by a gene in the region downstream of the fibril gene of S. citri (Williamson et al., 1991). The fibril protein is important for the helical cell shape and motility of spiroplasmas (Trachtenberg, 1998; Trachtenberg and Gilad, 2001), and the gene encoding it is lacking from the genomes of the oval-shaped mycoplasmas and U. urealyticum (Fraser et al., 1995; Himmelreich et al., 1996; Glass et al., 2000; Chambaud et al., 2001). Because the hypothetical protein gene is localized near the fibril protein gene (Williamson et al., 1991) and is unique to Mollicutes (Table 2.2), this hypothetical protein may be an important constituent of the mollicute cytoskeleton.

2.4.11 Ribosomal RNA genes

Clones MEAA_E09 and MHAA_F02 contained parts of the 16S and 23S ribosomal RNA (rRNA) genes and the 16S-23S internal spacer. These sequences have closest similarity to rRNA gene regions from S. citri, as expected from the phylogenetic position of S. kunkelii (Bové and Garnier, 1997) (Table 2.3). S. kunkelii rRNA genes have not been sequenced previously.

2.4.12 Conclusions

In summary, our data show that, in addition to the large number of spiroplasma phage DNA insertions, S. kunkelii harbors more amino acid and nucleotide biosynthesis, transcription regulation, cell envelope and DNA transport/binding genes than the genomes of the Mycoplasmataceae species do. Our data also demonstrate that genome comparisons among Mollicutes are extremely informative because of their small genome sizes, broad host range, differences in morphology, and well-defined biology. In addition to the completed genome sequences of four Mycoplasmataceae species, several genome sequencing projects of Mollicutes in other families are ongoing, including those of M. mycoides and M. capricolum in the family Entomoplasmataceae (http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/bact.html), S. kunkelii (http://www.genome.ou.edu/spiro.html) and S. citri (http://www.cwu.edu/~verheys/s.citri/). Genome comparisons of species within a family, among families within the class Mollicutes, and between Mollicutes and Firmicutes should prove extremely valuable.

2.5 Acknowledgments

The authors thank Dr. Margareth Redinbaugh for carefully reading the manuscript and Dr. Robert Davis for help with establishing S. kunkelii in vitro cultures at the OARDC. This research was funded by the OARDC research enhancement and competitive grants program.

2.6 References

Agar, J.N., Yuvaniyama, P., Jack, R.F., Cash, V.L., Smith, A.D., Dean, D.R. and Johnson, M.K. (2000) Modular organization and identification of a mononuclear iron-binding site within the NifU protein. J. Biol. Inorg. Chem. 5, 167-177.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403-410.
Bath, J., Wu, L.J., Errington, J. and Wang, J.C. (2000) Role of Bacillus subtilis SpoIIIE in DNA transport across the mother cell prespore division septum. Science 290, 995-997.
Bébéar, C.-M., Aullo, P., Bové, J.M. and Renaudin, J. (1996) Spiroplasma citri virus SpV1: characterization of viral sequences present in the spiroplasmal host chromosome. Curr. Microbiol. 32, 134-140.
Bové, J.M. (1997) Spiroplasmas: infectious agents of plants, arthropods and vertebrates. Wien. Klin. Wochenschr. 109, 604-612.
Bové, J.M. and Garnier, M. (1997) In: Developments in Plant Pathology, Pathogen and Microbial Contamination Management in Micropropagation, Vol. 12 (Cassels, A.C., Ed.), pp. 45-60. Kluwer Academic Publishers, Dordrecht.
Bryant, C. and DeLuca, M. (1991) Purification and characterization of an oxygen-insensitive NAD(P)H nitroreductase from Enterobacter cloacae. J. Biol. Chem. 266, 4119-4125.
Campos, N., Rodriguez-Concepcion, M., Seemann, M., Rohmer, M. and Boronat, A. (2001) Identification of gcpE as a novel gene of the 2-C-methyl-D-erythritol 4-phosphate pathway for isoprenoid biosynthesis in Escherichia coli. FEBS Lett. 488, 170-173.
Chambaud, I., Heilig, R., Ferris, S., Barbe, V., Samson, D., Galisson, F., Moszer, I., Dybvig, K., Wroblewski, H., Viari, A., Rocha, E.P. and Blanchard, A. (2001) The complete genome sequence of the murine respiratory pathogen Mycoplasma pulmonis. Nucleic Acids Res. 29, 2145-2153.
Christie, P.J. and Vogel, J.P. (2000) Bacterial type IV secretion: conjugation systems adapted to deliver effector molecules to host cells. Trends Microbiol. 8, 354-360.
Donovan, W.P. and Kushner, S.R. (1986) Polynucleotide phosphorylase and ribonuclease II are required for cell viability and mRNA turnover in Escherichia coli K-12. Proc. Natl. Acad. Sci. USA 83, 120-124.
Ebbert, M.A. and Nault, L.R. (1994) Improved overwintering ability in Dalbulus maidis (Homoptera: Cicadellidae) vectors infected with Spiroplasma kunkelii (Mycoplasmatales: Spiroplasmataceae). Environ. Entomol. 23, 634-644.
Ebbert, M.A. and Nault, L.R. (2001) Survival in Dalbulus leafhopper vectors improves after exposure to maize stunting pathogens. Entomol. Exp. Appl. 100, 311-324.
Errington, J., Bath, J. and Wu, L.J. (2001) DNA transport in bacteria. Nat. Rev. Mol. Cell. Biol. 2, 538-545.
Ewing, B. and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8, 186-194.
Ewing, B., Hillier, L., Wendl, M.C. and Green, P. (1998) Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8, 175-185.
Fletcher, J., Wayadande, A., Melcher, U. and Ye, F. (1998) The phytopathogenic mollicute-insect vector interface: a closer look. Phytopathology 88, 1351-1358.
Fraser, C.M., Gocayne, J.D., White, O., Adams, M.D., Clayton, R.A., Fleischmann, R.D., Bult, C.J., Kerlavage, A.R., Sutton, G., Kelley, J.M., Fritchman, J.L., Weidman, J.F., Small, K.V., Sandusky, M., Fuhrmann, J.L., Nguyen, D.T., Utterback, T., Saudek, D.M., Phillips, C.A., Merrick, J.M., Tomb, J., Dougherty, B.A., Bott, K.F., Hu, P.C., Lucier, T.S., Peterson, S.N., Smith, H.O. and Venter, J.C. (1995) The minimal gene complement of Mycoplasma genitalium. Science 270, 397-403.
Foissac, X., Bové, J.M. and Saillard, C. (1997) Sequence analysis of Spiroplasma phoeniceum and Spiroplasma kunkelii spiralin genes and comparison with other spiralin genes. Curr. Microbiol. 35, 240-243.
Gaur, N.K., Oppenheim, J. and Smith, I. (1991) The Bacillus subtilis sin gene, a regulator of alternate developmental processes, codes for a DNA-binding protein. J. Bacteriol. 173, 678-686.
Gaurivaud, P., Danet, J.L., Laigret, F., Garnier, M. and Bové, J.M. (2000) Fructose utilization and phytopathogenicity of Spiroplasma citri. Mol. Plant-Microbe Interact. 13, 1145-1155.
Glass, J.I., Lefkowitz, E.J., Glass, J.S., Heiner, C.R., Chen, E.Y. and Cassell, G.H. (2000) The complete sequence of the mucosal pathogen Ureaplasma urealyticum. Nature 407, 757-762.
Himmelreich, R., Hilbert, H., Plagens, H., Pirkl, E., Li, B.C. and Herrmann, R. (1996) Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res. 24, 4420-4449.
Himmelreich, R., Plagens, H., Hilbert, H., Reiner, B. and Herrmann, R. (1997) Comparative analysis of the genomes of the bacteria Mycoplasma pneumoniae and Mycoplasma genitalium. Nucleic Acids Res. 25, 701-712.
Honzatko, R.B. and Fromm, H.J. (1999) Structure-function studies of adenylosuccinate synthase from Escherichia coli. Arch. Biochem. Biophys. 370, 1-8.
Jarausch, W., Saillard, C., Helliot, B., Garnier, M. and Dosba, F. (2000) Genetic variability of apple proliferation phytoplasmas as determined by PCR-RFLP and sequencing of a non-ribosomal fragment. Mol. Cell. Probes 14, 17-24.
Jones, L.J., Carballido-Lopez, R. and Errington, J. (2001) Control of cell shape in bacteria: helical, actin-like filaments in Bacillus subtilis. Cell 104, 913-922.
Kolenbrander, P.E., Andersen, R.N. and Ganeshkumar, N. (1994) Nucleotide sequence of the Streptococcus gordonii PK488 coaggregation adhesin gene, scaA, and ATP-binding cassette. Infect. Immun. 62, 4469-4480.
Lee, I.-M. and Davis, R.E. (1989) Serum-free media for cultivation of spiroplasmas. Can. J. Microbiol. 35, 1092-1099.
Lopez, P., Martinez, S., Diaz, A., Espinosa, M. and Lacks, S.A. (1989) Characterization of the polA gene of Streptococcus pneumoniae and comparison of the DNA polymerase I it encodes to homologous enzymes from Escherichia coli and phage T7. J. Biol. Chem. 264, 4255-4263.
Luttinger, A., Hahn, J. and Dubnau, D. (1996) Polynucleotide phosphorylase is necessary for competence development in Bacillus subtilis. Mol. Microbiol. 19, 343-356.
Ma, G.T., Hong, Y.S. and Ives, D.H. (1995) Cloning and expression of the heterodimeric deoxyguanosine kinase/deoxyadenosine kinase of Lactobacillus acidophilus R-26. J. Biol. Chem. 270, 6595-6601.
Mantsala, P. and Zalkin, H. (1992) Cloning and sequence of Bacillus subtilis purA and guaA, involved in the conversion of IMP to AMP and GMP. J. Bacteriol. 174, 1883-1890.
Markham, P.G. (1983) Spiroplasmas in leafhoppers: a review. Yale J. Biol. Med. 56, 745-751.
Masepohl, B., Angermuller, S., Hennecke, S., Hubner, P., Moreno-Vivian, C. and Klipp, W. (1993) Nucleotide sequence and genetic analysis of the Rhodobacter capsulatus ORF6-nifUISVW gene region: possible role of NifW in homocitrate processing. Mol. Gen. Genet. 238, 369-382.
Nault, L.R. (1980) Maize bushy stunt and corn stunt: a comparison of disease symptoms, pathogen host ranges and vectors. Phytopathology 70, 709-712.
Nishio, K. and Nakai, M. (2000) Transfer of iron-sulfur cluster from NifU to apoferredoxin. J. Biol. Chem. 275, 22615-22618.
Novak, R., Braun, J.S., Charpentier, E. and Tuomanen, E. (1998) Penicillin tolerance genes of Streptococcus pneumoniae: the ABC-type manganese permease complex Psa. Mol. Microbiol. 29, 1285-1296.
Okinaka, R., Cloud, K., Hampton, O., Hoffmaster, A., Hill, K., Keim, P., Koehler, T., Lamke, G., Kumano, S., Manter, D., Martinez, Y., Ricke, D., Svensson, R. and Jackson, P. (1999) Sequence, assembly and analysis of pX01 and pX02. J. Appl. Microbiol. 87, 261-262.
Oussenko, I.A. and Bechhofer, D.H. (2000) The yvaJ gene of Bacillus subtilis encodes a 3′-to-5′ exoribonuclease and is not essential in a strain lacking polynucleotide phosphorylase. J. Bacteriol. 182, 2639-2642.
Ouzounis, C., Bork, P. and Sander, C. (1994) The modular structure of NifU proteins. Trends Biochem. Sci. 19, 199-200.
Purcell, A.H. (1982) Insect vector relationship with procaryotic plant pathogens. Annu. Rev. Phytopathol. 20, 397-417.
Rather, P.N., Solinsky, K.A., Paradise, M.R. and Parojcic, M.M. (1997) aarC, an essential gene involved in density-dependent regulation of the 2′-N-acetyltransferase in Providencia stuartii. J. Bacteriol. 179, 2267-2273.
Razin, S. (1994) DNA probes and PCR in diagnosis of mycoplasma infections. Mol. Cell. Probes 8, 497-511.
Razin, S., Yogev, D. and Naot, Y. (1998) Molecular biology and pathogenicity of mycoplasmas. Microbiol. Mol. Biol. Rev. 62, 1094-1156.
Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) Molecular Cloning: a Laboratory Manual, 2nd edn. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Simpson, A.J., Reinach, F.C., Arruda, P., Abreu, F.A., Acencio, M., Alvarenga, R., Alves, L.M., Araya, J.E., Baia, G.S., Baptista, C.S., Barros, M.H., Bonaccorsi, E.D., Bordin, S., Bové, J.M., Briones, M.R., Bueno, M.R., Camargo, A.A., Camargo, L.E., Carraro, D.M., Carrer, H., Colauto, N.B., Colombo, C., Costa, F.F., Costa, M.C., Costa-Neto, C.M., Coutinho, L.L., Cristofani, M., Dias-Neto, E., Docena, C., El-Dorry, H., Facincani, A.P., Ferreira, A.J., Ferreira, V.C., Ferro, J.A., Fraga, J.S., Franca, S.C., Franco, M.C., Frohme, M., Furlan, L.R., Garnier, M., Goldman, G.H., Goldman, M.H., Gomes, S.L., Gruber, A., Ho, P.L., Hoheisel, J.D., Junqueira, M.L., Kemper, E.L., Kitajima, J.P., Krieger, J.E., Kuramae, E.E., Laigret, F., Lambais, M.R., Leite, L.C., Lemos, E.G., Lemos, M.V., Lopes, S.A., Lopes, C.R., Machado, J.A., Machado, M.A., Madeira, A.M., Madeira, H.M., Marino, C.L., Marques, M.V., Martins, E.A., Martins, E.M., Matsukuma, A.Y., Menck, C.F., Miracca, E.C., Miyaki, C.Y., Monteriro-Vitorello, C.B., Moon, D.H., Nagai, M.A., Nascimento, A.L., Netto, L.E., Nhani Jr., A., Nobrega, F.G., Nunes, L.R., Oliveira, M.A., de Oliveira, M.C., de Oliveira, R.C., Palmieri, D.A., Paris, A., Peixoto, B.R., Pereira, G.A., Pereira Jr., H.A., Pesquero, J.B., Quaggio, R.B., Roberto, P.G., Rodrigues, V., de, M.R.A.J., de Rosa Jr., V.E., de Sa, R.G., Santelli, R.V., Sawasaki, H.E., da Silva, A.C., da Silva, A.M., da Silva, F.R., da Silva Jr., W.A., da Silveira, J.F., Silvestri, M.L., Siqueira, W.J., de Souza, A.A., de Souza, A.P., Terenzi, M.F., Truffi, D., Tsai, S.M., Tsuhako, M.H., Vallada, H., Van Sluys, M.A., Verjovski-Almeida, S., Vettore, A.L., Zago, M.A., Zatz, M., Meidanis, J. and Setubal, J.C. (2000) The genome sequence of the plant pathogen Xylella fastidiosa. Nature 406, 151-157.
Singer, S., Ferone, R., Walton, L. and Elwell, L. (1985) Isolation of a dihydrofolate reductase-deficient mutant of Escherichia coli. J. Bacteriol. 164, 470-472.
Tourancheau, A.B., Morin, L., Yang, T. and Perasso, R. (1999) Messenger RNA in dormant cells of Sterkiella histriomuscorum (Oxytrichidae): identification of putative regulatory gene transcripts. Protist 150, 137-147.
Trachtenberg, S. (1998) Mollicutes - wall-less bacteria with internal cytoskeletons. J. Struct. Biol. 124, 244-256.
Trachtenberg, S. and Gilad, R. (2001) A bacterial linear motor: cellular and molecular organization of the contractile cytoskeleton of the helical bacterium Spiroplasma melliferum BC3. Mol. Microbiol. 41, 827-848.
Wang, W. and Bechhofer, D.H. (1996) Properties of a Bacillus subtilis polynucleotide phosphorylase deletion strain. J. Bacteriol. 178, 2375-2382.
Weiner III, J., Herrmann, R. and Browning, G.F. (2000) Transcription in Mycoplasma pneumoniae. Nucleic Acids Res. 28, 4488-4496.
Williamson, D.L., Renaudin, J. and Bové, J.M. (1991) Nucleotide sequence of the Spiroplasma citri fibril protein gene. J. Bacteriol. 173, 4353-4362.
Ye, F., Laigret, F. and Bové, J.M. (1994) A physical and genomic map of the prokaryote Spiroplasma melliferum and its comparison with the Spiroplasma citri map. C. R. Acad. Sci. 317, 392-398.
Ye, F., Laigret, F., Carle, P. and Bové, J.M. (1995) Chromosomal heterogeneity among various strains of Spiroplasma citri. Int. J. Syst. Bacteriol. 45, 729-734.
Ye, F., Melcher, U., Rascoe, J.E. and Fletcher, J. (1996) Extensive chromosome aberrations in Spiroplasma citri strain BR3. Biochem. Genet. 34, 269-286.

Identity Spiroplasma virus SpV1 ORFs ORF1, capsid protein

Sequence tag ID

Acc. No. of best hit

E-value

ORFa

MSAC_D10.y MSAD_C09.y MSAD_C12.x MSAD_B02.x MEAA_B07.y MHAA_H01.y MSAC_A12.y MSAD_C03.y MSAC_C02.y MSAC_D02.y MSAD_H02.y MSAD_H02.x MSAD_A08.x MHAA_C12.y MHAA_G02.y MSAD_E11.x MHAA_H07.x

9626113 1143020 9626113 1143020 1143021 9626114 1143018 1143013 9626110 9626110 P15893 U28974 9626111 P15898 1143012 U28972 1143008

2e-19 4e-94 3e-62 e-117 2e-39 3e-14 1e-14 1e-46 2e-26 2e-26 2e-42 3e-21 3e-12 3e-10 1e-09 8e-08 3e-29

S. citri putative virulence proteins P123 P58 P54 P18

MSAD_A01.x MHAA_A07.x MHAA_A07.y MHAA_F03.y

T28663 7482012 7482011 7482010

2e-61 4e-22 1e-64 2e-24

ORF3, transposase gene

ORF2 ORF4 ORF5 ORF7 ORF14

Table 2.1 Sequence tags with significant similarity (E-value ≤ 10⁻⁵) to spiroplasma virus SpV1 and S. citri putative virulence proteins

Deduced protein sequences were searched against the non-redundant database at NCBI. Identity and sequence tag identity (ID) are indicated, and for each sequence tag the accession number (Acc. No.) and E-value of the entry with the highest similarity are listed.
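The significance cutoff used for Table 2.1 can be illustrated with a minimal Python sketch. It assumes that E-values are stored as text in the shorthand used in the tables (for example "e-117" for 1e-117). The four S. citri rows are taken from Table 2.1, paired in the order in which tags, accession numbers and E-values are listed; the last row and the helper names (`Hit`, `parse_e_value`, `significant`) are hypothetical and serve only to show a hit that fails the cutoff.

```python
from typing import List, NamedTuple

E_VALUE_CUTOFF = 1e-5  # threshold stated in the legend of Table 2.1


class Hit(NamedTuple):
    tag: str        # sequence tag ID
    accession: str  # accession number of the best database entry
    e_value: float  # E-value of that entry


def parse_e_value(text: str) -> float:
    """Convert table shorthand such as 'e-117' or '2e-19' to a float."""
    text = text.strip().lower()
    if text.startswith("e"):
        # 'e-117' is shorthand for '1e-117'; float() needs the mantissa.
        text = "1" + text
    return float(text)


def significant(hits: List[Hit], cutoff: float = E_VALUE_CUTOFF) -> List[Hit]:
    """Keep only the hits whose E-value is at or below the cutoff."""
    return [h for h in hits if h.e_value <= cutoff]


rows = [
    ("MSAD_A01.x", "T28663", "2e-61"),
    ("MHAA_A07.x", "7482012", "4e-22"),
    ("MHAA_A07.y", "7482011", "1e-64"),
    ("MHAA_F03.y", "7482010", "2e-24"),
    ("EXAMPLE_01.x", "0000000", "0.008"),  # hypothetical, fails the cutoff
]
hits = [Hit(tag, acc, parse_e_value(e)) for tag, acc, e in rows]
kept = significant(hits)
```

Here `kept` retains the four S. citri rows and drops the hypothetical one, because 0.008 is larger than the cutoff.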

Sequence tag ID

Identity

Amino acid biosynthesis MEAA_A03.y Thymidylate kinase MSAC_C05.y Folylpolyglutamate synthase/dihydrofolate synthetase (folC) MSAC_B10.x Methionine aminopeptidase (MAP) (peptidase M) PE_10.x Serine hydroxymethyl transferase (glyA) Cell envelope MHAA_E07.x PH_05.x

Cell shape determining protein (MreB-like protein) GcpE protein

Fatty acid and phospholipid metabolism MEAA_D03.x Probable N-acetylglucosamine-6-phosphate deacetylase MHAA_C12.x 1-Acyl-sn-glycerol-3-phosphate acyltransferase MHAA_H07.x orfa HSAD_G09.y 1-Acyl-sn-glycerol-3-phosphate acyltransferase

Sequence length (aa)

Best entry Mycoplasmataceae, accession No. (E-value)

Best entry Bacillus/Clostridium, accession No. (E-value)

Best entry GenBank organism, accession No. (E-value)

96 127

14089465 (5e-08) -

2632295 (3e-12) 4930039 (3e-09)

B. subtilis, 16077096 (2e-10) Str. pneumoniae, 15900133 (4e-09)

179 239

14089981 (3e-37) 1673936 (2e-78)

11131429 (3e-41) 12723496 (5e-84)

B. halodurans, 15612719 (1e-35) Str. pneumoniae, 15902972 (2e-64)

172 102

-

10176363 (2e-33) 1730252 (3e-26)

B. halodurans, 15616301 (4e-07) B. subtilis, 16079569 (1e-27)

73

14089782 (1e-06)

A69664 (4e-08)

St. aureus, 15893481 (9e-06)

112 65 117

13508038 (2e-10) 14089575 (1e-09) 13508038 (5e-22)

10174252 (0.008) 2633961 (8e-12) 2633289 (3e-05)

M. pulmonis, 15828585 (2e-06) S. citri, 1143008 (3e-29) M. pneumoniae, 13508038 (6e-20)

Cellular processes MEAA_C12.y MHAA_B03.x

DnaK protein (hsp 70) GTP-binding membrane protein (LepA)

189 80

8920287 (2e-51) 6899301 (2e-17)

P45554 (5e-58) 12724067 (1e-18)

E. rhusiopathiae, 1169374 (4e-58) La. lactis, 15673090 (2e-05)

Energy metabolism MEAA_B07.x MEAA_C10.x MHAA_E06.y MHAA_A06.y MHAA_D08.y

Dihydrolipoamide dehydrogenase Phosphomannomutase (PMM) Glycerol-3-phosphate dehydrogenase ATP synthase β chain Fructose-bisphosphate aldolase

43 133 167 213 107

1674136 (1e-10) 1352196 (3e-19) 14089537 (2e-15) 14089679 (1e-94) 12044873 (2e-23)

12722900 (3e-07) C69835 (8e-29) 1146220 (8e-27) 10176378 (1e-86) 10944298 (1e-21)

P. putida, 1706442 (7e-16) B. halodurans, 15613669 (8e-19) St. aureus, 15924464 (2e-24) M. pulmonis, 15828737 (1e-76) Cl. acetobutylicum, 15894114 (2e-25)

(Continued)

Table 2.2 Sequence tags with significant similarities (E-value ≤ 10⁻⁵) to NCBI nr protein sequences

Deduced protein sequences were searched with blastp against the non-redundant (nr) database and the Mycoplasmataceae and Bacillus/Clostridium protein databases at NCBI. Sequence tag identity (ID) and deduced amino acid (aa) length are indicated, and for each sequence tag the accession numbers and E-values of the entries with the highest similarities are listed. The organism of the entry with the highest similarity is listed for the nr database search results. A., Aquifex; B., Bacillus; C., Chlamydia; Ca., Campylobacter; Chl., Chlorobium; Cl., Clostridium; E., Erysipelothrix; En., Enterococcus; G., Geobacillus; L., Listeria; La., Lactococcus; Lac., Lactobacillus; M., Mycoplasma; My., Mycobacterium; P., Pseudomonas; S., Spiroplasma; St., Staphylococcus; Str., Streptococcus; T., Thermotoga; U., Ureaplasma; V., Vibrio; X., Xylella; Y., Yersinia; -, no significant hit and sequence is absent; ns, E-value > 10⁻⁵ but sequence is present in genomes of one or more members of the Mycoplasmataceae or Bacillus/Clostridium group.

Table 2.2 (continued) Sequence tag ID

Identity

Sequence length (aa) 63 82 88 55

Best entry Mycoplasmataceae, accession No. (E-value) 14089925 (6e-08) 14089653 (1e-07) 2146068 (1e-11) 14089932 (2e-17)

Best entry Bacillus/Clostridium, accession No. (E-value) 7328298 (6e-06) ns 12061042 (0.001) 8670811 (8e-16)

Best entry GenBank organism, accession No. (E-value) M. pulmonis, 14089925 (7e-08) S. citri, 2384686 (3e-17) M. pneumoniae, 13508341 (1e-04) A. aeolicus, 6015091 (1e-14)

MHAA_F07.y MHAA_G06.x MSAC_G02.y MSAD_B06.x

Transketolase Pyruvate kinase ATP synthase β chain precursor Phosphopyruvate hydratase/enolase

188 123 131 106 113 144 149 89

14089558 (4e-16) D53312 (2e-24) 14089736 (2e-29) -

WZBSDS (2e-37) 10176653 (1e-30) 586859 (1e-08) 2636243 (4e-26) 10173982 (3e-19) 4033719 (3e-11) 12723536 (9e-33) 3483135 (2e-50)

Lac. sakei, 15727116 (1e-32) S. citri, 1709937 (1e-49) M. mycoides, 16040925 (7e-36) B. subtilis, 16080759 (9e-21) M. pirum, 1345713 (3e-22) M. mycoides, 16040925 (2e-32) T. maritima, 15644136 (2e-33) X. fastidiosa, 15839020 (6e-56)

Regulatory functions MHAA_A11.y RNA polymerase σ factor (RpoD)

114

12045103 (2e-06)

O66381 (1e-07)

MSAC_A08.x

121

-

10172709 (1e-29)

MSAD_H07.y

Transcriptional regulator involved in nitrogen regulation (NifR3 family) Predicted transcription regulator SinR

Cl. acetobutylicum, 15894582 (3e-06) B. halodurans, 15612660 (3e-28)

80

-

10174744 (3e-06)

Cl. acetobutylicum, 15894128 (3e-19)

Replication MEAA_B05.y MEAA_F12.x MHAA_G05.y

DNA-directed DNA polymerase I DNA gyrase subunit B Chain A, helicase product complex

193 104 171

14089786 (4e-13) 14090113 (7e-11)

A32949 (3e-35) 2558946 (4e-16) 2781090 (4e-17)

MHAA_H04.x MSAC_A09.x MSAC_B02.y MSAC_C06.y MSAD_B12.y MSAD_G04.y

DNA-directed RNA polymerase β subunit ParA family protein Glucose-inhibited division protein A Cell division protein FtsH DNA primase ATP-dependent helicase PcrA

96 141 78 174 96 112

600226 (5e-21) 12045330 (2e-06) 14089666 (2e-22) 14090194 (2e-43) 13508092 (1e-15) 14090183 (4e-15)

12724825 (3e-19) 9968459 (4e-09) P25812 (7e-18) S66099 (4e-42) 664755 (6e-25) P56255 (6e-16)

Str. pyogenes, 15674390 (2e-32) M. capricolum, 17008093 (2e-19) G. stearothermophilus, 9257172 (1e-18) S. citri, 1350848 (3e-46) S. citri, 10432498 (7e-16) M. pulmonis, 14089666 (3e-20) M. pulmonis, 15829250 (9e-44) L. innocua, 16800560 (4e-24) My. tuberculosis, 15840373 (2e-12)

Transcription antitermination factor (NusG) Polynucleotide phosphorylase (PNPase)

112

14089595 (1e-06)

O08386 (2e-19)

206

-

1184680 (3e-82)

Purines, pyrimidines, nucleosides, and nucleotides MEAA_B12.x Adenylosuccinate lyase MEAA_C08.x Adenylosuccinate synthetase MEAA_D08.y Deoxyguanosine kinase MHAA_B06.x Thymidine kinase MHAA_C05.y Cytidine deaminase MHAA_H10.y Deoxyguanosine kinase MSAC_D04.y Adenine phosphoribosyltransferase MSAC_F08.x GMP synthetase (glutamine amidotransferase)

Transcription MHAA_E11.x MHAA_H05.y

L. monocytogenes, 16802292 (2e-07) B. subtilis, 16078732 (4e-71)

(Continued)

Table 2.2 (continued) Sequence tag ID

Identity

Sequence length (aa)

MSAC_H02.y MSAD_B09.x

DNA-directed RNA polymerase α chain Transcription antitermination protein NusG ATP-dependent protease (lon-protease) 50S ribosomal protein L21 Valine-tRNA ligase Cysteinyl tRNA synthetase 50S ribosomal protein L3 50S ribosomal protein L2 Translation initiation factor 2 (InfB) Prolyl-tRNA synthetase Translation elongation factor G (EF-G) Hypothetical proteins similar to O-sialoglycoprotein endopeptidase 50S ribosomal protein L4 Asparaginyl-tRNA synthetase Seryl-tRNA synthetase 50S ribosomal protein L5 30S ribosomal protein S3 30S ribosomal protein S8 50S ribosomal protein L17 (fragment) 50S ribosomal protein L19 Isoleucyl-tRNA synthetase Threonyl-tRNA synthetase Glutamyl-tRNA synthetase Phenylalanyl-tRNA synthetase β chain Isoleucyl-tRNA synthetase DNA-directed DNA polymerase (α chain) Tryptophanyl-tRNA synthetase Glycyl-tRNA synthetase Heat shock protein GroEL Ribosomal large subunit pseudouridine synthase B Histidyl-tRNA synthetase Peptide chain release factor 1 (RF-1) 50S ribosomal protein L2 Glycyl-tRNA synthetase 30S ribosomal protein S17

Translation MEAA_B02.x MEAA_B09.x MEAA_B09.y MEAA_C04.y MEAA_C06.x MEAA_D03.y MEAA_D09.x MEAA_G12.y MHAA_A08.y MHAA_A11.x

MHAA_B08.y MHAA_C07.x MHAA_D07.y MHAA_C09.x MHAA_C09.y MHAA_C11.x MHAA_C11.y MHAA_D12.y MHAA_E03.y MHAA_E05.y MSAC_A11.x MSAC_B10.x MSAC_C04.y MSAC_H02.y MSAD_B06.y MSAD_B12.x MSAD_E10.x MSAD_H08.x PE_05.x PE_14.y PE_21.y PH_04.x PS_02.y

114 68

Best entry Mycoplasmataceae, accession No. (E-value) 6601578 (5e-22) ns

Best entry Bacillus/Clostridium, accession No. (E-value) 12725120 (1e-18) 12725158 (5e-09)

Best entry GenBank organism, accession No. (E-value) M. capricolum, 629301 (1e-28) Str. coelicolor, 1709420 (2e-06)

122 38 121 76 40 109 108 217 180 68

1674198 (8e-14) 14089744 (8e-05) 1351181 (3e-20) 1351147 (5e-12) 14090004 (4e-08) 14090000 (7e-36) 2497279 (5e-40) 14089596 (7e-63) 14089842 (5e-78) 14089531 (2e-18)

B42375 (1e-15) 12724034 (3e-08) 10175660 (9e-28) 12724882 (6e-12) P42920 (2e-05) P04257 (3e-36) 10175033 (3e-35) 13633967 (5e-64) 10172743 (3e-92) 1945110 (1e-19)

V. cholerae, 15641922 (1e-13) Str. pyogenes, 15674860 (4e-08) St. aureus, 15927242 (2e-21) Cl. stricklandii, 6899996 (4e-10) M. capricolum, 132957 (2e-06) M. capricolum, 71083 (4e-20) M. genitalium, 12044994 (7e-49) B. burgdorferi, 15594747 (1e-50) B. halodurans, 15612694 (2e-80) St. aureus, 15927624 (8e-15)

38 119 90 88 59 85 119 66 101 132 131 175 180 107 145 201 83 95

2766504 (9e-36) 14090186 (1e-28) 1361847 (8e-26) 3844757 (2e-29) 3914904 (1e-04) 14089988 (4e-17) 14089975 (8e-27) 14089881 (1e-14) 14090082 (3e-12) 13508292 (1e-38) 13508417 (4e-16) ns 14090082 (2e-32) 6601579 (2e-25) 14090160 (7e-26) 6899491 (1e-63) 12045254 (4e-23) 14089751 (7e-04)

S24364 (3e-52) 12724857 (4e-14) 12724729 (9e-28) 4512416 (3e-33) ns P56209 (7e-17) P07843 (3e-31) 10175098 (6e-20) 437916 (2e-23) 143766 (8e-42) 289282 (4e-20) 40054 (3e-06) 10175165 (3e-41) 10172773 (2e-30) 10175491 (2e-34) 4584090 (1e-30) 12723267 (5e-35) 410137 (1e-15)

M. capricolum, 132981 (2e-45) Cl. acetobutylicum, 15896505 (1e-32) A. aeolicus, 15605830 (7e-24) B. halodurans, 15612709 (5e-27) S. citri, O31161 (7e-26) M. capricolum, 134021 (2e-23) M. capricolum, 7674204 (1e-38) B. halodurans, 15615041 (1e-18) St. aureus, 1174521 (1e-24) U. urealyticum, 13358098 (5e-45) B. subtilis, 16077160 (2e-15) C. pneumoniae, BAA98801.1 (4e-10) La. sakei, 15487790 (5e-27) M. capricolum, 629301 (1e-28) B. halodurans, 15615433 (1e-28) St. aureus, 15924555 (1e-55) En. faecalis, 15625350 (1e-33) B. subtilis, 466190 (1e-31)

220 144 195 68 85

12044885 (7e-29) 1350577 (5e-45) 14090000 (5e-66) 14089865 (1e-17) 14089993 (4e-22)

3915057 (4e-45) S55437 (2e-44) P04257 (5e-72) 4584090 (2e-24) P23828 (7e-31)

L. innocua, 16800623 (2e-07) M. capricolum, 2500137 (6e-55) M. capricolum, 71083 (7e-77) B. cereus, 4584090 (2e-17) S. citri, 3122807 (3e-39)

(Continued)

Table 2.2 (continued) Sequence tag ID

Sequence length (aa)

Best entry Mycoplasmataceae, accession No. (E-value)

Best entry Bacillus/Clostridium, accession No. (E-value)

Best entry GenBank organism, accession No. (E-value)

Transport and binding proteins MEAA_F10.x Phosphotransferase EII (PTS system) MEAA_G05.y Phosphate ABC transporter, permease protein MEAA_H04.x ABC transporter MSAC_A07.y Methylgalactoside permease ATP-binding protein MSAC_A08.y ABC transporter, ATP-binding protein

138 177 168 229 70

14089430 (8e-14) 1361743 (4e-25) 2146659 (1e-51) 4914644 (3e-47) 12044917 (3e-15)

2633144 (6e-15) 4530449 (1e-30) 12723139 (8e-46) 12724309 (6e-47) 12724060 (1e-12)

MSAC_C02.x

Highly similar to phosphotransferase system (PTS) fructose-specific enzyme IIABC component ABC transporter, ATP-binding protein Highly similar to Mg(2+) transport ATPase

87

1045736 (1e-05)

2633811 (4e-09)

M. capricolum, 530422 (9e-15) V. cholerae, 15600843 (1e-23) U. urealyticum, 13358103 (2e-47) U. urealyticum, 13357571 (2e-36) Cl. acetobutylicum, 15894109 (2e-13) L. innocua, 16801491 (5e-07)

115 140

14089609 (2e-11) 14089568 (2e-15)

10173618 (3e-21) 12714231 (8e-31)

131 108

14090034 (7e-36) 14089827 (9e-17)

D70009 (5e-39) S11153 (5e-31)

111

14089430 (4e-32)

66867 (2e-21)

MSAD_D03.x MSAD_D12.y MSAD_E06.y MSAD_F05.x

Similar to ABC transporter (ATP-binding protein) Similar to ABC transporter ATP-binding protein – oligopeptide transport Phosphotransferase system, glucose-specific IIABC component Oligopeptide permease (ATP-binding protein) Transfer complex protein TrsK protein (TraK) Cation-transporting P-ATPase Phosphate ABC transporter, permease protein

50 75 125 96

13507956 (8e-07) 14089568 (3e-05) 13508349 (7e-14)

1420862 (2e-10) 6470167 (1e-09) 12724231 (4e-21) 4530449 (2e-20)

Str. pyogenes, 15674468 (6e-09) B. anthracis, 6470167 (1e-07) La. lactis, 15673239 (3e-16) Str. pneumoniae, 15901902 (8e-19)

Other categories MEAA_A02.x MEAA_A03.x MEAA_A06.y

Amidase Conserved hypothetical protein SpoE family protein/cell division protein

62 165 141

2146059 (1e-06) 14090195 (1e-08) -

ns 467456 (6e-04) S09411 (2e-31)

MEAA_B03.y MEAA_D11.x MEAA_D12.y

Probable GTP-binding protein Nitrogen fixation protein NifU Predicted SAM-dependent methyltransferase

56 69 145

13508214 (5e-12) 1045939 (7e-10)

1146219 (9e-17) 10176042 (6e-13) 12724027 (1e-16)

MEAA_E07.x MEAA_E08.y

RNA-binding Sun protein Conserved hypothetical protein

100 98

-

2633846 (4e-08) 10173873 (1e-21)

M. capricolum, 530426 (6e-10) U. urealyticum, 13357633 (2e-10) Str. pneumoniae, 15900761 (2e-33) B. subtilis, 1730915 (6e-15) B. halodurans, 15615981 (7e-11) Cl. acetobutylicum, 12724027 (1e-16) B. subtilis, 16078637 (4e-04) St. aureus, 10173873 (1e-21)

MSAC_D03.y MSAC_D06.x MSAC_F10.y MSAD_A04.y

MSAD_C02.y

Identity

T. maritima, 15643786 (5e-19) L. monocytogenes, 16804726 (5e-20) B. subtilis, 16080207 (2e-38) Str. pneumoniae, 15903745 (3e-30) M. pulmonis, 14089430 (4e-32)

(Continued)

Table 2.2 (continued) Sequence tag ID

Identity

Sequence length (aa) 112 125 56

Best entry Mycoplasmataceae, accession No. (E-value) S73881 (3e-05) 7109691 (5e-06)

Best entry Bacillus/Clostridium, accession No. (E-value) P54501 (1e-08) 10175125 (1e-07)

MEAA_E09.x MEAA_E12.x MEAA_F10.y

Hypothetical protein 199 aa conserved hypothetical protein Similar to putative phosphoprotein phosphatase

MHAA_A09.x

Probable thiol peroxidase

77

14090123 (6e-09)

P72500 (9e-12)

MHAA_A10.y MHAA_B12.x

Hypothetical protein Conserved GTP-binding protein

152 124

1674179 (7e-10) 14089767 (2e-09)

2634923 (7e-13) 12724592 (2e-08)

MHAA_B12.y

tRNA δ (2) isopentenylpyrophosphate transferase P115-like (Mycoplasma hyorhinis) ABC transporter ATP-binding protein Probable thiol peroxidase

75

-

13701103 (3e-11)

Chl. tepidum, 10039641 (2e-24) B. subtilis, P54501 (1e-08) L. monocytogenes, 10175125 (1e-06) Str. pneumoniae, 15901486 (1e-09) A. aeolicus, 7451802 (2e-13) L. monocytogenes, 14089767 (2e-09) St. aureus, 15924294 (2e-11)

168

14090129 (8e-52)

10175107 (2e-46)

M. pulmonis, 14090129 (8e-52)

56

-

P31307 (5e-07)

118

14089726 (2e-39)

2619052 (6e-09)

109

ns

9968459 (3e-12)

87 134

14089574 (3e-09) 14090092 (7e-06)

12723043 (2e-10) 13700111 (2e-24)

MHAA_G08.y MHAA_H02.x MSAC_A06.x MSAC_B11.y MSAC_C09.y MSAC_E03.x

Acyl carrier protein phosphodiesterase (ACP phosphodiesterase) Partitioning or sporulation protein (ParA) (soj protein) Conserved hypothetical protein Probable type I restriction enzyme restriction chain Exodeoxyribonuclease V (α subunit) Conserved hypothetical protein Conserved hypothetical protein Hypothetical protein Conserved hypothetical protein Conserved hypothetical protein

Cl. acetobutylicum, 15896549 (5e-07) M. pulmonis, 14089726 (2e-39)

84 99 195 140 174 43

14090197 (2e-07) 3845056 (6e-16) 13508006 (5e-11) -

2635193 (4e-12) 2635763 (6e-29) 12724713 (3e-34) 12724031 (2e-19) 13027335 (5e-21) 7429432 (3e-05)

MSAC_H01.x

Conserved hypothetical protein

128

150165 (5e-09)

10175107 (3e-10)

MSAD_E03.x MSAD_F02.y

Nitroreductase Hypothetical protein

151 133

P75273 (2e-06)

7432647 (2e-05) 5420109 (3e-12)

MSAD_F01.x

Hypothetical 35.3 kDa protein, SLR1819

91

-

P37497 (1e-05)

MSAD_G02.x PH_01.y PH_05.y

Conserved hypothetical protein BH2145 – unknown conserved protein Hypothetical protein in fibril gene 3’ region

99 82 315

14090099 (3e-17) 12045292 (6e-14)

7328260 (8e-15) 10175035 (4e-19) -

MHAA_C06.y

MHAA_D09.x MHAA_E04.x MHAA_F06.y MHAA_F12.x MHAA_G03.y

Best entry GenBank organism, accession No. (E-value)

L. monocytogenes, 9968459 (3e-12) Str. pyogenes, 12723043 (2e-10) St. aureus, 15923185 (8e-27) C. pneumoniae, 15835659 (9e-13) St. aureus, 15923836 (2e-26) Y. pestis, 16121243 (2e-29) La. lactis, 12724031 (2e-19) St. aureus, 13027335 (5e-21) Synechocystis sp. PCC 680, 7444728 (4e-06) Str. pneumoniae, 10175107 (3e-10) Ca. jejuni, 15792391 (2e-08) Str. thermophilus, 5420109 (3e-12) Synechocystis sp. PCC 6803, P73709 (2e-07) M. pulmonis, 7328260 (8e-15) B. halodurans, 10175035 (4e-19) S. citri, P27712 (e-101)
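The cell conventions described in the legend of Table 2.2 can be sketched in Python as follows. This is only an illustration, assuming that each cell of the Mycoplasmataceae and Bacillus/Clostridium columns is available as a string of the form "accession (E-value)", "-" or "ns". The function name `parse_cell` and the status labels are hypothetical; the example strings are copied from the table.

```python
import re
from typing import Optional, Tuple

# "accession (E-value)", e.g. "14089465 (5e-08)", "P27712 (e-101)", "10174252 (0.008)"
CELL = re.compile(r"^(?P<acc>\S+)\s+\((?P<e>[0-9.]*e-?\d+|[0-9.]+)\)$")


def parse_cell(cell: str) -> Tuple[str, Optional[str], Optional[float]]:
    """Return (status, accession, E-value) for one table cell.

    status is 'absent' for '-', 'not_significant' for 'ns' (sequence present
    but E-value above the cutoff), and 'hit' for a reported best entry.
    """
    cell = cell.strip()
    if cell == "-":
        return ("absent", None, None)
    if cell == "ns":
        return ("not_significant", None, None)
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognized table cell: {cell!r}")
    e_text = match.group("e")
    if e_text.startswith("e"):
        e_text = "1" + e_text  # 'e-101' is shorthand for '1e-101'
    return ("hit", match.group("acc"), float(e_text))


examples = ["14089465 (5e-08)", "-", "ns", "P27712 (e-101)", "10174252 (0.008)"]
parsed = [parse_cell(c) for c in examples]
```

Note that a parsed "hit" is simply the entry reported in the table; a value such as 0.008 is above the 10⁻⁵ cutoff, so any downstream comparison would still need to apply the threshold explicitly.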

Sequence tag ID   Identity                                     Accession No., organism             E-value
MEAA_E09.x        16S rDNA                                     46914, S. citri                     0.0
                                                               175961, S. poulsonii                0.0
                                                               175965, S. citri                    0.0
                                                               175964, S. apis                     0.0
                                                               175967, S. mirum                    0.0
                                                               175969, S. monobiae                 0.0
                                                               175962, S. taiwanense               e-180
                                                               175970, S. diabroticae              e-179
                                                               175963, S. gladiatoris              e-171
                                                               175473, Entomoplasma melaleucae     e-166
MHAA_F02.y        16S rDNA, 16S/23S spacer region, 23S rDNA    46914, S. citri                     0.0
                                                               4456860, Spiroplasma sp.            e-151
                                                               2707198, S. citri                   e-125
                                                               5821442, M. putrefaciens            2e-19

Table 2.3 S. kunkelii sequence tags with similarity to rRNA genes

S. kunkelii sequence tags were searched against the full GenBank nr nucleotide database with the blastn algorithm. MEAA_E09.x and MHAA_F02.y were the only sequence tags with similarity to rRNA genes. Identities, accession numbers and organisms, and E-values are listed for the first 10 entries for MEAA_E09.x and the first four entries for MHAA_F02.y.

CHAPTER 3

Complete genome sequence of aster yellows witches' broom (AY-WB) phytoplasma and comparison with onion yellows (OY) phytoplasma

Xiaodong Bai(1), Jianhua Zhang(1), Kiryl Tsuckerman(3), Dimitry Schevchenko(3), Eugene Goltsman(3), Adam Ewing(4), Sally A. Miller(2), Theresa Walunas(3), John Campbell(3), and Saskia A. Hogenhout(1)

(1) Department of Entomology and (2) Department of Plant Pathology, The Ohio State University, Ohio Agricultural Research and Development Center (OARDC), Wooster, OH 44691
(3) Integrated Genomics, Inc., Chicago, IL 60612
(4) Department of Biology, Hiram College, Hiram, OH 44234

3.1 Abstract

We determined the complete genome sequence of aster yellows witches' broom (AY-WB) phytoplasma, an intracellular bacterial plant pathogen transmitted by insects. The AY-WB phytoplasma genome consists of a single 706,569-bp circular chromosome and 4 plasmids with sizes of 3,972 bp, 4,009 bp, 5,104 bp, and 4,316 bp. The circular chromosome contains 673 predicted coding sequences (CDs), one set of rRNA genes and 31 tRNA genes, and the plasmids contain 24 CDs. Among the total of 697 CDs, functions can be assigned to 352 CDs (51%), and 232 CDs are unique to AY-WB phytoplasma. The AY-WB phytoplasma chromosome is 154,062 bp smaller than the OY phytoplasma chromosome and contains 81 fewer CDs. However, the metabolic genes in the two genomes are mostly similar. The large numbers of paralogs largely account for the difference in genome size between AY-WB and OY phytoplasmas. The AY-WB phytoplasma genome contains 15 ATP-binding cassette (ABC) transporters for nutrient uptake, and the components required for bacterial type II protein translocation. The glucose phosphotransferase gene in the glycolysis pathway was not identified in the AY-WB phytoplasma genome, suggesting that the phytoplasma might not be able to utilize glucose as a carbon and energy source. However, phytoplasmas may use malate as an alternative carbon and energy source. AY-WB phytoplasma has a limited biosynthesis and energy production capacity, which is consistent with the parasitic lifestyle of phytoplasmas. Multiple copies of transposase and paralogous genes, and truncated versions of these sequences, were identified in the AY-WB genome, suggesting that the genome is prone to recombination. Furthermore, the AY-WB phytoplasma genome contains an incomplete type I restriction and modification

system, making it accessible to the import and integration of foreign genetic material. A total of 50 CDs encoding soluble secreted proteins were identified; these proteins might interact with host cell components and hence are potential virulence factors. The AY-WB phytoplasma genome also encodes the pore-forming toxin hemolysin, which is possibly involved in phytoplasma invasion of host cells. The information revealed by the genome sequence is essential for the investigation of the biology, physiology and pathogenicity of AY-WB phytoplasma, and is useful for understanding why phytoplasmas are not cultivable in cell-free media.

3.2 Introduction

Phytoplasmas are wall-less prokaryotes of the Class Mollicutes, a group of bacteria characterized by small genomes with low GC content and the absence of a cell wall (Razin et al., 1998). Phytoplasmas are insect-transmitted plant pathogens that invade and replicate in both insect and plant cells. Phytoplasmas were derived from Gram-positive ancestors by reductive evolution (Woese, 1987; Weisburg et al., 1989; Oshima et al., 2004). In contrast to spiroplasmas, the other group of mollicutes that contains insect-transmitted plant pathogens, phytoplasmas appear to have undergone more extensive genome reduction and are likely evolutionarily late organisms (Bai et al., 2004b). Because phytoplasmas cannot be cultured and few genetic tools are available, little is known about their biology, physiology and pathogenicity. Members of the Class Mollicutes have been the subjects of genome sequencing efforts for years because of their small genomes and their clinical and economic importance.

So far, the complete genomes of 11 mollicutes have been reported, including Mycoplasma genitalium (NC_000908, Fraser et al., 1995), M. pneumoniae (NC_000912, Himmelreich et al., 1996), Ureaplasma urealyticum (NC_002162, Glass et al., 2000), M. pulmonis (NC_002771, Chambaud et al., 2001), M. penetrans (NC_004432, Sasaki et al., 2002), M. gallisepticum (NC_004829, Papazisi et al., 2003), OY phytoplasma (NC_005303, Oshima et al., 2004), M. mycoides subsp. mycoides (NC_005364, Westberg et al., 2004), M. mobile (NC_006908, Jaffe et al., 2004), Mesoplasma florum (NC_006055) and M. hyopneumoniae (NC_006360, Minion et al., 2004). The genome sequence data led to the identification of a minimal gene set for a free-living cell (Fraser et al., 1995; Mushegian and Koonin, 1996) and to other advances in mollicute research. In particular, the release of the first complete phytoplasma genome sequence (Oshima et al., 2004) greatly advanced phytoplasma research. Aster yellows witches' broom (AY-WB) phytoplasma is a strain of aster yellows phytoplasma, the largest group of phytoplasmas, which was recently assigned the tentative species name Candidatus Phytoplasma asteris (Lee et al., 2004). AY-WB phytoplasma was first described in Ohio, where it causes severe damage to lettuce and China aster (Zhang et al., 2004). Here, we report the complete genome sequence of AY-WB phytoplasma and a comparative analysis of this genome with that of the closely related OY phytoplasma. The analysis of the AY-WB phytoplasma genome sequence data led to a better understanding of the biology and pathogenicity mechanisms of phytoplasmas, and of the reasons why phytoplasmas are not cultivable. The annotation and analysis of the AY-WB phytoplasma genome provide direction for other phytoplasma genome sequencing projects

that are currently underway, such as the maize bushy stunt phytoplasma (MBSP) genome-sequencing project.

3.3 Materials and Methods

3.3.1 Phytoplasma strain

The aster yellows phytoplasma (Candidatus Phytoplasma asteris) strain aster yellows witches' broom (AY-WB) was isolated from diseased lettuce plants (Lactuca sativa) in Ohio (Zhang et al., 2004). The AY-WB strain was maintained by serial transmission to China aster (Callistephus chinensis) plants using aster leafhoppers (Macrosteles quadrilineatus L.) in a greenhouse and growth chambers.

3.3.2 DNA manipulation

Batches of phloem sap containing AY-WB phytoplasma were collected from diseased lettuce plants and mixed with STE buffer (0.1 M NaCl, 10 mM Tris-HCl, 1 mM EDTA, pH 8.0) at a ratio of 1:3 (sap/buffer). The mixture was kept on ice, divided into 1.5 ml Eppendorf tubes, and centrifuged at 9,000 rpm for 15 min at 4 °C. The pellet was resuspended in low-melting agarose (Promega, Madison, WI) and subjected to pulsed field gel electrophoresis (PFGE) as described before (Zhang et al., 2004). Phytoplasma genomic DNA was eluted from the gel blocks with an Elutrap (Schleicher & Schuell, Keene, NH) following the manufacturer's instructions. The genomic DNA was ethanol precipitated following a standard procedure (Sambrook et al., 1989) and

resuspended in deionized distilled water. The concentration of the purified DNA was measured using a PicoGreen kit (Molecular Probes, Eugene, OR).

3.3.3 DNA sequencing and assembly

The shotgun library was constructed at Integrated Genomics, Inc. (IG) (2201 W Campbell Park, Chicago, IL 60612) using 5 µg of AY-WB phytoplasma genomic DNA isolated from pulsed field gels (Zhang et al., 2004). The DNA was sheared into fragments averaging 2 kb and cloned into the pGEM-3Z vector for transformation into Escherichia coli strain DH5α. The genomic DNA library was sequenced using MegaBACE 1000 (Amersham Biosciences, Sweden) and ABI3700 (Applied Biosystems, Foster City, CA) sequencers. The library was sequenced to saturation at 7-fold coverage of the AY-WB phytoplasma genome. Primary assembly of the individual sequences was performed using the Phred/Cross_match/Phrap package (originally developed at Washington University), which automatically basecalls the trace data, screens out vector, removes unreliable data, and assembles individual reads into contigs. Additional tools developed in-house helped in the validation of the assembly. Manual editing of read and contig sequences, as well as manipulation of the layout (i.e. tearing and joining of contigs, relocating reads, etc.), was then performed with the Consed editing software. Gaps in the assembly were closed by primer walking on gap-spanning clones. Areas of poor consensus quality and low coverage were also improved using this method. Sequencing oligos and templates were picked automatically using the Autofinish

software. We conducted four rounds of primer walking; in each round, custom oligos were designed to further extend the sequences in these regions.
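As a rough check on the 7-fold coverage quoted above, the Lander-Waterman model relates mean coverage to the number of reads and to the expected fraction of the genome left unsequenced. The sketch below uses the chromosome size and coverage reported in the text; the mean usable read length of 500 bp is an assumption chosen only for illustration, since the text does not report it, and the function names are hypothetical.

```python
import math

def reads_for_coverage(genome_bp, coverage, read_len_bp):
    """Number of reads needed for a given mean coverage (c = N * L / G)."""
    return math.ceil(coverage * genome_bp / read_len_bp)

def expected_uncovered_fraction(coverage):
    """Lander-Waterman estimate of the probability that a base is hit by no read."""
    return math.exp(-coverage)

GENOME_BP = 706_569   # AY-WB chromosome size reported in the text
COVERAGE = 7          # fold coverage reported in the text
READ_LEN_BP = 500     # ASSUMPTION: illustrative mean read length, not from the text

n_reads = reads_for_coverage(GENOME_BP, COVERAGE, READ_LEN_BP)
gap_fraction = expected_uncovered_fraction(COVERAGE)
uncovered_bp = gap_fraction * GENOME_BP

print(f"reads needed: {n_reads}")
print(f"expected uncovered bases: {uncovered_bp:.0f}")
```

Under these idealized assumptions (uniformly random reads, no cloning bias), several hundred bases would still be expected to remain uncovered by chance, which is in line with the need for gap closure by primer walking.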

3.3.4 Identification of CDs, annotation and analysis

The sequence data of AY-WB phytoplasma were uploaded into the IG database and software suite, ERGO, for sequence annotation. CRITICA (Badger and Olsen, 1999) and IG-proprietary tools were used for CD identification. The predicted CDs were annotated by sequence similarity searches using the BLAST algorithm (Altschul et al., 1997) against the non-redundant (nr) database at the server maintained by the National Center for Biotechnology Information (NCBI). Protein domains were analyzed by searching against the NCBI conserved domain (CD) database (Marchler-Bauer et al., 2003) and the pfam (protein family) database (Bateman et al., 2004). Proteins were classified into COGs (clusters of orthologous groups) at NCBI. The Kyoto Encyclopedia of Genes and Genomes (KEGG) was used for the reconstruction of the metabolic pathways. Enzyme commission (EC) numbers were assigned according to the BRENDA database (Schomburg et al., 2002). The repetitive sequences within the genome were predicted using the REPuter program (Kurtz and Schleiermacher, 1999) hosted at the Bielefeld University Bioinformatics Server. Signal peptides in the predicted AY-WB phytoplasma CDs were predicted using the SignalP program (version 3.0) (Bendtsen et al., 2004). The transmembrane domains were predicted using the TMHMM v2.0 program (Krogh et al., 2001). The presence of plant nuclear localization signals (NLS) was predicted using pSORT (Nakai and Horton, 1999) on the CDs after exclusion of the signal peptides, if present.
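The similarity-based annotation step can be illustrated with a small filter over BLAST results. This is a minimal sketch, assuming the hits are available in the standard 12-column tabular layout (query id, subject id, % identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score); the text does not say how the results were stored. The cutoff of 10^-5 is the value quoted in section 3.4.1, while the identifiers (`cds_001`, `subj_A`, and so on) and function names are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # cutoff expectation value quoted in section 3.4.1

def best_hits(tabular_blast, cutoff=E_VALUE_CUTOFF):
    """Return {query_id: (subject_id, evalue)}, keeping the lowest E-value per query.

    Hits with an E-value above the cutoff are ignored.
    """
    best = {}
    for row in csv.reader(tabular_blast, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > cutoff:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best

def cds_without_hits(all_cds, hits):
    """CDs with no hit below the cutoff, i.e. candidates for 'unique' CDs."""
    return sorted(set(all_cds) - set(hits))

# Hypothetical example data (not taken from the AY-WB annotation).
EXAMPLE = "\n".join([
    "cds_001\tsubj_A\t45.0\t200\t110\t2\t1\t200\t5\t204\t3e-30\t120",
    "cds_001\tsubj_B\t60.0\t210\t84\t0\t1\t210\t1\t210\t1e-50\t180",
    "cds_002\tsubj_C\t30.0\t50\t35\t1\t10\t59\t3\t52\t0.02\t30",
])

hits = best_hits(io.StringIO(EXAMPLE))
unique = cds_without_hits(["cds_001", "cds_002", "cds_003"], hits)
print(hits)
print(unique)
```

In this toy input, `cds_002` has only a weak hit above the cutoff and `cds_003` has no hit at all, so both would be counted as lacking significant similarity.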

3.4 Results

3.4.1 General genome features

The genome of AY-WB phytoplasma consists of a single 706,569-bp circular chromosome (Table 3.1) and 4 plasmids with sizes of 3,972 bp, 4,009 bp, 5,104 bp, and 4,316 bp (Fig. 3.1). The AY-WB phytoplasma chromosome has a low GC content (27%), and 74% of the chromosome consists of coding regions. The AY-WB phytoplasma chromosome contains 673 predicted coding sequences (CDs). Among the 673 CDs, functions can be assigned to 345 CDs (51%), and 216 CDs (32%) are unique to AY-WB phytoplasma. The remaining 112 CDs of AY-WB phytoplasma are significantly similar to hypothetical proteins in the databases. The four plasmids encode 24 CDs, 7 of which match proteins with functional annotations and 16 of which match hypothetical proteins. The AY-WB chromosome contains one ribosomal RNA operon and 31 transfer RNA genes, whereas the OY phytoplasma chromosome contains two ribosomal RNA operons and 32 transfer RNA genes. The AY-WB phytoplasma genome is 154,062 bp smaller than the OY genome and contains 81 fewer CDs (Oshima et al., 2004). With a cutoff expectation (E) value of 10^-5, AY-WB phytoplasma has 72 unique CDs (11%) and OY phytoplasma has 76 unique CDs (10%).
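The figures in this section can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted above; the variable names are illustrative, and the OY chromosome size is derived from the stated size difference rather than read from the OY annotation.

```python
chromosome_bp = 706_569
plasmids_bp = [3_972, 4_009, 5_104, 4_316]
size_difference_bp = 154_062   # OY chromosome minus AY-WB chromosome

chromosome_cds = 673
assigned = 345       # CDs with an assigned function
unique = 216         # CDs unique to AY-WB phytoplasma
hypothetical = 112   # CDs similar to hypothetical proteins

def percent(part, whole):
    """Percentage rounded to the nearest integer, as reported in the text."""
    return round(100 * part / whole)

total_plasmid_bp = sum(plasmids_bp)
oy_chromosome_bp = chromosome_bp + size_difference_bp

# The three categories should account for every chromosomal CD.
assert assigned + unique + hypothetical == chromosome_cds

print("plasmid DNA (bp):", total_plasmid_bp)
print("implied OY chromosome size (bp):", oy_chromosome_bp)
print("assigned (%):", percent(assigned, chromosome_cds))
print("unique (%):", percent(unique, chromosome_cds))
```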

3.4.2 Functional categories of the predicted CDs

A comparison of the COG (clusters of orthologous groups) functional categories of AY-WB phytoplasma CDs with those of other mollicutes revealed

interesting features (Table 3.2). Several mollicute genomes were selected for the comparison. The M. genitalium genome is the smallest genome sequenced so far (Fraser et al., 1995). The M. pneumoniae genome (Himmelreich et al., 1996) is representative of the hominis and pneumoniae groups of mycoplasmas. M. penetrans has a slightly larger genome than other mollicutes (Sasaki et al., 2002) and can invade host cells, a characteristic it shares with phytoplasmas but not with most other mycoplasmas (Razin et al., 1998). The M. mobile genome is the most recently reported genome (Jaffe et al., 2004), and M. mobile has been a model for studying the gliding movement of mycoplasmas (Piper et al., 1987; Uenoyama et al., 2004). All of the above Mycoplasma species are obligate parasites of humans or animals. The AY-WB and OY genomes harbor similar numbers of genes involved in translation and transcription. The phytoplasma genomes contain many copies of RNA polymerase sigma factors, i.e. 14 for OY phytoplasma and 6 for AY-WB phytoplasma. In contrast, the M. genitalium genome contains only one sigma factor (Fraser et al., 1995). Even the genome of M. penetrans, which contains the largest number of genes involved in transcription, has only one copy of the sigma factor gene (Sasaki et al., 2002). Another interesting feature is that phytoplasma genomes harbor significantly more recombination-related genes than other mollicutes. Both phytoplasmas encode Na+-driven drug efflux pumps and ABC multidrug transporters, whereas other mollicutes employ only ABC transporters for drug export. AY-WB phytoplasma contains one component of the type I restriction and modification

system involved in defense mechanisms, whereas OY phytoplasma contains multiple components. Phytoplasma genomes vary in the number of genes involved in cell motility. The OY phytoplasma genome has two hypothetical proteins (PAM458 and PAM696) that are involved in cell motility, whereas AY-WB phytoplasma has none. M. penetrans (Sasaki et al., 2002) and M. mobile (Jaffe et al., 2004) are noted for their motility, and their genomes harbor 12 and 6 cell motility genes, respectively. AY-WB phytoplasma is able to invade and multiply in host cells (Lee et al., 2000). Thus, the phytoplasma invasion process presumably involves as yet unidentified motility genes. Phytoplasmas contain fewer metabolic and transport genes than other mollicutes, suggesting that phytoplasmas are more dependent than mycoplasmas on metabolites produced by their hosts.

3.4.3 Metabolism

The AY-WB and OY phytoplasma genomes contain similar metabolic genes. Both phytoplasmas have most genes of the glycolysis pathway but lack both the hexokinase gene, which is essential for the conversion of glucose to glucose-6-phosphate, and the glucose phosphotransferase system (PTS) genes (Fig. 3.2). The energy production of phytoplasmas seems limited to the glycolysis pathway, because no genes of the pentose phosphate pathway or the tricarboxylic acid (TCA) cycle, and no proton (H+)-driven ATP synthase genes, have been identified in either genome. Phytoplasmas seem to be able to synthesize purine triphosphate via the salvage pathway. However, some important genes in the salvage

pathway are missing, such as those encoding dCMP deaminase (EC 3.5.4.12), CTP synthase (EC 6.3.4.2) and purine-nucleoside phosphorylase (EC 2.4.2.1). Therefore, the phytoplasma nucleotide salvage pathways are incomplete. In both phytoplasmas, the glycolysis pathway is connected to glycolipid metabolism by fructose bisphosphate aldolase (EC 4.1.2.13), which produces glycerone phosphate. The glycolipid pathway leads to the formation of phosphatidylethanolamine (PE), an essential component of the lipid bilayer of plasma membranes. The PE biosynthesis pathway was not identified in M. genitalium (Oshima et al., 2004). Phytoplasmas have limited biosynthesis capacities. The fructose-bisphosphatase (EC 3.1.3.11) that converts fructose-1,6-bisphosphate into fructose-6-phosphate is absent from both phytoplasma genomes. Therefore, phytoplasmas cannot synthesize glucose via the gluconeogenesis pathway. Phytoplasmas can convert pyruvate into acetyl-CoA by the pyruvate dehydrogenase complex. However, the fate of acetyl-CoA is unclear, since no downstream enzymes have been identified. One possible role of acetyl-CoA is to donate an acetyl group to the glycolipid metabolic pathway. AY-WB phytoplasma contains selenocysteine lyase (EC 4.4.1.16), which synthesizes selenocysteine in the selenoamino acid biosynthesis pathway, and glutamine-dependent NAD(+) synthetase (EC 6.3.5.1), which is involved in NAD biosynthesis. Both enzymes need amino acids as substrates, which phytoplasmas cannot synthesize de novo. The phytoplasmas harbor genes involved in folate and tetrahydrofolate biosynthesis (Fig. 3.2), suggesting that phytoplasmas have the ability to provide one-carbon units to important metabolic pathways, such as the nucleic acid synthesis pathway.
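The gap analysis described in this section amounts to comparing the enzymes annotated in a genome against the steps a pathway requires. The following is a minimal sketch, assuming the annotation is available as a set of EC numbers. The `annotated_ecs` set is built by hand from the statements in this chapter (every glycolysis step except hexokinase is present; fructose-bisphosphatase is absent) and is not taken from the actual annotation files. The EC numbers are the standard designations for the glycolytic enzymes, with EC 5.4.2.1 being the designation for phosphoglycerate mutase in use at the time; the function name is hypothetical.

```python
# Glycolysis steps from glucose to pyruvate, in pathway order.
GLYCOLYSIS_STEPS = [
    ("hexokinase", "2.7.1.1"),
    ("glucose-6-phosphate isomerase", "5.3.1.9"),
    ("6-phosphofructokinase", "2.7.1.11"),
    ("fructose-bisphosphate aldolase", "4.1.2.13"),
    ("triose-phosphate isomerase", "5.3.1.1"),
    ("glyceraldehyde-3-phosphate dehydrogenase", "1.2.1.12"),
    ("phosphoglycerate kinase", "2.7.2.3"),
    ("phosphoglycerate mutase", "5.4.2.1"),
    ("enolase", "4.2.1.11"),
    ("pyruvate kinase", "2.7.1.40"),
]

def missing_steps(pathway, annotated_ecs):
    """Names of pathway steps whose EC number is absent from the annotation."""
    return [name for name, ec in pathway if ec not in annotated_ecs]

# ASSUMPTION: hand-built stand-in for the genome annotation, following the
# text's statement that only hexokinase is missing from glycolysis.
annotated_ecs = {ec for name, ec in GLYCOLYSIS_STEPS if name != "hexokinase"}

glycolysis_gaps = missing_steps(GLYCOLYSIS_STEPS, annotated_ecs)
gluconeogenesis_gaps = missing_steps(
    [("fructose-bisphosphatase", "3.1.3.11")], annotated_ecs
)
print(glycolysis_gaps)
print(gluconeogenesis_gaps)
```

The same function can be applied to any pathway expressed as a list of (name, EC number) pairs, which is roughly how a KEGG-style reconstruction flags missing steps.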

3.4.4 Carbohydrate transport and metabolism

Phytoplasmas inhabit sugar-rich environments, such as plant phloem tissues. Therefore, it is unexpected that the genomes of AY-WB and OY phytoplasmas harbor significantly fewer genes in the category of carbohydrate transport and metabolism than their mycoplasma counterparts. Even in the 580-kb genome of M. genitalium, 26 carbohydrate transport and metabolism genes were identified (Fraser et al., 1995). In contrast, only 19 genes are present in the 860-kb OY phytoplasma genome (Oshima et al., 2004) and 14 genes in the 706-kb AY-WB phytoplasma genome. Phytoplasmas may use malate rather than glucose or sucrose as the carbon and energy source. Malate is an intermediate metabolite in the tricarboxylic acid (TCA) cycle and can be used as a carbon and energy source by many bacteria (Krom et al., 2003). Two malate/citrate-sodium symport genes have been identified in both the AY-WB (AYWB052 and AYWB438) and OY (39938772 and 39939206) phytoplasma genomes. Malate can be converted to pyruvate by NAD-specific malic enzyme (AYWB051 and 39939207). The usage of malate as a carbon and energy source is advantageous, because (i) malate is readily available in the cytoplasm of insect and plant cells, where phytoplasmas reside, and (ii) it requires fewer genes for energy production. In addition, AY-WB and OY phytoplasmas may be able to utilize maltose as a carbon source. A complete set of maltose transporter genes was identified in the AY-WB phytoplasma genome, including genes for the maltose transport ATP-binding protein (malK, AYWB672), maltose transport system permease proteins (malF, AYWB671; malG, AYWB670), and maltose-binding protein (malE, AYWB669). Similarly, OY

phytoplasma has two copies of malK, and one copy each of malF (ugpA), malG (ugpE), and malE (ugpB). In contrast to spiroplasmas, phytoplasmas do not contain sucrose transporters or fructose operon components. Further, the sucrose phosphorylase gene, which is important for sucrose degradation, is absent from the AY-WB phytoplasma genome, and is interrupted by a premature stop codon and therefore not functional in the OY phytoplasma genome (Oshima et al., 2004). Therefore, phytoplasmas apparently cannot use sucrose or fructose as carbon sources. Although genes encoding maltose-degrading enzymes were not identified in the phytoplasma genomes, the existence of maltose transport genes suggests that maltose plays a role in phytoplasma metabolism. Phytoplasmas have to survive the high osmotic pressure in plant phloem tissues (Lee et al., 2000). The import of carbohydrates, such as maltose, is probably important for the maintenance of the osmotic balance. Previously, fructose utilization was shown to affect the plant pathogenicity of spiroplasmas (Foissac et al., 1997; Gaurivaud et al., 2000a, 2000b, 2001); for phytoplasmas, however, the utilization of maltose in plants could contribute to plant pathogenicity. Indeed, Arabidopsis plants with a 40-fold elevated level of maltose were stunted and had lower chlorophyll contents (Niittylä et al., 2004).
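The comparison of carbohydrate transport and metabolism genes across genomes of different sizes is easier to read when normalized by genome size. The short calculation below uses the gene counts and rounded genome sizes quoted in this section; the "genes per 100 kb" measure is an illustrative normalization and is not used in the original analysis.

```python
# (carbohydrate transport and metabolism genes, genome size in kb), from the text
genomes = {
    "M. genitalium": (26, 580),
    "OY phytoplasma": (19, 860),
    "AY-WB phytoplasma": (14, 706),
}

def genes_per_100kb(n_genes, genome_kb):
    """Gene count normalized to 100 kb of genome."""
    return 100 * n_genes / genome_kb

density = {
    name: round(genes_per_100kb(n_genes, genome_kb), 2)
    for name, (n_genes, genome_kb) in genomes.items()
}

# Genomes ordered from the highest to the lowest density.
ranked = sorted(density, key=density.get, reverse=True)

for name in ranked:
    print(f"{name}: {density[name]} genes per 100 kb")
```

On this measure both phytoplasmas carry roughly half as many carbohydrate-related genes per unit of genome as M. genitalium, which is the point the paragraph above makes with absolute counts.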

3.4.5 ABC transporter

The genomes of both AY-WB and OY phytoplasmas harbor 15 ABC transporter genes (Table 3.3). As their limited metabolic capacity indicates, phytoplasmas rely on ABC transporters to take up a wide variety of compounds, such as sugars, ions, peptides

and more complex organic molecules, from insect vectors and plant hosts. Other than providing nutrition to phytoplasmas, ABC transporters can also participate in conjugative DNA transfer, chemotaxis, and virulence (Detmers et al., 2001). Indeed, the solute-binding protein Sc76 of an ABC transporter complex is involved in the insect transmission of plant pathogenic spiroplasmas (Boutareaud et al., 2004). No Sc76 homolog was identified in phytoplasmas; however, phytoplasmas harbor many genes encoding solute-binding proteins. The largest group of phytoplasma ABC transporters is involved in transporting dipeptides/oligopeptides and amino acids (Table 3.3). Similar to mycoplasmas, AY-WB and OY phytoplasmas have no amino acid biosynthesis genes, making them rely on the transport of amino acids from host cells. AY-WB phytoplasma seems to have 4 ABC transporters for the import of glutamine, arginine, methionine, and likely other amino acids, whereas OY phytoplasma contains 7 ABC-type amino acid transporters (Table 3.3). However, only 3 of the 7 transporter systems contain an ATPase, which is considered essential for the transport process (Davidson and Chen, 2004). This is consistent with the earlier observation that the OY phytoplasma genome contains multiple redundant copies of transporter genes (Oshima et al., 2004). Phytoplasmas also import dipeptides/oligopeptides from host cells, as multiple dipeptide/oligopeptide transport systems have been identified in both AY-WB and OY phytoplasmas (Table 3.3). The imported dipeptides or oligopeptides are likely digested by peptidases, because a gene for Xaa-His peptidase (EC 3.4.13.3) was identified in AY-WB (AYWB434) and OY (39938776). OY phytoplasma, but not AY-WB, has an Xaa-Pro aminopeptidase (EC 3.4.11.9, pepP, 39938731) as well.

Phytoplasmas have ABC transporters devoted to inorganic ion uptake (Table 3.3). Two cobalt ABC transporter systems are present in both the AY-WB and OY phytoplasma genomes. Both phytoplasmas have functional ABC systems to import cations such as zinc and manganese ions. These cation-transporting ABC transporters, together with the ion-transporting P-type ATPases (see section 3.4.6, P-type ATPase transporters), provide the inorganic ions as nutrients or enzyme ligands. Both AY-WB and OY phytoplasmas have systems devoted to the export of toxic substances. AY-WB phytoplasma contains two copies of multidrug resistance ATP-binding and permease proteins, while only one copy is present in the OY phytoplasma genome (Table 3.3). In contrast, in another plant pathogenic mollicute, Spiroplasma kunkelii, 7 ABC transporters conferring multidrug resistance have been identified (Zhao et al., 2004). Plant pathogenic spiroplasmas and phytoplasmas share similar environmental niches, and therefore likely encounter similar challenges from toxic substances. The relatively small number of multidrug resistance ABC transporters in phytoplasmas could be compensated for by the three copies of norM genes encoding Na+-driven multidrug efflux pumps (AYWB442, AYWB444, and AYWB653 for AY-WB; 39938766, 39938768, and 39939220 for OY) (Fig. 3.2). Spiroplasmas do not have orthologs of Na+-driven multidrug efflux pump genes. AY-WB and OY have ABC transporter genes (phnL) involved in lipoprotein release (AYWB621 and 39938582, respectively). The OY phytoplasma genome also contains a second potential component (nlpA, 39938583). The function of lipoprotein release is

essential because the deletion of the genes in the lipoprotein translocation complex LolCDE is lethal to E. coli (Narita et al., 2002). AY-WB and OY phytoplasmas also contain ABC transporter systems importing spermidine/putrescine. Spermidine and putrescine are polyamines distributed in a wide range of organisms from bacteria to plants and animals (Tabor and Tabor, 1984), and are likely important for phytoplasmas by providing nitrogen nutrients. The phytoplasma import of polyamines during the infection of plants could alter the level of spermidine and putrescine in plants and could contribute to symptom development (Walters, 2003).

3.4.6 P-type ATPase transporters

The AY-WB phytoplasma genome encodes 5 P-type ATPases for the transport of inorganic ions such as calcium, magnesium, lead, mercury, zinc, and cadmium (Table 3.4). These genes are conserved in OY phytoplasma, suggesting that they are important for phytoplasma cation uptake. These inorganic ions are also important for plant growth and development. It was reported that phytoplasma infection could significantly affect plant magnesium uptake (de Oliveira et al., 2002), resulting in the accumulation of these inorganic ions and a slow growth rate of the plants. The accumulated magnesium could serve as a nutrient source for phytoplasmas (de Oliveira et al., 2002). The cation-transporting ATPases may play an important physiological role in phytoplasma cells, because the deletion of a cation P-type ATPase made Synechococcus cells hypersensitive to hyperosmotic media (Kanamaru et al.,

1993). Indeed, the wall-less phytoplasmas are likely osmotically more sensitive than the walled bacteria (Razin, 1978; Razin et al., 1998). The cation P-type ATPase could also serve as the cation efflux apparatus conferring bacterial resistance to heavy metal ions (Nies, 2003).

3.4.7 Repetitive DNA and coding sequences

Both the AY-WB and OY phytoplasma genomes are rich in repetitive sequences (Fig. 3.3). The AY-WB phytoplasma genome harbors more repetitive sequences than the OY phytoplasma genome. The largest AY-WB phytoplasma repeats are the inverted repeats formed by the single ribosomal RNA (rRNA) operon and a homologous region consisting of 10 tRNA genes and two ABC transporter genes (artI and artM) in the reverse direction. In contrast, OY phytoplasma has a similar repeat region consisting of the inverted repeats of two rRNA operons. A unique feature of both the AY-WB and OY phytoplasma genomes is the presence of repetitive regions containing the dnaG (DNA primase) and dnaB (replicative DNA helicase) genes. This organization is not present in other currently sequenced bacterial genomes. The OY phytoplasma genome contains two large 19 kb repetitive regions about 20 kb apart. Both regions contain 20 CDs, including one copy of uvrD (ATP-dependent DNA helicase), 2 copies of hflB (ATP-dependent Zn protease), 2 copies of himA (bacterial nucleoid DNA-binding protein), 2 copies of ssb (single-stranded DNA-binding protein), 2 copies of fliA (DNA-directed RNA polymerase specialized sigma subunit) and some hypothetical proteins. Most genes in the repetitive regions of both phytoplasma

genomes are involved in DNA replication and transcription. They are possibly the result of gene duplication events. The AY-WB phytoplasma genome harbors 4 copies of complete transposase genes and 14 copies of transposase pseudogenes (Fig. 3.4). A pseudogene is a gene copy that does not produce a functional, full-length protein (Vanin, 1985). Seven transposase genes are disrupted into two or three CDs in the AY-WB phytoplasma genome. Another seven CDs have sequence similarity to only part of the complete transposase genes. The identities between the gene sequences over the aligned regions are mostly higher than 70%, and some are over 95%. The AY-WB phytoplasma genome contains functional ATP-dependent DNA helicase genes (uvrD) and multiple copies of uvrD pseudogenes (Fig. 3.5). The uvrD gene is involved in bacterial DNA repair (Crowley and Hanawalt, 2001). The OY phytoplasma genome harbors seven copies of uvrD genes, while AY-WB phytoplasma harbors only two copies (AYWB085 and AYWB115). Other than the transposase and ATP-dependent DNA helicase genes, the AY-WB phytoplasma genome contains multiple copies of genes encoding DNA primase, replicative DNA helicase, ATP-dependent Zn protease, RNA polymerase sigma factors, thymidylate kinase, single-stranded DNA binding protein, bacterial nucleoid DNA binding protein, site-specific DNA methylase, replication initiator protein (on plasmids), and some hypothetical proteins (Table 3.5). As with the transposase genes, some of the gene copies are pseudogenes. For instance, the pseudogenes AYWB208 and AYWB209 are 149 bp apart and highly similar to the N-terminal (83%) and C-terminal (95%) regions of the

functional site-specific DNA methylase gene (AYWB382). In addition, the AY-WB phytoplasma genome contains a large number of repetitive sequences in non-coding regions (data not shown). These tandem and inverted repeats are potential recombination sites, and recombination at these sites can lead to the loss of gene function, gene duplication or gene deletion.
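The idea of scanning a genome for direct and inverted repeats can be sketched with a simple k-mer index. This is not the algorithm used by REPuter, which relies on more efficient index structures and also reports degenerate repeats; the code below only finds exact repeats of a fixed length k, and the toy sequence is made up for illustration. All function names are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_repeats(seq, k):
    """Return (direct, inverted) exact repeats of length k.

    direct:   k-mer -> start positions (0-based), for k-mers seen at least twice.
    inverted: k-mer -> (positions of the k-mer, positions of its reverse
              complement), for k-mers whose reverse complement also occurs.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    direct = {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}

    inverted = {}
    for kmer, pos in positions.items():
        rc = reverse_complement(kmer)
        if rc != kmer and rc in positions:
            inverted[kmer] = (pos, positions[rc])
    return direct, inverted

# Toy sequence: "ATGCCG" occurs twice, and its reverse complement "CGGCAT" once.
toy = "ATGCCG" + "TTT" + "ATGCCG" + "AAA" + "CGGCAT"
direct, inverted = find_repeats(toy, k=6)
print(direct)
print(inverted["ATGCCG"])
```

For a real chromosome of about 700 kb, a larger k (tens of bases) and a tolerance for mismatches would be needed to obtain biologically meaningful repeats.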

3.4.8 Virulence factors

The AY-WB phytoplasma genome harbors some known virulence factors. Two genes encoding hemolysin or derivatives (hylC, AYWB563 and tlyC, AYWB568) have been identified in the genome. Hemolysin is an extracellular pore-forming toxin that has been identified in both Gram-positive bacterial pathogens, such as Staphylococcus aureus (Menestrina et al., 2003), and Gram-negative bacterial pathogens, such as uropathogenic Escherichia coli (UPEC) (Emody et al., 2003). The hemolysin gene products are possibly involved in the invasion of host cells by phytoplasmas. AY-WB phytoplasma is an intracellular pathogen whose secreted or membrane-bound proteins are likely involved in direct or indirect interactions with host cell components. The AY-WB phytoplasma has a Sec-dependent (type II) protein secretion system consisting of secA (AYWB307), secE (AYWB470), and secY (AYWB504). No genes encoding components of the type III or IV secretion systems have been identified in the AY-WB phytoplasma genome. Fifty phytoplasma proteins are potentially secreted by the Sec-dependent pathway, because they contain signal peptides as predicted with the SignalP program (Bendtsen et al., 2004) (Table 3.6). As expected, 5 solute-binding protein components associated with the 5 ABC transporter systems also

have signal peptides. Among the other 45 potentially secreted proteins, 42 cannot be assigned functions based on sequence similarity searches. Of these 42 proteins, 25 are common to the AY-WB and OY phytoplasma genomes and the other 17 are unique to AY-WB phytoplasma. The subcellular localization of these proteins in plant cells has implications for their functions, and therefore other web-based software was used to look for subcellular localization domains. Seven proteins potentially target plant cell nuclei, based on the presence of a nuclear localization signal (NLS), suggesting that these proteins are involved in the regulation of the replication and transcription of plant genes. Further, three proteins are predicted to target microbodies and three to target chloroplasts. The remaining 29 proteins are predicted to localize in the plant cell cytoplasm.
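The kind of motif search that underlies NLS prediction can be illustrated with a simple pattern match. This sketch is not pSORT's method, which applies its own set of rules; it only looks for the classical monopartite NLS consensus K-(K/R)-X-(K/R) in a protein after removal of a signal peptide of known length. The test protein and its signal peptide length are hypothetical; the embedded motif PKKKRKV is the well-known NLS of the SV40 large T antigen, used here as a positive control.

```python
import re

# Classical monopartite NLS consensus K-(K/R)-X-(K/R); the lookahead lets
# overlapping matches be reported.
MONOPARTITE_NLS = re.compile(r"(?=(K[KR].[KR]))")

def mature_protein(seq, signal_peptide_len):
    """Protein sequence after removal of the N-terminal signal peptide."""
    return seq[signal_peptide_len:]

def find_nls(seq):
    """List of (start position, matched motif) for every consensus match."""
    return [(m.start(), m.group(1)) for m in MONOPARTITE_NLS.finditer(seq)]

# Hypothetical secreted protein: a 15-residue signal peptide followed by a
# mature region containing the SV40 large T antigen NLS (PKKKRKV).
SIGNAL_PEPTIDE_LEN = 15  # ASSUMPTION: would come from a predictor such as SignalP
protein = "MKKLLLFSLLAIFLA" + "SDPKKKRKVEDG"

matches = find_nls(mature_protein(protein, SIGNAL_PEPTIDE_LEN))
print(matches)
```

Short basic motifs of this kind occur by chance in many proteins, so a match is only a candidate signal and would need supporting evidence, which is one reason dedicated predictors combine several features.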

3.5 Discussion

AY-WB phytoplasma is an intracellular pathogen that invades and replicates in cells of insect vectors and plant hosts. Within this relatively isolated environment, reductive evolution is thought to occur through intrachromosomal recombination events at repeated sequences (Achaz et al., 2002). The presence of repetitive sequences in the AY-WB phytoplasma genome suggests that phytoplasmas are prone to frequent recombination events. The genome contains 4 functional transposase genes and 14 transposase pseudogenes. Transposases are capable of moving DNA segments to new locations within and between genomes (Rice and Baker, 2001). The transposase pseudogenes are likely the remnants of frequent transposition events. Furthermore, the AY-WB phytoplasma genome harbors many pseudogenes that are


portions of other functional genes. Therefore, the AY-WB phytoplasma genome has likely undergone frequent recombination and transposition events. However, pseudogenes might gain new functions during evolution, as recently demonstrated (Hirotsune et al., 2003). Also present in the AY-WB phytoplasma genome are several copies of DNA replication protein-encoding genes that have no homologs in other bacterial genomes. The functional copies of these genes are present on plasmids, suggesting that phytoplasmas obtained these genes from other organisms. The non-functional copies (pseudogenes) of these genes are in the chromosome and are likely the remnants of recombination events between the plasmids and the chromosome. Bacteria usually use a type I restriction and modification system to defend against foreign DNA (Murray, 2000). However, AY-WB phytoplasma has limited modification capacity and no restriction capacity. Furthermore, AY-WB phytoplasma contains fewer copies of the uvrD gene, which is involved in DNA repair (Crowley and Hanawalt, 2001), than OY phytoplasma does. Together, these observations suggest that phytoplasma genomes are prone to active recombination or transposition events.

Phytoplasmas have not been successfully cultured in cell-free media, which has hampered research on them. The complete genome sequences of AY-WB and OY phytoplasmas may provide some clues as to why culturing attempts have failed. Based on our findings, the carbon source could be the first limiting factor. Bacteria can transport extracellular glucose via the glucose phosphotransferase (PTS) system, and such a system has been identified in many bacterial


species (Postma et al., 1993), including some mollicute species such as M. genitalium (Fraser et al., 1995). However, phytoplasmas do not have a PTS system, suggesting that they may not be able to take up extracellular glucose. Phytoplasmas have all the other enzymes of the glycolysis pathway, so culture media supplemented with glucose-6-phosphate rather than glucose may be a solution. The second limiting factor could be the presence of inhibitory substances in the culture media (Razin et al., 1998). Phytoplasmas have fewer multidrug resistance ABC transporters than their cultivable spiroplasma counterparts, suggesting that phytoplasmas may have weaker detoxification abilities than spiroplasmas. A possible third limiting factor is that phytoplasmas may need anaerobic conditions to grow. Phytoplasmas were shown to grow better in nonphotosynthetic tissues, which contain less oxygen, than in oxygen-rich photosynthetic tissues (Sears et al., 1997). The AY-WB phytoplasma genome contains no genes for dealing with reactive oxygen species.

Phytoplasmas are insect-transmitted plant pathogens able to induce physiological changes in plants. One aim of the genome-sequencing project is to understand the pathogenicity mechanisms of phytoplasmas. Most of the effector proteins identified so far are from Gram-negative bacterial plant pathogens and are introduced into plant cells by bacterial type III secretion systems. These effectors can induce defense-related hypersensitive responses in resistant plants containing the corresponding resistance genes, and disease in susceptible plants. For instance, the AvrBs2 protein of Xanthomonas campestris pv. vesicatoria (Kearney and Staskawicz, 1990) can induce disease when introduced into susceptible plant hosts. The AY-WB phytoplasma genome encodes hemolysin toxins, which


can form pores in plasma membranes (Menestrina et al., 2003). Phytoplasma hemolysin gene products are possibly involved in host cell invasion. No other known pathogenicity genes have been identified in the AY-WB phytoplasma genome. However, potential effector proteins have been predicted in the genome based on the notion that secreted proteins may directly or indirectly interact with host cell components. Among the potential effector proteins, 7 were predicted to target plant cell nuclei, suggesting that they could affect the replication or transcription of plant genes. The effects of these genes on plants could be assessed using potato virus X-based transient expression systems (Jones et al., 1999) and other techniques. The functional characterization of these potential effectors is currently underway. Verification of their functions by mutation and complementation will only be possible after phytoplasmas are successfully cultured.

The complete genome of AY-WB phytoplasma confirms the reductive evolution revealed by the OY phytoplasma genome and indicates that the AY-WB phytoplasma genome has undergone frequent recombination events. Most importantly, it provides a genetic basis for research on phytoplasma pathogenicity. Further characterization of phytoplasma genomes is expected to help answer questions about the culturing and pathogenicity of phytoplasmas.

3.6 Acknowledgments

The authors thank Melanie L. Ivy and Sophien Kamoun in the Department of Plant Pathology at The Ohio State University - OARDC for technical support and


constructive discussions. The authors also thank Alla Lapidus, Nikos Kyrpides, and Agnes Radek for their assistance in the AY-WB phytoplasma genome-sequencing project. This project was supported by USDA/NSF Microbial Genome Sequencing Program, Grant Number 2002-35600-12752.

3.7 References

Achaz, G., Rocha, E.P.C., Netter, P. and Coissac, E. (2002) Origin and fate of repeats in bacteria. Nucleic Acids Res. 30, 2987-2994.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402.
Bai, X., Fazzolari, T. and Hogenhout, S.A. (2004a) Identification and characterization of traE genes of Spiroplasma kunkelii. Gene 336, 81-91.
Bai, X., Zhang, J., Holford, I.R. and Hogenhout, S.A. (2004b) Comparative genomics identifies genes shared by distantly related insect-transmitted plant pathogenic mollicutes. FEMS Microbiol. Lett. 235, 249-258.
Bateman, A., Coin, L., Durbin, R., Finn, R.D., Hollich, V., Griffiths-Jones, S., Khana, A., Marshall, M., Moxon, S., Sonnhammer, E.L.L., Holme, D.J.S., Yeats, C. and Eddy, S.R. (2004) The Pfam protein families database. Nucleic Acids Res. 32, D138-D141.
Badger, J.H. and Olsen, G.J. (1999) CRITICA: coding region identification tool invoking comparative analysis. Mol. Biol. Evol. 16, 512-524.
Bendtsen, J.D., Nielsen, H., von Heijne, G. and Brunak, S. (2004) Improved prediction of signal peptides: SignalP 3.0. J. Mol. Biol. 340, 783-795.
Boutareaud, A., Danet, J.L., Garnier, M. and Saillard, C. (2004) Disruption of a gene predicted to encode a solute binding protein of an ABC transporter reduces transmission of Spiroplasma citri by the leafhopper Circulifer haematoceps. Appl. Environ. Microbiol. 70, 3960-3967.

Chambaud, I., Heilig, R., Ferris, S., Barbe, V., Samson, D., Galisson, F., Moszer, I., Dybvig, K., Wroblewski, H., Viari, A., Rocha, E.P.C. and Blanchard, A. (2001) The complete genome sequence of the murine respiratory pathogen Mycoplasma pulmonis. Nucleic Acids Res. 29, 2145-2153.
Crowley, D.J. and Hanawalt, P.C. (2001) The SOS-dependent upregulation of uvrD is not required for efficient nucleotide excision repair of ultraviolet light induced DNA photoproducts in Escherichia coli. Mutat. Res. 485, 319-329.
Davidson, A.L. and Chen, J. (2004) ATP-binding cassette transporters in bacteria. Annu. Rev. Biochem. 73, 241-268.
de Oliveira, E., Magalhães, P.C., Gomide, R.L., Vasconcelos, C.A., Souza, I.R.P., Oliveira, C.M., Cruz, I. and Schaffert, R.E. (2002) Growth and nutrition of mollicute-infected maize. Plant Dis. 86, 945-949.
Detmers, F.J.M., Lanfermeijer, F.C. and Poolman, B. (2001) Peptides and ATP binding cassette peptide transporters. Res. Microbiol. 152, 245-258.
Emody, L., Kerenyi, M. and Nagy, G. (2003) Virulence factors of uropathogenic Escherichia coli. Int. J. Antimicrob. Agents S2, 29-33.
Foissac, X., Danet, J.L., Saillard, C., Gaurivaud, P., Laigret, F., Pare, C. and Bové, J.M. (1997) Mutagenesis by insertion of Tn4001 into the genome of Spiroplasma citri: Characterization of mutants affected in plant pathogenicity and transmission to the plant by the leafhopper vector Circulifer haematoceps. Mol. Plant-Microbe Interact. 10, 454-461.
Fraser, C.M., Gocayne, J.D., White, O., Adams, M.D., Clayton, R.A., Fleischmann, R.D., Bult, C.J., Kerlavage, A.R., Sutton, G.G., Kelley, J.M., Fritchman, J.L., Weidman, J.F., Small, K.V., Sandusky, M., Fuhrmann, J.L., Nguyen, D.T., Utterback, T., Saudek, D.M., Phillips, C.A., Merrick, J.M., Tomb, J., Dougherty, B.A., Bott, K.F., Hu, P.C., Lucier, T.S., Peterson, S.N., Smith, H.O. and Venter, J.C. (1995) The minimal gene complement of Mycoplasma genitalium. Science 270, 397-403.
Gaurivaud, P., Danet, J.L., Laigret, F., Garnier, M. and Bové, J.M. (2000a) Fructose utilization and phytopathogenicity of Spiroplasma citri. Mol. Plant-Microbe Interact. 13, 1145-1155.
Gaurivaud, P., Laigret, F., Garnier, M. and Bové, J.M. (2000b) Fructose utilization and pathogenicity of Spiroplasma citri: characterization of the fructose operon. Gene 252, 61-69.

Gaurivaud, P., Laigret, F., Garnier, M. and Bové, J.M. (2001) Characterization of FruR as a putative activator of the fructose operon of Spiroplasma citri. FEMS Microbiol. Lett. 198, 73-78.
Glass, J.I., Lefkowitz, E.J., Glass, J.S., Heiner, C.R., Chen, E.Y. and Cassell, G.H. (2000) The complete sequence of the mucosal pathogen Ureaplasma urealyticum. Nature 407, 757-762.
Himmelreich, R., Hilbert, H., Plagens, H., Pirkl, E., Li, B.C. and Herrmann, R. (1996) Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res. 24, 4420-4449.
Hirotsune, S., Yoshida, N., Chen, A., Garrett, L., Sugiyama, F., Takahashi, S., Yagami, K., Wynshaw-Boris, A. and Yoshiki, A. (2003) An expressed pseudogene regulates the messenger-RNA stability of its homologous coding gene. Nature 423, 91-96.
Jaffe, J.D., Stange-Thomann, N., Smith, C., DeCaprio, D., Fisher, S., Butler, J., Calvo, S., Elkins, T., FitzGerald, M.G., Hafez, N., Kodira, C.D., Major, J., Wang, S., Wilkinson, J., Nicol, R., Nusbaum, C., Birren, B., Berg, H.C. and Church, G.M. (2004) The Complete Genome and Proteome of Mycoplasma mobile. Genome Res. 14, 1447-1461.
Jones, L., Hamilton, A.J., Voinnet, O., Thomas, C.L., Maule, A.J. and Baulcombe, D.C. (1999) RNA-DNA interactions and DNA methylation in post-transcriptional gene silencing. Plant Cell 11, 2291-2301.
Kanamaru, K., Kashiwagi, S. and Mizuno, T. (1993) The cyanobacterium, Synechococcus sp. PCC7942, possesses two distinct genes encoding cation-transporting P-type ATPase. FEBS Lett. 330, 99-104.
Kearney, B. and Staskawicz, B.J. (1990) Widespread distribution and fitness contribution of Xanthomonas campestris avirulence gene avrBs2. Nature 346, 385-386.
Krogh, A., Larsson, B., von Heijne, G. and Sonnhammer, E.L. (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567-580.
Krom, B.P., Warner, J.B., Konings, W.N. and Lolkema, J.S. (2003) Transporters involved in uptake of di- and tricarboxylates in Bacillus subtilis. Antonie Van Leeuwenhoek 84, 69-80.


Kurtz, S. and Schleiermacher, C. (1999) REPuter: fast computation of maximal repeats in complete genomes. Bioinformatics 15, 426-427.
Lee, I.M., Davis, R.E. and Gundersen-Rindal, D.E. (2000) Phytoplasma: phytopathogenic mollicutes. Annu. Rev. Microbiol. 54, 221-255.
Lee, I.-M., Gundersen-Rindal, D.E., Davis, R.E., Bottner, K.D., Marcone, C. and Seemüller, E. (2004) 'Candidatus Phytoplasma asteris', a novel phytoplasma taxon associated with aster yellows and related diseases. Int. J. Syst. Evol. Microbiol. 54, 1037-1048.
Marchler-Bauer, A., Anderson, J.B., DeWeese-Scott, C., Fedorova, N.D., Geer, L.Y., He, S., Hurwitz, D.I., Jackson, J.D., Jacobs, A.R., Lanczycki, C.J., Liebert, C.A., Liu, C., Madej, T., Marchler, G.H., Mazumder, R., Nikolskaya, A.N., Panchenko, A.R., Rao, B.S., Shoemaker, B.A., Simonyan, V., Song, J.S., Thiessen, P.A., Vasudevan, S., Wang, Y., Yamashita, R.A., Yin, J.J. and Bryant, S.H. (2003) CDD: a curated Entrez database of conserved domain alignments. Nucleic Acids Res. 31, 383-387.
Menestrina, G., Dalla Serra, M., Comai, M., Coraiola, M., Viero, G., Werner, S., Colin, D.A., Monteil, H. and Prevost, G. (2003) Ion channels and bacterial infection: the case of beta-barrel pore-forming protein toxin of Staphylococcus aureus. FEBS Lett. 552, 54-60.
Minion, F.C., Lefkowitz, E.J., Madsen, M.L., Cleary, B., Swartzell, S. and Mahairas, G.G. (2004) The genome sequence of Mycoplasma hyopneumoniae strain 232, the agent of swine mycoplasmosis. J. Bacteriol. In press.
Murray, N.E. (2000) Type I restriction systems: sophisticated molecular machines (a legacy of Bertani and Weigle). Microbiol. Mol. Biol. Rev. 64, 412-434.
Mushegian, A.R. and Koonin, E.V. (1996) A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Proc. Natl. Acad. Sci. USA 93, 10268-10273.
Nakai, K. and Horton, P. (1999) pSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem. Sci. 24, 34-36.
Narita, S., Tanaka, K., Matsuyama, S. and Tokuda, H. (2002) Disruption of lolCDE, encoding an ATP-Binding Cassette transporter, is lethal for Escherichia coli and prevents release of lipoprotein from the inner membrane. J. Bacteriol. 184, 1417-1422.
Nies, D.H. (2003) Efflux-mediated heavy metal resistance in prokaryotes. FEMS Microbiol. Rev. 27, 313-339.

Niittylä, T., Messerli, G., Trevisan, M., Chen, J., Smith, A.M. and Zeeman, S.C. (2004) A previously unknown maltose transporter essential for starch degradation in leaves. Science 303, 87-89.
Oshima, K., Kakizawa, S., Nishigawa, H., Jung, H.Y., Wei, W., Suzuki, S., Arashida, R., Nakata, D., Miyata, S., Ugaki, M. and Namba, S. (2004) Reductive evolution suggested from the complete genome sequence of a plant-pathogenic phytoplasma. Nat. Genet. 36, 27-29.
Papazisi, L., Gorton, T.S., Kutish, G., Markham, P.F., Browning, G.F., Nguyen, D.K., Swartzell, S., Madan, A., Mahairas, G. and Geary, S.J. (2003) The complete genome sequence of the avian pathogen Mycoplasma gallisepticum strain R(low). Microbiology 149, 2307-2316.
Piper, B., Rosengarten, R. and Kirchhoff, H. (1987) The influence of various substances on the gliding motility of Mycoplasma mobile 163K. J. Gen. Microbiol. 133, 3193-3198.
Postma, P.W., Lengeler, J.W. and Jacobson, G.R. (1993) Phosphoenolpyruvate:carbohydrate phosphotransferase systems of bacteria. Microbiol. Rev. 57, 543-594.
Razin, S. (1978) The mycoplasmas. Microbiol. Rev. 42, 414-470.
Razin, S., Yogev, D. and Naot, Y. (1998) Molecular biology and pathogenicity of mycoplasmas. Microbiol. Mol. Biol. Rev. 62, 1094-1156.
Rice, P.A. and Baker, T.A. (2001) Comparative architecture of transposase and integrase complexes. Nat. Struct. Biol. 8, 302-307.
Riley, M. (1993) Functions of the gene products of Escherichia coli. Microbiol. Rev. 57, 862-952.
Sasaki, Y., Ishikawa, J., Yamashita, A., Oshima, K., Kenri, T., Furuya, K., Yoshino, C., Horino, A., Shiba, T., Sasaki, T. and Hattori, M. (2002) The complete genomic sequence of Mycoplasma penetrans, an intracellular bacterial pathogen in humans. Nucleic Acids Res. 30, 5293-5300.
Schomburg, I., Chang, A., Ebeling, C., Gremse, M., Heldt, C., Huhn, G. and Schomburg, D. (2004) BRENDA, the enzyme database: updates and major new developments. Nucleic Acids Res. 32, D431-D433.


Sears, B.B., Klomparens, K.L., Wood, J.I. and Schewe, G. (1997) Effect of altered levels of oxygen and carbon dioxide on phytoplasma abundance in Oenothera leaftip cultures. Physiol. Mol. Plant Pathol. 50, 275-287.
Tabor, C.W. and Tabor, H. (1984) Polyamines. Annu. Rev. Biochem. 53, 749-790.
Uenoyama, A., Kusumoto, A. and Miyata, M. (2004) Identification of a 349-kilodalton protein (Gli349) responsible for cytadherence and glass binding during gliding of Mycoplasma mobile. J. Bacteriol. 186, 1537-1545.
Vanin, E.F. (1985) Processed pseudogene: Characteristics and evolution. Annu. Rev. Genet. 19, 253-272.
Walters, D.R. (2003) Polyamine and plant disease. Phytochemistry 64, 97-107.
Weisburg, W.G., Tully, J.G., Rose, D.L., Petzel, J.P., Oyaizu, H., Yang, D., Mandelco, L., Sechrest, J., Lawrence, T.G., van Etten, J., Maniloff, J. and Woese, C.R. (1989) A phylogenetic analysis of the mycoplasmas: basis for their classification. J. Bacteriol. 171, 6455-6467.
Westberg, J., Persson, A., Holmberg, A., Goesmann, A., Lundeberg, J., Johansson, K.E., Pettersson, B. and Uhlen, M. (2004) The genome sequence of Mycoplasma mycoides subsp. mycoides SC type strain PG1T, the causative agent of contagious bovine pleuropneumonia (CBPP). Genome Res. 14, 221-227.
Woese, C.R. (1987) Bacterial evolution. Microbiol. Rev. 51, 221-271.
Zhang, J., Hogenhout, S.A., Nault, L.R., Hoy, C.W. and Miller, S.A. (2004) Molecular and symptom analyses of phytoplasma strains from lettuce reveal a diverse population. Phytopathology 94, 842-849.
Zhao, Y., Wang, H., Hammond, R.W., Jomantiene, R., Liu, Q., Lin, S., Roe, B.A. and Davis, R.E. (2004) Predicted ATP-binding cassette systems in the phytopathogenic mollicute Spiroplasma kunkelii. Mol. Gen. Genomics 271, 325-338.


                                                      AY-WB     OY-M
Length (bp)                                           706,569   860,631
G + C ratio                                           27%       28%
Putative protein coding sequences (CDs)               673       754
Coding region (%)                                     74%       73%
Average CDS length (bp)                               779       785
CDs with functional assignment a                      345       446
CDs with matches to conserved hypothetical proteins   112       51
CDs without significant database match                216       257
Ribosomal RNA operons                                 1         2 b
tRNAs                                                 31 c      32 c

Table 3.1 General features of the chromosomes of the AY-WB phytoplasma and OY phytoplasma genomes

a Functional assignment was performed using the classification scheme from Riley (1993).
b The two rRNA operons are adjacent to each other and have an isoleucine tRNA gene in between.
c This set of tRNAs corresponds to all amino acids.


Organism                                    Ca. Phytoplasma  Ca. Phytoplasma  Mycoplasma       Mycoplasma       Mycoplasma       Mycoplasma
                                            asteris          asteris          penetrans        pneumoniae       genitalium       mobile
Strain                                      AY-WB            OY-M             HF-2             M129             G-37             163K
Genome size (bp)                            706 569          860 631          1 358 633        816 394          580 074          777 079
G + C content (mol %)                       27               28               25.7             40               32               24.9
Total number of predicted CDs               673              754              1037             689              484              633

Functional categories in COGs a                n     %          n     %          n     %          n     %          n     %          n     %
(-) not in COGs                              259  38.5        178  23.6        289  27.9        258  37.4         99  20.5        114  18.0
J, Translation                               103  15.3        104  13.8        109  10.5        102  14.8        101  20.9        106  16.7
K, Transcription                              21   3.1         31   4.1         33   3.2         14   2.0         14   2.9         18   2.8
L, Replication, recombination and repair      92  13.7        147  19.5        101   9.7         43   6.2         40   8.3         64  10.1
D, Cell cycle control, mitosis and meiosis     7   1.0         18   2.4         29   2.8          5   0.7          5   1.0          7   1.1
V, Defense mechanisms                          7   1.0          8   1.1         36   3.5         22   3.2          8   1.7         12   1.9
T, Signal transduction mechanisms              2   0.3          2   0.3          5   0.5          3   0.4          3   0.6          7   1.1
M, Cell wall/membrane biogenesis               6   0.9         12   1.6         17   1.6         12   1.7         12   2.5         17   2.7
N, Cell motility                               0   0.0          2   0.3         12   1.2          0   0.0          0   0.0          6   0.9
U, Intracellular trafficking and secretion     6   0.9          7   0.9         33   3.2          7   1.0          6   1.2         15   2.4
O, Posttranslational modification,            25   3.7         51   6.8         35   3.4         20   2.9         20   4.1         22   3.5
   protein turnover, chaperones
C, Energy production and conversion           12   1.8         16   2.1         32   3.1         20   2.9         20   4.1         28   4.4
G, Carbohydrate transport and metabolism      14   2.1         19   2.5         58   5.6         37   5.4         26   5.4         46   7.3
E, Amino acid transport and metabolism        28   4.2         40   5.3         31   3.0         24   3.5         15   3.1         21   3.3
F, Nucleotide transport and metabolism        19   2.8         24   3.2         39   3.8         21   3.0         21   4.3         21   3.3
H, Coenzyme transport and metabolism           5   0.7          9   1.2         14   1.4         14   2.0         14   2.9         16   2.5
I, Lipid transport and metabolism              8   1.2          9   1.2         16   1.5          9   1.3          9   1.9          9   1.4
P, Inorganic ion transport and metabolism     15   2.2         17   2.3         24   2.3         17   2.5         17   3.5         15   2.4
Q, Secondary metabolites biosynthesis,         1   0.1          1   0.1          5   0.5          0   0.0          0   0.0          1   0.2
   transport and catabolism
R, General function prediction only           27   4.0         36   4.8         90   8.7         45   6.5         40   8.3         51   8.1
S, Function unknown                           16   2.4         23   3.1         29   2.8         16   2.3         14   2.9         37   5.8

Table 3.2 Comparison of the COG categories of the proteins in the AY-WB phytoplasma genome with those in other mollicute genomes

a COG categories A (RNA processing and modification), B (Chromatin structure and dynamics), Y (Nuclear structure), Z (Cytoskeleton), and W (Extracellular structure) do not apply to mollicutes.
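
The percentage columns in Table 3.2 appear to be each category count divided by the total number of predicted CDs for that genome. A minimal Python sketch of that calculation follows, using values from the table. The function name cog_percent is hypothetical, and rounding to one decimal place is an assumption made to match the table's precision.

```python
def cog_percent(count, total_cds):
    """Return a COG category count as a percentage of all predicted CDs."""
    return round(100 * count / total_cds, 1)

# Counts transcribed from Table 3.2 for the two phytoplasma genomes.
total_cds = {"AY-WB": 673, "OY-M": 754}
not_in_cogs = {"AY-WB": 259, "OY-M": 178}
replication = {"AY-WB": 92, "OY-M": 147}  # category L

for strain in total_cds:
    print(
        strain,
        cog_percent(not_in_cogs[strain], total_cds[strain]),
        cog_percent(replication[strain], total_cds[strain]),
    )
```

For these two genomes the computed values agree with the tabulated ones, which supports reading the second number in each column as a share of the total CDs.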


ATP-binding protein

AY-WB Membrane protein

amino acid

glnQ (AYWB636)

AYWB637 (AYWB637)

D-methionine amino acid (Arginine)

metN (AYWB591)

AYWB589 (AYWB589) glnP (AYWB318)

nlpA (AYWB590)

abc (39938618)

amino acid (Glutamine)

artP (AYWB267)

artQ (AYWB268), artM (AYWB265)

artI (AYWB266)

glnQ (39938974)

dppA (AYWB531)

dppD (39938678)

Substrate

Solute-binding protein

ATP-binding protein

OY-M Membrane protein

Solute-binding protein

Amino acid uptake

amino acid amino acid

glnQ (39938565),

artM (AYWB125)

amino acid

artM (39938563), artM (39938564) PAM134 (39938620) artM (39938942), artI (39938943), artM (39938950?) artM (39938973), artI (39938975), artM (39938976) artM (39939074?) artM (39938980?), artM (39938981?) artM (39939125?), mdoB (39939127?)

nlpA (39938619)

Dipeptide/oligopeptide uptake dipeptide or oligopeptide


dppF (AYWB529), dppD (AYWB530)

dppB (AYWB532), dppC (AYWB533)

oligopeptide

dppC (39938675), dppB (39938676) dppB (39938508), PAM023 (39938509)

oppA (39938677)

malK (39939238)

ugpE (39939236), ugpA (39939237)

ugpB (39939235)

cbiO (39938506) cbiO (39938665)

PAM19 (39938505) cibQ (39938666)

znuC (39938579)

znuB (39938580)

znuA (39938578)

potB (39939146), potC (39939147)

potD (39939148)

dppD (39938511), oppF (39938512)

PAM024 (39938510)

Sugar uptake sugar

malK (AYWB672)

malG (AYWB670), malF (AYWB671)

cbiO (AYWB014) cbiO (AYWB542), cbiO (AYWB543) mntA (AYWB625)

cbiQ (AYWB015) cbiQ (AYWB541)

malE (AYWB669)

Inorganic ion uptake cobalt ion cobalt ion Mn/Zn ion

mntB (AYWB624), mntB (AYWB623)

znuA (AYWB626)

Multidrug resistance multidrug multidrug

mdlB (AYWB028) mdlB (AYWB029)

mdlB (39938545)

Spermidine/putrescine uptake spermidine or putrescine

potA (AYWB095)

potB (AYWB094), potC (AYWB093)

potD (AYWB092)

potA (39939145)

Uncharacterized possible lipoprotein unknown

phnL (AYWB621) phnL (AYWB135)

Table 3.3 Summary of ABC transporter genes in AY-WB and OY phytoplasma genomes

phnL (39938582) phnL (39939085)

nlpA (39938583)

AY-WB                                                   OY
Gene (Length, CDs)       Possible substrate             Gene (Length, Acc. no.)    Possible substrate
mgtA (920 aa, AYWB018)   cation ion                     mgtA (920 aa, 39938516)    sodium/potassium ion
mgtA (817 aa, AYWB472)   cation ion                     mgtA (918 aa, 39938672)    calcium ion
mgtA (952 aa, AYWB535)   cation ion                     mgtA (1056 aa, 39938738)   cation
mgtA (892 aa, AYWB243)   magnesium ion                  mgtA (892 aa, 39939071)    magnesium ion
zntA (666 aa, AYWB652)   lead, cadmium, zinc, mercury   zntA (666 aa, 39939219)    cadmium ion

Table 3.4 Summary of AY-WB phytoplasma P-type ATPases


CDs b

Length Alignment to (aa) prototype c

d d Positives Gaps

DNA primase e AYWB179 AYWB288 AYWB618 AYWB220 AYWB047 AYWB086 AYWB087 AYWB048 AYWB172 AYWB079

1-441/1-441 1-440/1-440 1-369/73-441 1-441/1-441 1-156/1-156 11-174/84-247 7-119/247-359 1-98/301-398 1-68/92-159 1-37/1-37

100% 97% 100% 86% 99% 96% 99% 100% 96% 100%

none none none none none none none none none none

124 104 104 104 104 104 111 104 111 63 51

100% 99% 99% 99% 93% 84% 84% 83% 78% 86% 77%

none none none none none none 0% none 0% none 1%

100% 95% 97% 92% 91% 84% 94% 97% 82%

none none none none none 1% none none none

1-337/1-337 40-157/1-118 1-203/135-337

100% 96% 93%

none none none

1-210 / 1-210 1-210 / 1-210 1-209 / 1-210 1-209 / 1-209 1-54 / 33-86 1-62 / 70-131

100% 95% 83% 80% 86% 73%

none none 0% none none none

100% 100% 93%

none none none

Hypothetical protein

100% 96% 96%

none none none

Conserved hypothetical protein

1-124 / 1-124 1-103 / 1-103 1-103 / 1-103 1-103 / 1-103 1-103 / 1-103 1-102 / 1-102 1-104 / 1-104 1-103 / 1-103 1-103 / 1-103 2-62 / 43-103 1-51 / 1-50

768 151 173 120 89 93 56 51 69

1-768 / 1-768 1-151 / 618-768 30-156 / 635-761 1-96 / 488-583 1-87 / 394-480 2-69 / 695-761 1-56 / 1-56 1-44 / 259-302 1-59 / 373-431

337 157 203

Thymidylate kinase AYWB074 AYWB182 AYWB285 AYWB223 AYWB154 AYWB197

210 210 209 210 59 73

Bacterial nucleoid DNA binding protein AYWB275 AYWB193 AYWB385

122 110 83

1-122 / 1-122 1-110 /13-122 1-82 / 13-94

Bacterial nucleoid DNA binding protein AYWB231 AYWB211 AYWB310

96 96 96

1-96 / 1-96 1-96 / 1-96 1-96 / 1-96

AYWB161 AYWB082

157 156

1-157/1-157 1-156/1-157

100% 98%

none 0%

1-224/1-224 1-224/1-224 1-149/76-224 1-77 / 124-199

100% 98% 92% 84%

none none none 1%

1-77 / 1-77 1-63 / 1-63

100% 88%

none none

1-103 / 1-103 1-61 / 1-59 3-60 / 46-103

100% 86% 89%

none 3% none

1-54 / 1-54 1-54 / 1-54

100% 100%

none none

1-285/1-285 1-47/1-47 1-74/133-206 6-54/215-263

100% 97% 91% 73%

none none none none

100% 90% 86% 91% 87% 82%

none 0% none none 1% none

100% 98% 89% 80% 95%

none none none none none

1-250 / 1-250 1-208 / 42-250 1-208 / 42-250 1-166 / 42-208 1-103 / 42-145 16-131 / 1-116 15-112 / 141238 3-49 / 193-239

100% 94% 93% 93% 92% 77% 79%

none 0% 0% 0% 0% 1% none

97%

none

1-285 / 1-285 1-274 / 1-274 1-52 / 1-52

100% 99% 95%

none none none

100% 87% 78%

none 0% none

Conserved hypothetical protein AYWB162 AYWB083 AYWB361 AYWB166

224 224 149 92

Hypothetical protein AYWB023 AYWB158

77 63

AYWB081 AYWB158 AYWB159

103 63 60

Hypothetical protein AYWB080 AYWB157

54 54

Hypothetical protein AYWB186 AYWB396 AYWB395 AYWB394

285 81 75 56

Conserved hypothetical protein

ATP-dependent Zn protease AYWB214 AYWB378 AYWB377

Positives d Gaps d

Conserved hypothetical protein

ATP-dependent Zn protease AYWB230 AYWB350 AYWB344 AYWB351 AYWB352 AYWB071 AYWB354 AYWB353 AYWB170

Length Alignment to (aa) prototype c

Conserved hypothetical protein 441 441 369 441 156 175 119 98 70 50

Single-stranded DNA binding protein pIII-b pIV-c pII-b pI-e AYWB150 AYWB274 AYWB381 AYWB194 AYWB233 AYWB384 AYWB210

CDs b

AYWB183 AYWB224 AYWB284 AYWB198 AYWB199 AYWB389

212 209 122 108 54 56

1-212/1-212 1-209/1-208 1-120/91-210 1-107/91-197 1-54/1-55 1-42/91-132

Conserved hypothetical protein AYWB620 AYWB181 AYWB286 AYWB222 AYWB075

248 203 203 201 132

1-248 / 1-248 6-203 / 51-248 6-203 / 51-248 4-201 / 51-248 6-132 / 51-177

Conserved hypothetical protein AYWB289 AYWB177 AYWB366 AYWB616 AYWB049 AYWB173 AYWB217

250 208 208 188 106 134 120

AYWB387

60

AYWB178 AYWB617 AYWB219 AYWB276 AYWB192 AYWB024

285 278 54 147 146 129

1-147 / 1-147 1-146 / 1-147 12-120 / 31-139

(Continued) Table 3.5 Summary of the redundant (either functional or non-functional) genes in the AY-WB phytoplasma genome a


Table 3.5 (continued)

CDs b

Length Alignment to (aa) prototype c

Positives

d

Gaps d

Site-specific DNA methylase AYWB382 AYWB208 AYWB209

294 125 107

1-294/1-294 1-125/14-139 1-107/188-294

CDs b

500 498 498 471 99 87 81 97 86 60 55 57

1-500/1-500 2-498/4-500 2-498/4-500 2-470/4-472 3-99/342-438 1-87/367-453 1-81/420-500 3-86/416-500 1-86/174-259 2-60/4-62 3-55/319-371 1-54/255-308

none 0% none

100% 95% 92% 94% 96% 96% 98% 94% 88% 94% 89% 83%

none none none none none none none 1% none none none none

100% 95% 79% 82% 86% 94% 76% 84% 70%

none none none 2% none none 2% none none

100% 88% 78%

none 0% none

pII-e pIV-f pI-b pIII-f AYWB408

208 208 201 144 120 103 90 60 63

1-208/1-208 1-208/1-208 1-199/1-199 1-143/1-145 1-111/1-111 1-83/1-83 2-89/70-159 4-60/38-94 10-63/125-178

Replication initiator protein pII-a pIV-a AYWB406

a b

c d

e

382 375 58

1-382 / 1-382 1-363 / 1-363 15-53 / 218-256

156 156 156 138 121

1-156 / 1-156 1-156 / 1-156 1-156 / 1-156 1-138 / 1-141 1-94 / 1-94

100% 99% 96% 70% 81%

none none none 6% none

1-202 / 1-202 1-202 / 1-202

100% 93%

none none

1-277 / 1-277 23-240 / 1-218

100% 94%

none none

1-100 / 1-100 1-100 / 1-100

100% 93%

none none

1-197 / 1-197 1-146 / 52-197

100% 99%

none none

1-160 / 1-160 1-160 / 1-160 1-161 / 1-159 1-50 / 1-50

100% 98% 77% 84%

none none 2% none

100% 91% 89%

none none none

100% 92% 99%

none none none

Conserved hypothetical protein 202 AYWB379 AYWB213 202 Hypothetical protein 277 AYWB215 AYWB376 240

Hypothetical protein AYWB203 AYWB370

100 100

Hypothetical protein pI-d pIII-c

RNA polymerase sigma factor AYWB195 AYWB362 AYWB234 AYWB383 AYWB207 AYWB357 AYWB293 AYWB045 AYWB044

Positives d Gaps d

Hypothetical protein 100% 83% 95%

Replicative DNA helicase AYWB287 AYWB180 AYWB221 AYWB619 AYWB078 AYWB392 AYWB046 AYWB341 AYWB077 AYWB076 AYWB340 AYWB643

Length Alignment to (aa) prototype c

197 146

Hypothetical protein pIV-e pII-d pI-c pIII-d

160 160 161 50

Conserved hypothetical protein AYWB240 AYWB201 AYWB200

235 53 96

1-235/1-235 1-51/1-51 1-95/133-227

Conserved hypothetical protein AYWB227 AYWB347 AYWB346

206 149 58

1-206/1-206 1-141/1-141 1-58/149-206

a The CDs in this table include the CDs on plasmids. The redundant genes include the pseudogenes.
b The CDs are represented by the names of the CDs in the chromosome and the plasmids. For example, pI-a represents the first CD in plasmid I. Plasmids I, II, III, and IV are pAYWB1031, pAYWB1059, pAYWB1063, and pAYWB1110, respectively.
c The prototypes are the first CDs, underlined, in each group. The alignments are presented in the format start-end (CDs) / start-end (prototype).
d The identities, positives, and gaps are the percentages of identical amino acid residues, positive amino acid residues, and non-aligned amino acid residues in the aligned sequences between the CDs and the prototypes. "none" stands for "no gaps"; 0% means there are gaps, but the percentage is below 1%.
e The annotation follows the prototype. The CDs in a group do not necessarily have the same annotation as the prototype, since some CDs are only part of the prototype.
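
The "start-end (CDs) / start-end (prototype)" alignment format used in Table 3.5 is simple to parse mechanically. The Python sketch below is one hypothetical way to do so; the helper names parse_alignment and prototype_coverage are not from the original work, and the example values (a 369-aa CD aligned to residues 73-441 of the 441-aa DNA primase prototype) are read from the DNA primase group of the table.

```python
import re

def parse_alignment(text):
    """Parse 'start-end / start-end' into ((cds_start, cds_end), (proto_start, proto_end))."""
    m = re.fullmatch(r"\s*(\d+)-(\d+)\s*/\s*(\d+)-(\d+)\s*", text)
    if m is None:
        raise ValueError(f"unrecognized alignment string: {text!r}")
    cds_start, cds_end, proto_start, proto_end = map(int, m.groups())
    return (cds_start, cds_end), (proto_start, proto_end)

def prototype_coverage(text, prototype_length):
    """Fraction of the prototype spanned by the aligned region."""
    _, (start, end) = parse_alignment(text)
    return (end - start + 1) / prototype_length

# Example: a truncated copy covering the C-terminal part of the prototype.
coverage = prototype_coverage("1-369/73-441", 441)
print(f"{coverage:.2f}")
```

A coverage well below 1 marks a CD that corresponds to only part of its prototype, which is how the table distinguishes full-length copies from fragments.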


Table 3.6 Summary of the secreted proteins identified in AY-WB phytoplasma genome

CDs      SP prob.(a)  Length(aa)  IP     MW(Da)  Cleavage pos.  Localization in plant(b)        OY-M ortholog  Annotation
AYWB022  0.988        125         10.73  15,165  32             nucleus                         n/a            Hypothetical protein
AYWB032  1            135         7.66   15,844  32             microbody                       39939004       Conserved hypothetical protein
AYWB033  1            117         10.53  13,925  31             cytoplasm                       39938818       Conserved hypothetical protein
AYWB073  0.98         186         11.37  21,909  32             chloroplast stroma              39938556       Conserved hypothetical protein
AYWB091  0.994        106         8.85   12,483  53             cytoplasm                       39939149       Hypothetical protein
AYWB127  0.999        206         8.36   23,611  34             cytoplasm                       n/a            Hypothetical protein
AYWB145  0.813        154         9.90   17,765  45             cytoplasm                       39939096       Conserved hypothetical protein
AYWB146  0.547        198         9.25   23,558  43             cytoplasm                       n/a            Hypothetical protein
AYWB148  0.973        230         9.91   27,041  45             cytoplasm                       n/a            Hypothetical protein
AYWB152  0.688        61          10.61  7,399   46             n/a                             n/a            Hypothetical protein
AYWB169  1            192         5.85   22,132  35             chloroplast stroma              39938972       Conserved hypothetical protein
AYWB190  0.67         264         9.83   31,252  32             cytoplasm                       39938905       Conserved hypothetical protein
AYWB204  0.8          269         4.48   31,561  31             plasma membrane                 39939062       Conserved hypothetical protein
AYWB213  0.977        202         7.64   23,097  30             cytoplasm                       39939048       Conserved hypothetical protein
AYWB225  0.996        124         9.77   14,480  33             cytoplasm                       39938535       Conserved hypothetical protein
AYWB226  0.941        230         8.36   26,992  39             cytoplasm                       n/a            Hypothetical protein
AYWB230  0.728        768         9.78   89,434  27             cytoplasm                       39938550       hflB, ATP-dependent Zn protease
AYWB237  0.965        192         7.09   22,711  31             cytoplasm                       n/a            Hypothetical protein
AYWB238  0.809        131         10.02  15,609  32             cytoplasm                       n/a            Hypothetical protein
AYWB246  1            199         9.89   23,659  30             cytoplasm                       39939068       Conserved hypothetical protein
AYWB259  0.98         77          10.23  9,247   31             nucleus                         n/a            Hypothetical protein
AYWB260  0.655        259         7.27   31,290  38             cytoplasm                       39939027       Conserved hypothetical protein
AYWB266  0.611        291         9.08   32,560  23             endoplasmic reticulum           39938975       artI, ABC-type amino acid transport system, amino acid binding protein
AYWB272  1            235         9.86   28,073  40             cytoplasm                       39938886       Conserved hypothetical protein
AYWB278  0.947        715         8.42   83,908  30             nucleus                         39939180       hflB, ATP-dependent Zn protease
AYWB283  0.812        87          10.64  10,357  34             nucleus                         39939176       Conserved hypothetical protein
AYWB297  0.806        165         10.20  19,930  32             cytoplasm                       n/a            Hypothetical protein
AYWB298  0.541        106         8.91   12,524  32             cytoplasm                       n/a            Hypothetical protein
AYWB332  0.989        285         9.68   33,435  39             cytoplasm                       39938930       Conserved hypothetical protein
AYWB342  0.98         281         9.13   33,771  31             cytoplasm                       39938832       Conserved hypothetical protein
AYWB343  0.981        65          6.50   7,626   32             cytoplasm                       39938818       Hypothetical protein
AYWB345  0.919        69          10.79  8,081   39             microbody                       n/a            Hypothetical protein
AYWB369  0.766        117         9.71   13,762  32             cytoplasm                       n/a            Hypothetical protein
AYWB370  0.793        100         9.71   11,987  32             cytoplasm                       n/a            Hypothetical protein
AYWB372  0.973        130         11.31  15,301  37             cytoplasm                       39939026       Conserved hypothetical protein
AYWB372  0.936        163         9.19   19,062  32             cytoplasm                       39939063       Conserved hypothetical protein
AYWB373  0.999        121         10.03  14,360  31             nucleus                         n/a            Hypothetical protein
AYWB379  0.982        202         8.88   23,126  30             cytoplasm                       39939048       Conserved hypothetical protein
AYWB390  0.596        149         10.29  17,712  36             nucleus                         39938878       Conserved hypothetical protein
AYWB405  0.789        105         10.69  12,327  34             nucleus                         39939176       Conserved hypothetical protein
AYWB436  0.889        401         10.02  46,845  49             cytoplasm                       39938774       Conserved hypothetical protein
AYWB482  0.552        338         9.89   37,578  29             cytoplasm                       39938727       gpsA, glycerol 3-phosphate dehydrogenase
AYWB531  0.617        513         8.37   59,420  24             cytoplasm                       39938677       dppA, dipeptide-binding protein
AYWB564  0.657        311         10.22  36,578  41             cytoplasm                       39938643       Conserved hypothetical protein
AYWB570  0.718        55          10.30  6,229   36             n/a                             n/a            Hypothetical protein
AYWB590  0.816        348         9.31   40,170  42             cytoplasm                       39938619       nlpA, ABC-type methionine transport system, periplasmic component
AYWB626  0.984        380         7.95   43,443  34             cytoplasm                       39938578       znuA, Manganese-binding protein
AYWB642  0.833        92          10.70  10,887  32             microbody                       39939176       Conserved hypothetical protein
AYWB647  0.917        268         8.98   31,443  39             chloroplast thylakoid membrane  n/a            Hypothetical protein
AYWB669  0.75         546         9.35   64,099  33             cytoplasm                       39939235       malE, putative MalE (maltose/maltodextrin-binding protein)

(a) The SP (signal peptide) probability and cleavage positions in the amino acid sequences were predicted by the SignalP program (version 3.0) (Bendtsen et al., 2004).
(b) The potential localization of the proteins in plants was predicted by the pSORT program (Nakai and Horton, 1999) using the protein sequences without their signal peptides. Some proteins are too small for prediction after cleavage of the signal peptide.


Fig. 3.1 Maps of the AY-WB phytoplasma plasmids. The plasmids are drawn to scale. The CDs and their orientations are indicated by arrows. CDs sharing similarities are marked in the same color. Single-stranded DNA binding protein genes (ssb, in brown) and genes encoding conserved hypothetical proteins (in orange, navy and dark green) are present in all plasmids. pAYWB1031 and pAYWB1063 have similar genes (rep, in sapphire blue) encoding replication initiation proteins that are similar to geminivirus replication proteins. pAYWB1059 and pAYWB1110 have similar replication protein-encoding genes (rep2, in black). ORF7 in pAYWB1063 (in dark gray) and ORF2 in pAYWB1110 (in light gray) have no similarity to any other proteins.


Fig. 3.2 Summary of the AY-WB phytoplasma genome-encoded transporters and central metabolic pathways. Question marks (?) indicate that the substrates of the transporters are unknown. The numbers next to the transporters indicate the number of copies of the same transporter in the AY-WB phytoplasma genome. The enzyme commission (EC) number of each enzyme is indicated at the reaction that the enzyme catalyzes. Only central metabolic pathways are illustrated.


Fig. 3.3 AY-WB phytoplasma genome contains more repetitive sequences, both tandem repeats and inverted repeats, than the OY phytoplasma genome. However, the repeats in the OY phytoplasma genome are larger than those in the AY-WB phytoplasma genome. The prediction was done by using the REPuter program (Kurtz and Schleiermacher, 1999) hosted at the Bielefeld University Bioinformatics Server. The color coding of the size of the repetitive sequences was indicated at the bottom of each section.


Fig. 3.4 The AY-WB phytoplasma genome contained many copies of transposase genes and derivatives. The ruler is in base pairs (bp). The identities, positives and gaps of the comparison of the derivatives to the transposase gene (tra5) (in red) are indicated at right. The lengths of the CDs are indicated in parentheses next to the names. The numbers at the beginning and the end of the CDs indicate the start and the end of the alignments with the transposase gene AYWB176.


Fig. 3.5 The AY-WB phytoplasma genome contained one complete copy of the ATP-dependent DNA helicase gene and multiple pseudogene copies. The ruler is in base pairs (bp). The identities, positives and gaps of the comparison of the derivatives to the ATP-dependent DNA helicase (AYWB085, uvrD) (in red) are indicated at right. The lengths of the CDs are indicated in parentheses next to the names. The numbers at the beginning and the end of the CDs indicate the start and the end of the alignments with the uvrD gene AYWB085. AYWB294-296, AYWB035-041, AYWB639-641, AYWB644-645, and AYWB020-021 are disrupted partial ATP-dependent DNA helicase pseudogenes. The others are pseudogenes partially aligned to the uvrD gene.


CHAPTER 4

Identification and characterization of traE genes of Spiroplasma kunkelii

Xiaodong Bai, Tatiana Fazzolari and Saskia A. Hogenhout *

Department of Entomology, The Ohio State University – Ohio Agricultural Research and Development Center (OARDC), Wooster, OH 44691


4.1 Abstract

Four traE homologs, designated traE1, traE2, traE3 and traE4, were identified in and amplified from the genome of the leafhopper-transmitted corn stunt pathogen Spiroplasma kunkelii, and were predicted to encode membrane-bound ATPases. The deduced proteins of all traE genes have 62.3% to 89.9% similarity to the conserved VirB4 domain; proteins containing this domain are frequently components of type IV secretory pathways involved in intracellular trafficking and secretion of DNA and proteins. In phylogenetic analysis, TraE homologs of S. kunkelii, Mycoplasma pulmonis and M. fermentans cluster together and are more similar to TraE proteins of Gram-positive bacteria than to those of Gram-negative bacteria, thereby resembling the 16S rRNA phylogeny. Gene traE2 was the most conserved, whereas the presence of the other three traE genes varied among S. kunkelii strains M2, CS-2B, FL-80 and PU8-17. Further, traE1 and traE2 appeared to be located on the chromosome, and traE3 and traE4 on plasmids, of S. kunkelii strain M2. Transcripts of the spiralin gene and the traE2 gene were detected on Northern blots containing total RNA of S. kunkelii cultures and of S. kunkelii-infected plants and insects; traE2 appeared to be part of a larger transcription unit. Full-length expression products of the other traE genes were not detected. The possibility that S. kunkelii traE genes are part of regions involved in S. kunkelii cell morphogenesis, adhesion and DNA recombination is discussed. This is the first study to report the localization of traE genes on spiroplasma plasmids and their expression pattern in various spiroplasma environmental niches.


4.2 Introduction

Spiroplasma kunkelii, the causal agent of corn stunt disease, causes economically significant yield losses of corn on the American continent (Hruska and Gomez-Peralta, 1997). S. kunkelii belongs to the genus Spiroplasma of the Class Mollicutes, whose members are believed to have diverged from a Gram-positive ancestor. Mollicutes apparently underwent extensive gene loss events resulting in reduced genome sizes and loss of peptidoglycan cell walls (Bové et al., 1989; Gasparich, 2002). S. kunkelii and two other members of the genus Spiroplasma, S. citri and S. phoeniceum, are insect-transmitted plant pathogens that replicate in both insect vectors and plant hosts. Spiroplasmas are mostly helical in culture media and plant phloem tissues. However, in insects, they appear to be more often round and flask-shaped (Kwon et al., 1999; Özbek et al., 2003).

Spiroplasmas employ various attachment structures to initiate contact with insect cells. Spiroplasma membranes contacting the extracellular lamina of insect epithelial and muscle cells appeared thickened (Fletcher et al., 1998; Özbek et al., 2003), and fimbriae-like appendages apparently protruding from the S. kunkelii cell surface seemed to attach to the external laminae of insect epithelial and muscle cells (Özbek et al., 2003). Furthermore, S. kunkelii cells appeared to be connected by pilus-like structures (Özbek et al., 2003). These attachment structures have not been observed in other members of the Class Mollicutes (Razin et al., 1998).

This study aimed to identify and characterize S. kunkelii gene(s) potentially involved in the biosynthesis of pili and/or fimbriae in the genome of S. kunkelii CR2-3x and four other strains. We identified four homologs of transfer (tra) genes. traE genes have been shown to be involved in conjugation, type IV secretion and/or cell invasion in various bacteria (Censini et al., 1996; Lai and Kado, 2000).

4.3 Materials and Methods

4.3.1 Culturing of S. kunkelii strains

S. kunkelii strains M2 from Poza Rica, Mexico (Bai and Hogenhout, 2002; Ebbert and Nault, 2001), CS-2B from California, FL-80 from Florida, and PU8-17 from Peru (Lee and Davis, 1989) were cultured in LD8A3 medium as described previously (Lee and Davis, 1989). Spiroplasma cultures were harvested in the log growth phase.

4.3.2 Computational analysis

Genome sequences of S. kunkelii strain CR2-3x were obtained from the genome sequence project website hosted by the Advanced Center for Genome Technology at the University of Oklahoma (http://www.genome.ou.edu/spiro.html). This strain was originally isolated from leaves of a maize plant (Zea mays L.) naturally infected with S. kunkelii in Costa Rica (Zhao et al., 2003). Putative open reading frames (ORFs) beginning with the start codon ATG and ending with the stop codon TAG or TAA (Bové et al., 1989) were predicted using a Windows-based ORF extractor program (http://www.oardc.ohio-state.edu/mcic/bioinformatics/bio_software/). Translated protein sequences were searched against the NCBI nr database and various customized databases consisting of genes involved in virulence, conjugation and/or pili formation using stand-alone BLAST v2.0 (Altschul et al., 1997) on a local Linux workstation. The large BLAST output text files were parsed into tab-separated format using a Perl script and imported into the database program FileMaker®. Contigs containing sequences of interest were analyzed using the MacVector® program. The NCBI pairwise BLAST was used for nucleotide and amino acid comparisons. Gene codon usage was analyzed with the Chips program (Wright, 1990) in EMBOSS. The deduced protein sequences of the traE genes were aligned using ClustalW (version 1.8) (Thompson et al., 1994). Protein transmembrane domains were identified using the TMHMM2.0 program (Sonnhammer et al., 1998), and repeats in amino acid sequences using the RADAR program (Heger and Holm, 2000) from the EBI (European Bioinformatics Institute). The SAM-T02 program (Karplus et al., 2001) was used for domain searches. The alignment generated by ClustalX (version 1.81) (Thompson et al., 1997) was used for phylogenetic tree construction using PAUP* 4.0 (Swofford, 2001).
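The ORF prediction rule described above (start at ATG, stop at TAG or TAA, with TGA read through) can be illustrated with a minimal Python sketch. This is not the ORF extractor program used in this study; the function names, the `min_codons` parameter and the coordinate convention (0-based, on the scanned strand) are assumptions made for illustration only.

```python
from typing import Iterator, Tuple

START_CODON = "ATG"
# Only TAA and TAG terminate an ORF here, as in the text above;
# TGA is read through (spiroplasmas use it as a tryptophan codon).
STOP_CODONS = {"TAA", "TAG"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 100) -> Iterator[Tuple[str, int, int, int]]:
    """Yield (strand, frame, start, end) for each ATG...TAA/TAG ORF.

    Coordinates are 0-based on the scanned strand and `end` is exclusive
    (it includes the stop codon). `min_codons` is a hypothetical length
    cut-off counted in codons before the stop.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        yield strand, frame, start, i + 3
                    start = None  # look for the next ORF in this frame


# Toy sequence: the internal TGA does not end the ORF, the TAA does.
print(list(find_orfs("CCATGAAATGATTTTAAGG", min_codons=1)))
```

In the toy sequence the reading frame that starts at the ATG passes through an in-frame TGA and ends only at TAA, which is the behavior the stop-codon rule implies.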

4.3.3 Genomic DNA isolation

Harvested S. kunkelii culture was pelleted at 26,900 x g at 4°C for 30 min. Genomic DNA was extracted using the Genomic-tip 100/G kit (Qiagen, Inc., Valencia, CA, USA) following the manufacturer's protocol. DNA was resuspended in sterile water and stored at -20°C.


4.3.4 Amplification and sequence analysis

To sequence the full-length traE genes designated traE1, traE2, traE3 and traE4 from S. kunkelii strain M2, overlapping fragments that included the upstream and downstream regions of the traE genes were amplified and sequenced. The following primers were used:

5'gggttaaattagatatgaaaag3' (traE1 forward1)
5'gtcctgaatagtattaattg3' (traE1 reverse1)
5'taaaaaaaattattttatagcaaatg3' (traE1 forward2)
5'gcaaaaatttaacaaaattcttc3' (traE1 reverse2)
5'cctttttgtgataaaggaaaac3' (traE2 forward1)
5'gtggatagattggataattttg3' (traE2 reverse1)
5'gtataaatgatgcgcaattttc3' (traE2 forward2)
5'gtctttccatttttcttccct3' (traE2 reverse2)
5'taagaaaaaagataaaggagac3' (traE3 forward1)
5'ttaatatgagttggtaattttgta3' (traE3 reverse1)
5'tcataccaacaaatacagcaat3' (traE3 forward2)
5'gttcagttattttgttttactttc3' (traE3 reverse2)
5'tagtgttttaaaacaagaaaacat3' (traE4 forward1)
5'ctttgttgtatttgttggtataat3' (traE4 reverse1)
5'ttataccaacaaatacaacaaag3' (traE4 forward2)
5'gttcagttattttgttttacttcc3' (traE4 reverse2)

Amplification products were sequenced on both strands using a 64-lane Perkin-Elmer ABI377 Prism DNA sequencing machine and the ABI BigDye Terminator Reaction kit (Applied Biosystems, Inc., Foster City, CA, USA). Sequence quality was assessed with MacPhred-MacPhrap. Low-quality sequences (quality score < 20) were trimmed, and sequence tags were assembled using the Sequencher™ (version 4.1) software.
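The quality-trimming step can be sketched as follows. A Phred score Q corresponds to an estimated base-calling error probability of 10^(-Q/10), so the cut-off of 20 corresponds to 1 error in 100 bases. The exact trimming rule applied by the software used here is not stated in the text; the sketch below assumes the simplest possible rule, removing bases from both ends until a base with a score of at least 20 is reached. The function names are hypothetical.

```python
from typing import List, Tuple


def phred_to_error_prob(q: float) -> float:
    """Estimated base-calling error probability for Phred score q."""
    return 10 ** (-q / 10)


def trim_low_quality(seq: str, quals: List[int],
                     threshold: int = 20) -> Tuple[str, List[int]]:
    """Trim bases with quality < threshold from both ends of a read.

    Assumed rule (not taken from the text): trimming stops at the first
    base from each end whose score reaches the threshold; internal
    low-quality bases are left in place.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists differ in length")
    start, end = 0, len(seq)
    while start < end and quals[start] < threshold:
        start += 1
    while end > start and quals[end - 1] < threshold:
        end -= 1
    return seq[start:end], quals[start:end]


# Toy read: two poor bases at the 5' end and two at the 3' end.
print(trim_low_quality("ACGTACGT", [5, 10, 30, 40, 35, 25, 12, 8]))
```

A read in which every base falls below the threshold is returned empty, which makes it easy to discard before assembly.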

4.3.5 Restriction fragment analysis

Genomic DNAs (0.25 µg) were digested with AluI, EcoRV or TaqI (Promega, Inc., Madison, WI, USA). Restriction fragments were size-separated on a 1% agarose gel and transferred to BrightStar-Plus Positively Charged nylon membranes (Ambion, Inc., Austin, TX, USA) by capillary transfer following standard procedures (Sambrook et al., 1989). DIG-labeled probes were generated by PCR using the primer pairs traE1 forward1 and reverse1, traE2 forward1 and reverse1, and traE3 forward1 and reverse1, following the protocol of the PCR DIG Probe Synthesis Kit (Roche Diagnostics Co., Basel, Switzerland). Prehybridization, hybridization, washing and detection were performed at 42°C following the instructions for the DIG Wash and Block Buffer Set (Roche Diagnostics Co.).

4.3.6 Pulsed field gel electrophoresis, Southern blotting and hybridization

Agarose sample blocks for each S. kunkelii strain were prepared from 30 ml cultures. Bacteria were pelleted by centrifugation at 26,900 x g, and the bacterial pellets were used for plug preparation and pulsed field gel electrophoresis (PFGE) as described in the section on preparation of agarose-embedded bacterial DNA of the CHEF-DR® III Pulsed Field Electrophoresis Systems Instruction Manual (Bio-Rad Laboratories, Inc., Hercules, CA, USA). The PFGE parameters were: initial switch time, 60 sec; final switch time, 120 sec; angle, 120°; voltage gradient, 4 V/cm; temperature, 14°C; run time, 20 h. The pump dial was set to 70 (~ 0.75 l/min) to keep the running buffer circulating. Southern blots of PFGE gels were obtained by capillary transfer following standard procedures (Sambrook et al., 1989). PCR products of the primers described in section 4.3.5 were labeled with [α-32P]-dCTP using the Random Primers DNA Labeling System (Invitrogen, Inc., Carlsbad, CA, USA). Hybridization was performed following the suggested protocol of the Random Primers DNA Labeling System (Invitrogen, Inc.).

4.3.7 Northern blot hybridization

Total RNA was isolated from S. kunkelii M2 culture, S. kunkelii M2-infected insects and plants, and non-infected insects and plants following the instruction manual of the ToTALLY RNA Extraction Kit (Ambion, Inc.). Ten µg of total RNA from insects and 1 µg of total RNA from culture and plants were mixed with glyoxal load dye (Ambion, Inc.) at a 1:1 (v/v) ratio. The mixture was incubated at 50°C for 30 min, followed by cooling on ice. RNA samples were separated on 1.4% agarose gels prepared in 1 x BPTE (10 mM PIPES, 30 mM Bis-Tris, 1 mM EDTA, pH 6.5) at 70 mV for 2 h. RNA was transferred to BrightStar-Plus Positively Charged Nylon membrane (Ambion, Inc.) by capillary transfer following standard procedures (Sambrook et al., 1989). RNA on blots was immobilized by exposure to UV light for 3 min. Synthesis of [α-32P]-dCTP-labeled probes and hybridization were performed as described in section 4.3.6.

4.3.8 Nucleotide sequence accession numbers

The nucleotide sequence data of the traE1, traE2, traE3, and traE4 genes were deposited in the GenBank database under accession numbers AY233334, AY233335, AY23336, and AY23337, respectively.


4.4 Results

4.4.1 Identification of four traE homologs in S. kunkelii CR2-3x genome

The discovery of S. kunkelii fimbriae and pili (Özbek et al., 2003) prompted a search for genes potentially involved in fimbriae formation and conjugation in the gapped genome sequence of S. kunkelii strain CR2-3x (http://www.genome.ou.edu/spiro.html). Computer analysis identified four complete traE ORFs of 2,520 to 2,697 nucleotides in length with significant protein sequence similarities (E value < 1e-5) to other traE homologs, including TrsE of Mycoplasma pulmonis (NP_326214, Chambaud et al., 2001), four M. fermentans TraE homologs (AAN85227, AAN85273, AAN85276, AAN85277, Calcutt et al., 2002), Lactococcus lactis TraE (NP_047296, Doughtery et al., 1998), Bacillus anthracis pX02.09 (NP_053164, Okinaka et al., 1999), Staphylococcus aureus TrsE (E36891, Morton et al., 1993), Helicobacter pylori CagE (AAF80209, Censini et al., 1996), and Agrobacterium tumefaciens VirB4 (CAA29975, Thompson et al., 1988). Most of these TraE homologs are part of conjugation or type IV secretion systems and some are encoded by genes on plasmids. The four S. kunkelii traE homologs were designated traE1, traE2, traE3 and traE4.

A fifth traE-like locus spanning two adjacent ORFs (a and b) was also identified. This locus, with a total length of 972 nucleotides, was considerably shorter than the four identified S. kunkelii traE genes. Interestingly, both ORFs were most similar to TraE1: the deduced protein sequence of ORFa shared 97% similarity with residues 738-828 of the TraE1 C-terminal (Ct) region, and that of ORFb shared 92% similarity with residues 49-163 of the TraE1 N-terminal (Nt) region. Further, the deduced protein sequence between ORFa and ORFb shared 86% similarity with residues 616-735 located between the Ct and Nt regions of TraE1. Thus, this fifth traE-like sequence apparently underwent inversion/deletion events, and it was excluded from further analysis because it is unlikely to encode a functional protein. No additional traE homologs were identified in BLAST searches of the five traE sequences against the updated S. kunkelii CR2-3x genome sequences of March 18, June 8 and October 13 of 2002, and April 26 of 2003.
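The significance cut-off used in this search (E value < 1e-5) amounts to a simple filter on parsed BLAST hits. The sketch below is an illustration in Python rather than the Perl script used in this study, and it assumes a hypothetical three-column tab-separated layout (query, subject, E value); the identifiers in the toy input are invented placeholders, not real hits.

```python
import csv
import io
from typing import List, Tuple

Hit = Tuple[str, str, float]  # (query, subject, E value)


def read_hits(tsv_text: str) -> List[Hit]:
    """Parse tab-separated lines of query, subject and E value.

    The three-column layout is an assumption made for this sketch.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [(row[0], row[1], float(row[2])) for row in reader if row]


def significant_hits(hits: List[Hit], max_evalue: float = 1e-5) -> List[Hit]:
    """Keep hits with E value strictly below the cut-off, best first."""
    return sorted((h for h in hits if h[2] < max_evalue), key=lambda h: h[2])


# Invented placeholder hits, for illustration only.
example = (
    "orf_1\thit_A\t3e-40\n"
    "orf_1\thit_B\t0.002\n"
    "orf_2\thit_C\t1e-5\n"
    "orf_3\thit_D\t2e-12\n"
)
for hit in significant_hits(read_hits(example)):
    print(hit)
```

Because the comparison is strict, a hit with an E value of exactly 1e-5 is excluded, matching the "<" in the criterion.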

4.4.2 Sequence features of S. kunkelii traE genes

Sequences of the S. kunkelii M2 traE1, traE2, traE3 and traE4 PCR products showed 100% nucleotide similarity with the corresponding genes of S. kunkelii CR2-3x. Sequence features of the S. kunkelii traE genes are summarized in Table 4.1. The gene lengths ranged from 2,520 nucleotides for traE3 and traE4 to 2,697 nucleotides for traE1. The GC content of the traE1 gene was 23%, which is identical to that of the gapped genome sequences of S. kunkelii CR2-3x. The GC contents of traE3 and traE4 were 28%. The codon usage (Nc) values ranged from 33.40 for traE2 to 40.85 for traE3, indicating that traE2 has the strongest codon bias and traE3 the weakest. A stronger codon bias adds confidence that the ORF is likely to be expressed at a higher efficiency. Pairwise BLAST analysis of the deduced protein sequences of the four traE genes showed sequence similarities of 43% to 96%, with TraE3 and TraE4 sharing the highest similarity of 96%.

Several conserved regions were identified in the deduced protein sequences of the S. kunkelii traE genes (Fig. 4.1). The domain search program SAM-T02 (Karplus et al., 2001) identified significant similarities (E value < 1e-10) to ATPase domains of PDB entry 1e9rA, including the highly conserved DEAH box, in the C-terminal portions of TraE1 (amino acids 513-832), TraE2 (amino acids 527-845) and TraE3 and TraE4 (amino acids 474-782). Further, three conserved transmembrane domains were predicted in the N-terminal regions of all four proteins (Fig. 4.1). These data suggested that all S. kunkelii M2 TraE proteins are membrane-bound ATPases and that, unlike most TraE proteins, the C-terminal portions of approximately 800 amino acids appear to be located extracellularly. A Conserved Domain (CD) search performed at the NCBI website revealed that the deduced proteins of all traE genes have 62.3% to 89.9% similarity to the conserved VirB4 domain. Proteins containing VirB4 domains are frequently components of type IV secretory pathways involved in intracellular trafficking and secretion (Hofreuter et al., 2001; Li et al., 1999). Phylogenetic analysis revealed that the TraE homologs of mollicutes cluster together and are more similar to TraE proteins of Gram-positive bacteria than to those of Gram-negative bacteria (Fig. 4.2), thereby resembling the 16S rRNA phylogeny (Gasparich, 2002).
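The GC contents quoted above are base-composition percentages. As a minimal sketch of how such a value can be computed from a nucleotide sequence, the Python function below counts G and C among the unambiguous bases; the function name and the decision to ignore ambiguous symbols such as N are assumptions for illustration, not a description of the software used in this study.

```python
def gc_content(seq: str) -> float:
    """Return the GC content of a DNA sequence as a percentage.

    Only A, C, G and T are counted; other symbols (e.g. N) are ignored,
    which is an assumption of this sketch.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Toy sequence with 2 G/C bases out of 8.
print(gc_content("ATGCATAT"))
```

Applied to a full-length gene sequence, the same calculation gives a figure that can be compared directly with the genome-wide average.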

4.4.3 Presence of traE1, traE2, and traE3/4 among four S. kunkelii strains

The presence of traE1, traE2, traE3 and traE4 was investigated in the four S. kunkelii strains M2, CS-2B, FL-80 and PU8-17 by Southern blot hybridization. These four strains were selected because they differ in collection site and culturing history and consequently may differ in the presence of traE genes. DIG-labeled probes corresponding to the 5' halves of the traE1, traE2 and traE3/4 genes (Fig. 4.3A) were synthesized as described in section 4.3.5. The restriction enzymes used for genomic DNA digestion were EcoRV, AluI, and TaqI for traE1, traE2 and traE3/4 hybridization, respectively (Fig. 4.3A). The traE3/4 probe hybridized to both traE3 and traE4 because of the high sequence similarity between them (Table 4.1). However, traE3 and traE4 can be distinguished by their TaqI digestion patterns (Fig. 4.3A).

S. kunkelii M2, CS-2B and FL-80, but not PU8-17, had at least one copy of a traE1 homolog (Fig. 4.3A). The S. kunkelii M2 EcoRV digestion pattern matched the in silico digestion pattern of S. kunkelii CR2-3x. A fourth EcoRV site was present 3,509 bp upstream of the EcoRV site in the 5' region of traE1, matching the ~ 3,500 bp band of S. kunkelii M2, CS-2B and FL-80. However, FL-80 lacked the 900 bp EcoRV-EcoRV fragment, and strain CS-2B had an extra 2.8 kb fragment probably derived from a second copy of traE1. S. citri seemed to harbor a traE1 homolog.

traE2 appeared to be present in the genomes of all tested S. kunkelii strains (Fig. 4.3B). A band of ~ 1,300 bp was detected in all strains and matched the in silico AluI digestion of S. kunkelii CR2-3x, in which an AluI site was detected 1,312 bp upstream of the AluI site at position 261 in the 5' region of traE2. The 402 and 351 bp AluI-AluI bands were too small to be visible in this experiment. The traE2 probe also hybridized to one (M2) or two (CS-2B, FL-80, PU8-17) bands that did not match the in silico AluI digestion pattern of traE2. Because these hybridization signals were weaker than that of the ~ 1,300 bp fragment, they are likely to be the result of cross-hybridization to another traE2-like sequence that has not yet been sequenced from S. kunkelii CR2-3x.


M2 and FL-80 genomes harbor identical copies of traE3 and traE4, whereas CS-2B seemed to contain only traE4 and PU8-17 contained neither (Fig. 4.3B). The in silico TaqI digestion patterns of S. kunkelii M2 traE3 and traE4 matched the hybridized bands on Southern blots. A TaqI site was located 359 bp upstream of traE3, matching the 941 bp fragment (Fig. 4.3B). TaqI sites located 3,461 bp upstream and 1,798 bp downstream of the traE4 gene explained the 3,756 bp and 2,275 bp fragments (Fig. 4.3B). Apparently, S. kunkelii M2, FL-80 and CS-2B, but not PU8-17, possess a third homolog of traE3 and traE4, because one hybridization band of approximately 1,500 bp did not match the digestion patterns of the traE3 or traE4 genes.
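The in silico digestions referred to in this section compare predicted fragment sizes with the bands seen on Southern blots. A minimal Python sketch of such a digestion is given below for the three enzymes used here, whose recognition sites and cut positions are AG^CT (AluI), GAT^ATC (EcoRV) and T^CGA (TaqI). The function names are hypothetical, and the sketch assumes a linear molecule and reports top-strand fragment lengths only; it is not the software used in this study.

```python
import re
from typing import Dict, List, Tuple

# Recognition site and cut offset within the site (top strand).
ENZYMES: Dict[str, Tuple[str, int]] = {
    "AluI": ("AGCT", 2),     # AG^CT, blunt
    "EcoRV": ("GATATC", 3),  # GAT^ATC, blunt
    "TaqI": ("TCGA", 1),     # T^CGA
}


def cut_positions(seq: str, enzyme: str) -> List[int]:
    """Return 0-based cut positions of an enzyme in a linear sequence."""
    site, offset = ENZYMES[enzyme]
    # A lookahead pattern also finds overlapping occurrences of the site.
    return [m.start() + offset
            for m in re.finditer(f"(?={site})", seq.upper())]


def fragment_lengths(seq: str, enzyme: str) -> List[int]:
    """Return fragment lengths, 5' to 3', after complete digestion."""
    bounds = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Toy 15-bp sequence containing one AluI site and one TaqI site.
toy = "AAAGCTTTTTCGAAA"
for name in ENZYMES:
    print(name, fragment_lengths(toy, name))
```

All three sites are palindromic, so scanning one strand is sufficient to locate every site. Only fragments that overlap the probe region would give bands on a blot, so the predicted list still has to be filtered by probe position before it is compared with the hybridization pattern.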

4.4.4 Localization of traE3 and traE4 genes on spiroplasma plasmids

The high variability in the presence of traE genes among S. kunkelii strains raised the question of whether traE genes are located on extrachromosomal DNA. Southern blots prepared from pulsed field and standard agarose gels were hybridized to the traE1, traE2 and traE3/4 probes depicted in Fig. 4.3A. The pulsed field gel profile revealed significant differences in the genome sizes of M2, CS-2B and PU8-17. The 1.6 Mb genome size of M2 (Fig. 4.4A, lane 1) is similar to the predicted 1.6 Mb genome of CR2-3x, of which 1.54 Mb has been sequenced so far. The genomes of CS-2B (2.1 Mb) and PU8-17 (2.0 Mb) were significantly larger than that of M2 (Fig. 4.4A, lanes 2 and 3). The hybridization profile of PFGE Southern blots with [α-32P]-dCTP-labeled 16S rDNA confirmed that these large DNA fragments were chromosomal DNA (data not shown).


Hybridization profiles of PFGE Southern blots showed that the traE1 probe hybridized to chromosomal DNA of M2 and CS-2B (Fig. 4.4B), and the traE2 probe to chromosomal DNA of M2, CS-2B and PU8-17 (Fig. 4.4B), thereby confirming the earlier result that PU8-17 contains a copy of traE2 but not of traE1 and demonstrating that traE1 and traE2 are apparently located on chromosomal DNA of S. kunkelii. In contrast, the traE3/4 probe did not hybridize to chromosomal DNA of strains M2 and CS-2B, but to several smaller fragments that are likely to represent various folding conformations of circular plasmid DNA (Fig. 4.4B). Again, as expected, the traE3/4 probe did not hybridize to PU8-17 genomic DNA. Further, M2 showed more traE3 hybridization bands than CS-2B (Fig. 4.4B, lanes 1 and 2), suggesting that traE3 and traE4 of M2 are located on different plasmids and/or that the M2 and CS-2B plasmids differ in size.

To confirm the PFGE hybridization results, plasmids were isolated from M2, CS-2B and PU8-17 cultures and size-separated on a standard agarose gel. Subsequent hybridizations of Southern blots of these gels confirmed the hybridization patterns shown in Fig. 4.4B (Fig. 4.4C). The traE1 and traE2 probes hybridized to residual chromosomal DNA that was co-purified with plasmid DNA (Fig. 4.4C), whereas the traE3/4 probe hybridized to smaller DNA fragments that appeared to be plasmids because they were significantly smaller than the chromosomal DNA bands (Fig. 4.4C). There were two additional smaller bands in M2 relative to CS-2B (Fig. 4.4C, compare lanes 1 and 2), supporting the conclusion that traE3 and traE4 of strain M2 may be located on different plasmids.


4.4.5 Expression of S. kunkelii M2 traE genes in culture, insects and plants

To investigate whether traE genes are expressed in S. kunkelii M2 during infection, total RNAs isolated from S. kunkelii culture, and from S. kunkelii-infected and non-infected leafhoppers and maize leaves, were size-separated for Northern blot preparation. To assess RNA quality and to investigate whether spiroplasma gene expression can be detected in cultures, insects and plants, Northern blots were first hybridized to a probe based on the spiralin gene sequence (Foissac et al., 1997). The spiralin gene is constitutively expressed and encodes a membrane-bound lipoprotein proposed to determine cell shape, cell motility and viability (Beven and Wroblewski, 1997). In contrast to 16S rDNA, the spiralin gene probe is unlikely to cross-hybridize to other bacteria because spiralin is unique to mollicutes and highly divergent in sequence among spiroplasma species (Foissac et al., 1997). Northern blot hybridizations with the spiralin gene probe showed a single transcript of the expected size of 1.1 kb in culture and infected, but not healthy, RNA samples. For unknown reasons, the spiralin transcript in infected insect samples was slightly larger than that in infected plant and culture samples (Fig. 4.5). The amounts of spiralin transcript detected appeared to be similar in RNA from infected insects and plants, but higher in cultured spiroplasmas. These results demonstrated that S. kunkelii transcripts were detectable in cultures, insects and plants. Further, the isolated RNA was of good quality without noticeable degradation.

Subsequently, Northern blots were hybridized to the traE probes depicted in Fig. 4.3A. No traE1 transcripts were detected (data not shown), indicating that traE1 is not expressed or that its expression is below the detection level. In contrast, several traE2 transcripts were detected in S. kunkelii cultures and in S. kunkelii-infected insects and plants (Fig. 4.5B). An S. kunkelii transcript of ~10 kb was detected in insects, cultures and plants and is significantly larger than the traE2 ORF of 2.2 kb, suggesting that traE2 is part of a larger transcript and may be located in an operon. A second transcript of ~3.1 kb that appeared to be transcribed at the same level as the ~10 kb transcript was detected in insects and may contain only the traE2 ORF (Fig. 4.5A, lane 2). However, in samples from S. kunkelii culture and infected plants, several additional transcripts were detected, including ~4 kb transcripts that appeared to have expression levels similar to those of the ~10 kb transcripts. Interestingly, the ~4 kb transcript was not detected in infected insect samples, suggestive of differential regulation of transcription of traE2 and flanking genes in insects relative to cultures and plants. Further, the overall level of the ~10 kb transcript containing traE2 appeared to be comparable in S. kunkelii culture and S. kunkelii-infected plant samples, but lower in S. kunkelii-infected insects (Fig. 4.5B). Thus, it seemed that, relative to the spiralin gene, transcription of S. kunkelii traE2 and surrounding genes was upregulated in cultures and in plants, but down-regulated in insects.

The traE3/4 Northern hybridization profile showed two transcripts of 0.7 and 0.5 kb in S. kunkelii cultures (Fig. 4.5B, lane 3). The 0.7 kb transcript was also detected in plants (Fig. 4.5B, lane 4), whereas no hybridization signals were detected in insects (Fig. 4.5B, lane 2). The function and origin of the 0.7 and 0.5 kb transcripts, which are much smaller than the full-length traE3 and traE4 genes, are unclear and await further investigation.


4.4.6 Genetic context of traE genes in S. kunkelii CR2-3x

Because the Northern blot hybridization results suggested that traE2 might be part of larger transcripts of ~4 and ~10 kb in length, ORFs flanking traE2 were analyzed in the S. kunkelii CR2-3x genome sequence (Fig. 4.6). It seemed likely that traE2 was within a locus containing 8 genes, some of which encoded membrane proteins. The fifth ORF within this locus encoded a homolog of MreB, which forms a filamentous helical structure close to the cell surface of bacteria and has an actin-like role in bacterial cell morphogenesis (Jones et al., 2001).

The genetic contexts of the other three traE genes were also investigated. This revealed that traE1 appeared to be part of a region that contained another conjugation gene (traK), a recombination gene (pre), and restriction modification (dm) genes (Fig. 4.6). Similarly, traE3 and traE4 were immediately adjacent to five ORFs, four of which encoded proteins containing potential transmembrane domains (Fig. 4.6). Interestingly, the traE3 and traE4 regions had similar organizations and harbored homologs of the S. citri and S. kunkelii adhesion-related proteins SARP1 and SkARP1, which were approximately 89 kDa in size and contained signal peptides, repeated amino acid sequences and C-terminal transmembrane domains (Berg et al., 2001), and homologs of mob genes, which are involved in recombination and mobilization of DNA (Cabezon et al., 1997).

4.5 Discussion

We identified four traE homologs, which are 100% identical in the gapped genome sequence of S. kunkelii CR2-3x and the genome of S. kunkelii M2. These genes encode proteins containing transmembrane, ATPase and VirB4 domains. TraE homologs have been identified in some but not all mollicutes. Interestingly, the four traE homologs of Mycoplasma fermentans are part of integrative conjugal elements (Calcutt et al., 2002), suggesting that in mollicutes, traE homologs might also be involved in conjugation. However, the single membrane-bound ATPase gene traE (trsE) of Mycoplasma pulmonis (Chambaud et al., 2001) does not appear to be part of a conjugal element. Although TraE homologs were also reported for S. citri (Laigret et al., 2000), the sequences have not been deposited in a public database. We found an S. citri DNA fragment that hybridizes to traE1. No traE homologs were identified in the completed genome sequences of M. genitalium (Fraser et al., 1995), M. pneumoniae (Himmelreich et al., 1996), and Ureaplasma urealyticum (Glass et al., 2000). It appears that the phylogeny of TraE is similar to that of 16S rDNA sequences (Gasparich, 2002). The most plausible explanation of this phylogeny is that several mollicutes, such as M. genitalium, M. pneumoniae and U. urealyticum, lost their traE genes, and that mollicutes did not acquire their traE genes by horizontal gene transfer from other bacteria. Indeed, mollicutes are believed to have undergone various degrees of gene loss, with spiroplasmas being evolutionarily early mollicutes that suffered the fewest gene losses (Bai and Hogenhout, 2002; Razin et al., 1998).

Southern blot results on the presence of traE genes among S. kunkelii strains demonstrated that traE2 is the most conserved, whereas the presence of traE1, traE3 and traE4 is highly variable among S. kunkelii strains. Based on the presence and restriction digestion patterns of traE, S. kunkelii M2 appeared to be most similar to S. kunkelii CR2-3x, followed by FL-80, then CS-2B and lastly PU8-17. This may be expected based on the original collection sites of these strains. However, FL-80, CS-2B and PU8-17 were brought into culture in 1988 and have not been introduced into insects and plants since then (Lee and Davis, 1989), whereas M2 was brought into culture recently (Bai and Hogenhout, 2002). Spiroplasmas undergo frequent genome reorganization (Ye et al., 1996), including variations in plasmid content; therefore, CS-2B and PU8-17 may have lost the plasmids containing traE3 and/or traE4. Further, long-term culturing of spiroplasmas results in loss of insect transmissibility (Wayadande et al., 1993). It remains to be investigated whether CS-2B and PU8-17 can be transmitted by leafhoppers and/or are infectious to plants.

Expression results suggested that traE2 is part of a 10 kb transcript, and the genetic context showed that traE2 is part of a locus containing several predicted membrane proteins and mreB. Further, traE2 is located on the S. kunkelii chromosome, expressed in insects, plants and culture, and present in all tested S. kunkelii strains, all of which implies an indispensable role of traE2 in S. kunkelii. Interestingly, MreB is critical for the rod-shaped morphology of bacteria, and as previously observed (Bai and Hogenhout, 2002), mreB is present in the helical spiroplasmas but absent in the round, oval and flask-shaped mycoplasmas. A putative role of traE2 in S. kunkelii cell morphology should be confirmed as soon as a transformation and mutagenesis system is available for S. kunkelii.

The genetic context of traE3 and traE4 suggested that these genes are part of operons that also contain homologs of the adhesin SARP1. The repeated amino acid domain of SARP1 was shown to be located extracellularly and is potentially involved in binding of SARP1 to insect cells (Berg et al., 2001). TraE proteins are frequently part of conjugation systems and are primarily involved in pilus formation (Zatyka and Thomas, 1998), and structures that appear to be conjugation pili were observed to connect S. kunkelii cells to each other and to insect cells (Özbek et al., 2003). Thus, genes within the traE3 and traE4 loci may be involved in the formation of attachment structures. The absence of detectable full-length transcripts of traE3 and traE4 in Northern hybridization may be explained by the notion that genes encoding plasmid functions such as regulation of replication, stable maintenance in the host population and conjugation are seldom constitutively expressed (Bingle and Thomas, 2001).

The data described herein report for the first time the detection of transcripts of spiroplasmas grown in culture media and in their natural habitats, plants and insects. It is also the first detailed description of the presence of traE genes among Spiroplasma species. This work provides the basic knowledge for further research to determine whether the traE genes are involved in cell morphogenesis, adhesion, conjugation and/or recombination.

4.6 Acknowledgments

This work was funded by The Ohio State University - Ohio Agricultural Research and Development Center (OARDC) and the Ohio Plant Biotechnology Consortium (OPBC). The authors wish to thank Ian Holford at the Molecular and Cellular Imaging Center (MCIC) for help with writing the ORF extractor program, William Styer of the Department of Entomology at the OSU-OARDC for help with spiroplasma culture and insect and plant rearing, and Dr. Sophien Kamoun of the Department of Plant Pathology at the OSU-OARDC for providing the radioactive facility. The S. kunkelii gapped genome sequence was obtained from the S. kunkelii Genome Sequencing Project funded by the US Department of Agriculture, Agricultural Research Service, and the authors wish to thank Dr. Robert E. Davis and colleagues.

4.7 References

Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402.

Bai, X. and Hogenhout, S.A. (2002) A genome sequence survey of the mollicute corn stunt spiroplasma, Spiroplasma kunkelii. FEMS Microbiol. Lett. 210, 7-17.

Berg, M., Melcher, U. and Fletcher, J. (2001) Characterization of Spiroplasma citri adhesion related protein SARP1, which contains a domain of a novel family designated sarpin. Gene 275, 57-64.

Beven, L. and Wroblewski, H. (1997) Effect of natural amphipathic peptides on viability, membrane potential, cell shape and motility of mollicutes. Res. Microbiol. 148, 163-175.

Bingle, L.E. and Thomas, C.M. (2001) Regulatory circuits for plasmid survival. Curr. Opin. Microbiol. 4, 194-200.

Bové, J.M., Carle, P., Garnier, M., Laigret, F., Renaudin, J. and Saillard, C. (1989) Molecular and cellular biology of spiroplasmas, pp. 243-364. In R. F. Whitcomb and J. G. Tully (ed.), The mycoplasmas, vol. 5. Academic Press, New York.

Cabezon, E., Sastre, J.I. and de la Cruz, F. (1997) Genetic evidence of a coupling role for the TraG protein family in bacterial conjugation. Mol. Gen. Genet. 254, 400-406.

Calcutt, M.J., Lewis, M.S. and Wise, K.S. (2002) Molecular genetic analysis of ICEF, an integrative conjugal element that is present as a repetitive sequence in the chromosome of Mycoplasma fermentans PG18. J. Bacteriol. 184, 6929-6941.

Censini, S., Lange, C., Xiang, Z., Crabtree, J.E., Ghiara, P., Borodovsky, M., Rappuoli, R. and Covacci, A. (1996) cag, a pathogenicity island of Helicobacter pylori, encodes type I-specific and disease-associated virulence factors. Proc. Natl. Acad. Sci. USA 93, 14648-14653.

Chambaud, I., Heilig, R., Ferris, S., Barbe, V., Samson, D., Galisson, F., Moszer, I., Dybvig, K., Wroblewski, H., Viari, A., Rocha, E.P.C. and Blanchard, A. (2001) The complete genome sequence of the murine respiratory pathogen Mycoplasma pulmonis. Nucleic Acids Res. 29, 2145-2153.

Doi, M., Wachi, M., Ishino, F., Tomioka, S., Ito, M., Sakagami, Y., Suzuki, A. and Matsuhashi, M. (1988) Determinations of the DNA sequence of the mreB gene and of the gene products of the mre region that function in formation of the rod shape of Escherichia coli cells. J. Bacteriol. 170, 19-24.

Ebbert, M.A. and Nault, L.R. (2001) Survival in Dalbulus leafhopper vectors improves after exposure to maize stunting pathogens. Entomol. Exp. Appl. 100, 311-324.

Fletcher, J., Wayadande, A., Melcher, U. and Ye, F. (1998) The phytopathogenic mollicute-insect vector interface: A closer look. Phytopathology 88, 1351-1358.

Foissac, X., Bové, J.M. and Saillard, C. (1997) Sequence analysis of Spiroplasma phoeniceum and Spiroplasma kunkelii spiralin genes and comparison with other spiralin genes. Curr. Microbiol. 35, 240-243.

Fraser, C.M., Gocayne, J.D., White, O., Adams, M.D., Clayton, R.A., Fleischmann, R.D., Bult, C.J., Kerlavage, A.R., Sutton, G.G., Kelley, J.M., Fritchman, J.L., Weidman, J.F., Small, K.V., Sandusky, M., Fuhrmann, J.L., Nguyen, D.T., Utterback, T., Saudek, D.M., Phillips, C.A., Merrick, J.M., Tomb, J., Dougherty, B.A., Bott, K.F., Hu, P.C., Lucier, T.S., Peterson, S.N., Smith, H.O. and Venter, J.C. (1995) The minimal gene complement of Mycoplasma genitalium. Science 270, 397-403.

Gasparich, G.E. (2002) Spiroplasmas: Evolution, adaptation and diversity. Front. Biosci. 7, 619-640.

Glass, J.I., Lefkowitz, E.J., Glass, J.S., Heiner, C.R., Chen, E.Y. and Cassell, G.H. (2000) The complete sequence of the mucosal pathogen Ureaplasma urealyticum. Nature 407, 757-762.

Heger, A. and Holm, L. (2000) Rapid automatic detection and alignment of repeats in protein sequences. Proteins 41, 224-237.

Himmelreich, R., Hilbert, H., Plagens, H., Pirkl, E., Li, B.C. and Herrmann, R. (1996) Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res. 24, 4420-4449.

Hofreuter, D., Odenbreit, S. and Haas, R. (2001) Natural transformation competence in Helicobacter pylori is mediated by the basic components of a type IV secretion system. Mol. Microbiol. 41, 379-391.

Hruska, A.J. and Gomez Peralta, M. (1997) Maize response to corn leafhopper (Homoptera: Cicadellidae) infestation and achaparramiento disease. J. Econ. Entomol. 90, 604-610.

Jones, L.J., Carballido-Lopez, R. and Errington, J. (2001) Control of cell shape in bacteria: helical, actin-like filaments in Bacillus subtilis. Cell 104, 913-922.

Karplus, K., Karchin, R., Barrett, C., Tu, S., Cline, M., Diekhans, M., Grate, L., Casper, J. and Hughey, R. (2001) What is the value added by human intervention in protein structure prediction? Proteins 45, 86-91.

Kwon, M.-O., Wayadande, A.C. and Fletcher, J. (1999) Spiroplasma citri movement into the intestines and salivary glands of its leafhopper vector, Circulifer tenellus. Phytopathology 89, 1144-1151.

Lai, E.M. and Kado, C.I. (2000) The T-pilus of Agrobacterium tumefaciens. Trends Microbiol. 8, 361-369.

Laigret, F., Carle, P., Carrere, N., Garnier, M. and Bové, J.M. (2000) 13th International Congress of International Organization of Mycoplasmologists, abstract. 48.

Lee, I.M. and Davis, R.E. (1989) Serum-free media for cultivation of spiroplasma. Can. J. Microbiol. 35, 1092-1099.

Li, P.L., Hwang, I., Miyagi, H., True, H. and Farrand, S.K. (1999) Essential components of the Ti plasmid trb system, a type IV macromolecular transporter. J. Bacteriol. 181, 5033-5041.

Morton, T.M., Eaton, D.M., Johnston, J.L. and Archer, G.L. (1993) DNA sequence and units of transcription of the conjugative transfer gene complex (trs) of Staphylococcus aureus plasmid pG01. J. Bacteriol. 175, 4436-4447.

Okinaka, R., Cloud, K., Hampton, O., Hoffmaster, A., Hill, K., Keim, P., Koehler, T., Lamke, G., Kumano, S., Manter, D., Martinez, Y., Ricke, D., Svensson, R. and Jackson, P. (1999) Sequence, assembly and analysis of pX01 and pX02. J. Appl. Microbiol. 87, 261-262.

Özbek, E., Miller, S.A., Meulia, T. and Hogenhout, S.A. (2003) Infection and replication sites of Spiroplasma kunkelii (Class: Mollicutes) in midgut and Malpighian tubules of the leafhopper Dalbulus maidis. J. Invertebr. Pathol. 82, 167-175.

Razin, S., Yogev, D. and Naot, Y. (1998) Molecular biology and pathogenicity of mycoplasmas. Microbiol. Mol. Biol. Rev. 62, 1094-1156.

Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) Molecular Cloning: a laboratory manual. Cold Spring Harbor Laboratory Press.

Sonnhammer, E.L.L., von Heijne, G. and Krogh, A. (1998) A hidden Markov model for predicting transmembrane helices in protein sequences, pp. 175-182. In E. J. Glasgow, T. Littlejohn, F. Major, R. Lathrop, D. Sankoff, and C. Sensen (ed.), Proceedings of Sixth International Conference on Intelligent Systems for Molecular Biology. Menlo Park, CA: AAAI Press.

Swofford, D. (2001) PAUP* 4.0. Sinauer Associates.

Thompson, D.V., Melchers, L.S., Idler, K.B., Schilperoort, R.A. and Hooykaas, P.J. (1988) Analysis of the complete nucleotide sequence of the Agrobacterium tumefaciens virB operon. Nucleic Acids Res. 16, 4621-4636.

Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673-4680.

Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F. and Higgins, D.G. (1997) The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876-4882.

Wayadande, A.C., Shaw, M.E. and Fletcher, J. (1993) Tests of differential transmission of three Spiroplasma citri lines by the leafhopper, Circulifer tenellus. Phytopathology 83, 468.

Wright, F. (1990) The 'effective number of codons' used in a gene. Gene 87, 23-29.

Ye, F., Melcher, U., Rascoe, J.E. and Fletcher, J. (1996) Extensive chromosome aberrations in Spiroplasma citri strain BR3. Biochem. Genet. 34, 269-285.

Zatyka, M. and Thomas, C.M. (1998) Control of genes for conjugative transfer of plasmids and other mobile elements. FEMS Microbiol. Rev. 21, 291-319.

Zhao, Y., Hammond, R.W., Jomantiene, R., Dally, E.L., Lee, I.-M., Jia, H., Wu, H., Lin, S., Zhang, Z., Kenton, S., Najar, F.Z., Hua, A., Roe, B.A., Fletcher, J. and Davis, R.E. (2003) Gene content and organization of an 85-kb DNA segment from the genome of the phytopathogenic mollicute Spiroplasma kunkelii. Mol. Gen. Genomics 269, 592-602.

Table 4.1 Basic features of traE genes in S. kunkelii M2 strain.

Gene ID | Length (nt) | GC content | Nc value (a) | E value of Blastx best hit (b) | Similarity to TraE1 (c) | Similarity to TraE2 (c) | Similarity to TraE3 (c) | Similarity to TraE4 (c)
traE1 | 2,697 | 23% | 34.98 | 2e-97 | - | 54% | 48% | 49%
traE2 | 2,664 | 26% | 33.40 | 3e-89 | 54% | - | 43% | 43%
traE3 | 2,520 | 28% | 40.85 | 7e-57 | 48% | 43% | - | 96%
traE4 | 2,520 | 28% | 40.08 | 7e-63 | 49% | 43% | 96% | -

(a) Predicted by Chips in EMBOSS. The Nc value is an index of codon usage ranging from 20 (the strongest codon bias) to 61 (the lowest codon bias).
(b) Performed with the local BLAST package against a local NCBI nr database. The best hit for all four genes was the TrsE-like protein of M. pulmonis (NP_326214).
(c) Deduced TraE amino acid sequence similarity is the percentage of positives predicted by the Pairwise BLAST program using the BLOSUM62 matrix on the NCBI web site.

Fig. 4.1 ClustalW alignment of the deduced protein sequences of traE1, traE2, traE3 and traE4 of S. kunkelii strain M2. Three partially conserved transmembrane domains are marked, the ATPase domains are underlined, and the conserved DEAH sequences are boxed.

Fig. 4.2 Phylogenetic analyses of the TraE protein sequences from S. kunkelii M2 strain and other organisms. Numbers at the nodes indicate percentage recovery of these nodes per 1,000 bootstrap replicates. The proteins and accession numbers are: Bacillus anthracis pX02.09 (NP_053164); Bifidobacterium longum DJ010A HP (hypothetical protein) (ZP_00121677); Clostridium acetobutylicum TrsE (NP_348666); Clostridium perfringens CHP (conserved hypothetical protein) (NP_150039); Enterococcus faecalis TrsE (AAF72347); Enterococcus faecalis HP (CAC29183); Enterococcus faecium HP (ZP_00037379); Lactococcus lactis TrsE (NP_047296); Leuconostoc mesenteroides HP (ZP_00062942); Mycoplasma pulmonis trsE (NP_326214); Mycoplasma fermentans TraE (AAN85227); Plasmid pIP501 ORF5 (AAA99470); Plasmid R100 TraC (NP_052960); Proteus vulgaris PBP (pilus biogenesis protein) (NP_640173); Providencia rettgeri TraC (AAM08001); Salmonella typhimurium LT2 CT (conjugative transfer) protein (NP_490573); Streptococcus agalactiae 2603VR Orf26 (NP_688287); Streptococcus agalactiae NEM316 unknown (NP_735797); Streptococcus pneumoniae orf26 (AAG38042); Staphylococcus aureus TrsE (F36891); Sulfolobus tokodaii CHP (NP_377258); Sulfolobus sp. HP (T31035); Thermoanaerobacter tengcongensis CHP (NP_623664); Vibrio cholerae SPA (involved in sex pilus assembly) (AAL59681). Escherichia coli TraC (AAB61935) served as the outgroup.

Fig. 4.3 Detection of traE sequences by Southern blot hybridization of digested genomic DNA of S. kunkelii strains M2, CS-2B, FL-80 and PU8-17. (A) Schematic graphs of S. kunkelii M2 traE ORFs (open arrows) depicting the recognition sites of restriction enzymes used in genomic DNA digestions and the positions of primers (closed arrows above traE ORFs) used for probe synthesis. (B) Southern blot hybridizations. Genomic DNAs of S. kunkelii strains M2 (lanes 1), CS-2B (lanes 2), FL-80 (lanes 3), PU8-17 (lanes 4) and S. citri (lanes 5) were digested with restriction enzymes and hybridized to DIG-labeled probes as indicated in A. The sizes of the DIG-labeled SPP1 DNA EcoRI markers (lanes M) are indicated in base pairs at the left of panel traE1. The traE3 and traE4 restriction fragments are indicated with arrows and fragment lengths in base pairs (bp) at the right of panel traE3/4.

Fig. 4.4 Detection of traE sequences on chromosomal and plasmid DNA of S. kunkelii strains M2, CS-2B, and PU8-17. (A) EB-stained pulsed field gel of genomic DNA. (B) Southern blots of pulsed field gels hybridized to traE1, traE2 and traE3/4 probes. (C) Southern blots of agarose gels containing S. kunkelii DNA isolated with Qiagen Midiprep kit (Qiagen, Inc.) hybridized to traE1, traE2, and traE3/4 probes. Saccharomyces cerevisiae (baker’s yeast) genome was used as marker. Genomic DNA was extracted from S. kunkelii M2 strain (lane 1), CS-2B strain (lane 2) and PU8-17 strain (lane 3) and hybridized with probes generated by [α-32P]-dCTP labeling of PCR products (Fig. 4.3A).

Fig. 4.5 Detection of S. kunkelii spiralin gene and traE transcripts on Northern blots of size-separated total RNA samples. (A) Northern blot probed with traE2. (B) Northern blot probed with traE3. Probes were generated by [α-32P]-dCTP labeling of the full-length PCR product of the S. kunkelii M2 spiralin gene and of PCR products obtained with primers traE2 forward1 and reverse1 and traE3 forward1 and reverse1. Detection of transcripts of the constitutively expressed spiralin gene was used as an RNA quality and loading control, and to compare traE expression levels. Total RNA was extracted from healthy leafhoppers (lane 1), leafhoppers infected with S. kunkelii M2 strain (lane 2), S. kunkelii M2 culture (lane 3), plants infected with S. kunkelii M2 strain (lane 4), and healthy plants (lane 5). Estimated sizes of bands in A and B are marked at the left side of blot A.

Fig. 4.6 Genetic contexts of traE genes in the genome of S. kunkelii CR2-3x. ORFs are indicated with open arrows, and identities of ORFs with significant hits to genes in GenBank are shown above the ORFs. Genes that encode proteins with transmembrane regions are marked with an asterisk underneath the ORF.

CHAPTER 5

Comparative genomics identifies genes shared by distantly related insect-transmitted plant pathogenic mollicutes

Xiaodong Bai1, Jianhua Zhang1, Ian R. Holford2 and Saskia A. Hogenhout1

1Department of Entomology, 2Molecular and Cellular Imaging Center (MCIC), The Ohio State University - OARDC, Wooster, OH 44691, U.S.A.

5.1 Abstract

Phytoplasmas and spiroplasmas are distantly related insect-transmitted plant pathogens within the class Mollicutes. Genome sequencing projects of phytoplasma strain Aster Yellows Witches’ Broom (AY-WB) and Spiroplasma kunkelii are near completion. Complete genome sequences of seven obligate animal and human pathogenic mollicutes (Mycoplasma and Ureaplasma spp.) and of OY phytoplasma have been reported. Putative ORFs predicted from the genome sequences of AY-WB and S. kunkelii were compared to those of the completed genomes. This resulted in the identification of at least three ORFs present in AY-WB, OY and S. kunkelii but not in the obligate animal and human pathogenic mollicutes. Moreover, we identified ORFs that seemed more closely related between AY-WB and S. kunkelii than to their mycoplasma counterparts. Phylogenetic analyses using parsimony were employed to study the origin of these genes, resulting in the identification of one gene that may have undergone horizontal gene transfer. The possible involvement of these genes in plant pathogenicity is discussed.

5.2 Introduction

Mollicutes, characterized by small genomes and the absence of a cell wall, are believed to have diverged from a Gram-positive bacterial ancestor in the lactobacillus group (Woese, 1997; Razin et al., 1998). Within the class Mollicutes, an early evolutionary split occurred between the AAA (Asteroleplasma, Anaeroplasma, and Acholeplasma) branch and the SEM (Spiroplasma, Entomoplasma, and Mycoplasma) branch, both of which independently underwent genome reductions (Razin et al., 1998). Apparently, the conversion of UGA from a stop codon to a tryptophan codon in the SEM branch occurred shortly after the split of the two branches. The SEM branch contains several genera, including Spiroplasma, Entomoplasma, Mesoplasma, Mycoplasma, and Ureaplasma. Spiroplasmas are believed to be evolutionarily early mollicutes that did not undergo as many gene loss events as members of other genera (Razin et al., 1998).

Since the start of the genomic era, mollicutes have attracted much attention because of their small genomes and their clinical and agricultural impact. Six mollicute genomes, five Mycoplasma spp. and one Ureaplasma sp., have been fully sequenced, representing obligate human and mammal pathogens of the genus Mycoplasma of the SEM branch. At the time of preparation of this manuscript, genome sequencing projects of three other mycoplasmas were in progress: the rodent polyarthritis pathogen Mycoplasma arthritidis, the contagious caprine pleuropneumonia (CCPP) pathogen M. capricolum, and the contagious bovine pleuropneumonia (CBPP) pathogen M. mycoides subsp. mycoides SC (small colony). At the time of the revision of this manuscript, the complete genome of M. mycoides subsp. mycoides SC, the causative agent of CBPP, was published (Westberg et al., 2004). Further, the Onion Yellows (OY) phytoplasma genome was completely sequenced (Oshima et al., 2004).

Genome sequencing projects are in progress for Spiroplasma kunkelii (http://www.genome.ou.edu/spiro.html) and Aster Yellows Witches’-Broom (AY-WB, http://www.oardc.ohio-state.edu/phytoplasma). S. kunkelii and phytoplasmas are insect-transmitted plant pathogens that replicate in both insect vectors and plant hosts, and they are strikingly similar in their infection patterns of insects and plants. Both are restricted to phloem tissues of plant hosts, from which they are acquired by phloem-feeding insects, and subsequently invade and replicate in the cells of the insect gut and other tissues. Although Spiroplasma species and all phytoplasmas described so far share similar infection patterns and environmental niches, they are distantly related and belong to two different branches of the class Mollicutes. Based on phylogenies of 16S rDNA and tuf genes, membrane composition, codon usage and metabolism (Razin et al., 1998), spiroplasmas were grouped in the SEM branch with Mycoplasma and Ureaplasma spp., while phytoplasmas were grouped in the AAA branch with Acholeplasma spp.

This study was initiated based on the hypothesis that genes shared by evolutionarily divergent insect-transmitted plant pathogens but absent from obligate human and animal pathogens are likely important for insect transmission and/or plant pathogenicity. Using computer-assisted analysis, we have identified at least three open reading frames (ORFs) that are present in S. kunkelii and AY-WB but absent from mycoplasmas. We have also identified ORFs that do not match the 16S rDNA and tuf phylogenies. The involvement of the ORFs in pathogenicity is discussed.

5.3 Materials and Methods

5.3.1 Genome sequences

The 16 contigs totaling 695 kb of the estimated 800 kb AY-WB genome were obtained from the phytoplasma genome sequencing project website (http://www.oardc.ohio-state.edu/phytoplasma). The 46 contigs totaling 1.5 Mb of the estimated 1.6 Mb S. kunkelii CR2-3x genome were obtained from the publicly accessible S. kunkelii genome sequencing project website (http://www.genome.ou.edu/spiro.html). Complete mycoplasma genome sequences were obtained from GenBank, including Mycoplasma genitalium (NC_000908) (Fraser et al., 1995), M. pneumoniae (NC_000912) (Himmelreich et al., 1996), U. urealyticum (NC_002162) (Glass et al., 2000), M. pulmonis (NC_002771) (Chambaud et al., 2001), M. penetrans (NC_004432) (Sasaki et al., 2002), and M. gallisepticum (NC_004829) (Papazisi et al., 2003).

5.3.2 Comparative genome analysis

Genome comparisons were conducted as illustrated in Fig. 5.1. Genome sequences were downloaded onto a Linux workstation and used as input files for the ORF extractor program (http://www.oardc.ohio-state.edu/mcic/bioinformatics/bio_software/bio_software.html#ORF). ORFs were defined as starting with ATG and ending with in-frame TAG, TAA, or TGA for AY-WB, or TAG or TAA for S. kunkelii and all Mycoplasma and Ureaplasma spp. (Razin et al., 1998). ORFs longer than 90 bp were extracted in FASTA format. Subsequently, only the longest ORF within a set of ORFs having stop codons at the same position was extracted. The nucleotide sequences of the ORFs were translated into amino acid (aa) sequences using a Perl translation program, with translation table 11 (bacterial code) for AY-WB and translation table 4 (the mold mitochondrial code, which is also the Mycoplasma/Spiroplasma code in which UGA encodes tryptophan) for all others (Benson et al., 2003; Wheeler et al., 2003). This generated datasets AYdb for AY-WB, Skdb for S. kunkelii, and mycoprotdb for the five Mycoplasma spp. and U. urealyticum.

Subsequently, AYdb and Skdb were compared using the stand-alone BLAST (Basic Local Alignment Search Tool) package (Altschul et al., 1997) with an expectation (E) value threshold of 10^-8. Proteins having significant similarity (E < 10^-8) were extracted from AYdb to generate AY_Skdb and from Skdb to generate Sk_AYdb. Subsequently, AY_Skdb and Sk_AYdb were compared to mycoprotdb, and proteins with non-significant hits (E > 10^-8) or no hits were extracted from AY_Skdb to generate AY_Sk-mycoprotdb and from Sk_AYdb to generate Sk_AY-mycoprotdb. Proteins within AY_Sk-mycoprotdb and Sk_AY-mycoprotdb were annotated based on sequence similarity searches against the NCBI nr database and compared manually to identify common protein sequences. Identified proteins were validated by manual comparison with the annotated genomes of mycoplasmas and ureaplasma.

After this study was completed, the genomes of OY phytoplasma (Oshima et al., 2004) and M. mycoides subsp. mycoides SC strain (Westberg et al., 2004) were sequenced. The identified proteins were searched against the annotated proteins of these organisms using the BLAST algorithm (Altschul et al., 1997).
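The ORF definition used in this section can be illustrated with a minimal Python sketch. It is not the ORF extractor program itself, whose source is not reproduced here; the function names, the choice to scan both strands, and the decision to count the stop codon in the ORF length are assumptions made for illustration. Keeping the first ATG seen after the previous in-frame stop yields the longest ORF for each stop codon position.

```python
from typing import List, Set

# Stop codon sets as described in the text (names are illustrative).
STOPS_AYWB = {"TAG", "TAA", "TGA"}  # AY-WB: TGA is a stop codon
STOPS_SEM = {"TAG", "TAA"}          # S. kunkelii, mycoplasmas: TGA encodes Trp

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_strand(seq: str, stops: Set[str], min_len: int) -> List[str]:
    """Collect the longest ATG-to-stop ORF for every in-frame stop codon."""
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == "ATG" and start is None:
                start = i
            elif codon in stops:
                if start is not None:
                    orf = seq[start:i + 3]  # assumption: stop codon included
                    if len(orf) > min_len:
                        orfs.append(orf)
                start = None
    return orfs


def extract_orfs(seq: str, stops: Set[str], min_len: int = 90) -> List[str]:
    """Scan all six reading frames (assumed) and return ORFs longer than min_len."""
    seq = seq.upper()
    forward = orfs_in_strand(seq, stops, min_len)
    reverse = orfs_in_strand(reverse_complement(seq), stops, min_len)
    return forward + reverse
```

Under these assumptions, the same contig can give different ORF sets depending on the stop codon set: a TGA that terminates an ORF under the AY-WB definition is read through under the S. kunkelii definition, so the ORF extends to the next in-frame TAG or TAA.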

Negative log plots of the best E values for each query were generated for searches of (i) AYdb against Skdb and mycoprotdb and (ii) Skdb against AYdb and mycoprotdb. For comparable quantitative assessment, an E value of 0.0 was set to 10^-200, and proteins with no significant hits were assigned E values of 1000.
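The E value transformation described above can be sketched in Python as follows. The function and constant names are illustrative, as is the use of None to represent a query with no hit; only the two substitution values (10^-200 and 1000) come from the text.

```python
import math
from typing import Dict, List, Optional, Tuple

ZERO_E_SUBSTITUTE = 1e-200  # used when BLAST reports an E value of 0.0
NO_HIT_E = 1000.0           # assigned to queries without a significant hit


def neg_log10_e(e_value: Optional[float]) -> float:
    """Return -log10(E), applying the two substitutions described in the text."""
    if e_value is None:
        e_value = NO_HIT_E
    elif e_value == 0.0:
        e_value = ZERO_E_SUBSTITUTE
    return -math.log10(e_value)


def plot_points(
    best_vs_partner: Dict[str, Optional[float]],
    best_vs_myco: Dict[str, Optional[float]],
) -> List[Tuple[str, float, float]]:
    """Pair, per query, -log10(E) against the partner dataset and mycoprotdb."""
    return [
        (query, neg_log10_e(e), neg_log10_e(best_vs_myco.get(query)))
        for query, e in best_vs_partner.items()
    ]
```

With this convention, a query with no hit plots at -3 on the corresponding axis and an exact match at 200, so every query has a finite coordinate on both axes.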

5.3.3 Phylogenetic analysis

Protein sequences for phylogenetic analysis were extracted from the NCBI Entrez database. Sequence alignments were produced using ClustalW (Thompson et al., 1994) and used as inputs for phylogenetic analysis using the PAUP (Phylogenetic Analysis Using Parsimony) program (Swofford, 2001).

5.3.4 Accession numbers

AY-WB amino acid sequences identified in this study were deposited in GenBank with the accession numbers as follows: AAA type ATPase (AtA), AY533109; cmp binding factor (CBF), AY533110; cytosine deaminase, AY533111; hypothetical protein, AY533112; cation transport P-ATPase, AY533113; polynucleotide phosphorylase (PNPase), AY533114; ppGpp synthetase, AY533115; YlxR protein, AY533116.

5.4 Results

5.4.1 Extraction of ORFs

The method employed for ORF extraction resulted in more ORFs than currently annotated in the genomes. For both S. kunkelii (1.5 Mb) and AY-WB (700 kb), the number of predicted ORFs exceeded that expected from an average ORF size of 1 kb (Casjens, 1998). No genome annotation is perfect; for example, re-annotation of Mycoplasma pneumoniae resulted in more ORFs and function annotations (Dandekar et al., 2000). We expect that, by using the ORF extractor program, we have included in our study most of the annotated ORFs (complete or partial) in the GenBank database, since most ORFs that start with alternative start codons have an in-frame ATG somewhere in the ORF. We have verified this assessment. Of the 4,332 mycoplasma and ureaplasma ORFs downloaded from GenBank, 991 ORFs (22.7%) start with an alternative start codon. Of the ORFs starting with an alternative start codon, 984 ORFs (99.3%) contained an in-frame ATG somewhere in the ORF. Thus, only 0.7% of the putative ORFs starting with alternative start codons present in the GenBank database have been excluded from the ORF extractor database. This is only 0.2% of all the 4,332 annotated mycoplasma and ureaplasma (i.e. members of M. pneumoniae and M. hominis groups and Ureaplasma urealyticum) ORFs present in GenBank. To minimize the number of false-positives produced by the method, only the longest ORF within a set of ORFs having stop codons at the same position was extracted for subsequent analysis. Translation of the ORFs into amino acid sequences generated AYdb for AY-WB, Skdb for S. kunkelii, and mycoprotdb for the five Mycoplasma species and U. urealyticum.
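The check on alternative start codons described above can be sketched as follows. This is a hypothetical Python re-implementation, not the script used in the study; it assumes each annotated ORF is given as a nucleotide string in its own reading frame, and the function names are illustrative.

```python
from typing import Iterable, Tuple


def has_inframe_atg(orf: str) -> bool:
    """True if an ATG occurs in frame downstream of the first codon."""
    orf = orf.upper()
    return any(orf[i:i + 3] == "ATG" for i in range(3, len(orf) - 2, 3))


def summarize_start_codons(orfs: Iterable[str]) -> Tuple[int, int, int]:
    """Return (total ORFs, ORFs with a non-ATG start, those with an in-frame ATG)."""
    orfs = list(orfs)
    alternative = [o for o in orfs if not o.upper().startswith("ATG")]
    recoverable = [o for o in alternative if has_inframe_atg(o)]
    return len(orfs), len(alternative), len(recoverable)
```

An annotated ORF with an alternative start codon and an internal in-frame ATG is still recovered by an ATG-only extractor, although as a truncated (partial) ORF; only ORFs without any in-frame ATG are missed entirely.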

5.4.2 Identification of four proteins that are present in AY-WB and S. kunkelii but absent from mycoplasmas

Amino acid sequence similarity searches were employed to identify proteins shared between AY-WB and S. kunkelii. AYdb and Skdb were searched against each other using the stand-alone BLAST package (Altschul et al., 1997). In total, 290 proteins within AYdb had significant similarity (E-value < 10^-8) to proteins within Skdb, whereas 260 proteins within Skdb had significant similarity to proteins within AYdb. The E (expectation) value in a BLAST search is defined as “the number of different alignments with scores equivalent to or better than S that are expected to occur in a database search by chance”, and it depends on the size of the search database and the scoring system (Altschul et al., 1994). Thus, it was expected that the number of proteins with significant similarity would differ between the two independent searches because the databases differ in size.

To identify shared AY-WB and S. kunkelii proteins that are not present in animal and human pathogenic mycoplasmas and ureaplasmas, the two sets of shared proteins (AY_Skdb and Sk_AYdb) were searched against mycoprotdb using the blastp algorithm. Sequences that had nonsignificant similarity (E-value > 10^-8) or no similarity to proteins in mycoprotdb were extracted from Skdb and AYdb. This resulted in two datasets, AY_Sk_-mycoprotdb with 14 entries and Sk_AY_-mycoprotdb with 7 entries. Plotting the negative logs of the blastp E-values showed that the majority of the predicted protein sequences shared by AY-WB and S. kunkelii had homologs in the five Mycoplasma spp. and U. urealyticum. However, 9 AY-WB and 8 S. kunkelii proteins did not have significant similarity to proteins in mycoprotdb (Fig. 5.2). Among these, four proteins present in both the AY_Sk_-mycoprotdb and Sk_AY_-mycoprotdb datasets were analyzed because they are similar in length and have significant similarity to proteins in the NCBI nr database (closed diamonds, Fig. 5.2). The four proteins were identified as polynucleotide phosphorylase (PNPase), cmp-binding factor (CBF), cytosine deaminase, and YlxR protein (Table 5.1). The PNPase protein sequences of AY-WB and S. kunkelii were 62% (452/719) similar, the CBFs 59% (138/231), the cytosine deaminases 60% (86/141), and the YlxR proteins 61% (46/74).

To ensure that the sequences are not present in the genomes of mycoplasmas and ureaplasmas, sequences in common between AY-WB and S. kunkelii were searched against the mycoplasma and ureaplasma GenBank databases. Further, the annotated protein databases of Mycoplasma spp. and U. urealyticum were searched by keywords. Both analyses showed that no proteins for these organisms were annotated as PNPase, CBF, cytosine deaminase, or YlxR protein. Thus, these data suggested that these four genes are present in AY-WB and S. kunkelii but absent from the genomes of the five Mycoplasma spp. and U. urealyticum. All four proteins have homologs in the OY phytoplasma genome. However, all but PNPase also have homologs in the M. mycoides subsp. mycoides SC strain.
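The filtering logic of this section (keep proteins with a significant hit in the partner database, then retain those without a significant hit in mycoprotdb) can be sketched as follows. This is only an illustration under stated assumptions: hits are assumed to be available as tab-separated lines with the query ID in column 1 and the E-value in column 11, as in the usual 12-column tabular BLAST layout, and all sequence IDs shown are hypothetical.

```python
E_CUTOFF = 1e-8  # significance threshold stated in the text


def parse_evalue(text):
    """Convert an E-value string to float; some BLAST output prints 'e-154' for 1e-154."""
    text = text.strip()
    return float("1" + text if text.startswith("e") else text)


def best_evalue_per_query(tabular_lines):
    """Map each query ID to its lowest E-value.

    Assumes tab-separated hit lines with the query ID in column 1 and the
    E-value in column 11 (an assumption of this sketch).
    """
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, evalue = fields[0], parse_evalue(fields[10])
        if query not in best or evalue < best[query]:
            best[query] = evalue
    return best


def shared_but_absent(partner_hits, myco_hits, cutoff=E_CUTOFF):
    """Queries with a significant partner hit and no significant mycoprotdb hit."""
    shared = [q for q, e in partner_hits.items() if e < cutoff]
    # A query with no mycoprotdb hit at all is treated as having E = infinity.
    return sorted(q for q in shared if myco_hits.get(q, float("inf")) > cutoff)


if __name__ == "__main__":
    def hit(query, subject, evalue):
        # Hypothetical hit line; only columns 1 and 11 matter here.
        return "\t".join([query, subject, "60.0", "100", "40", "0",
                          "1", "100", "1", "100", evalue, "200"])

    vs_partner = best_evalue_per_query([hit("ay_1", "sk_9", "1e-50"),
                                        hit("ay_2", "sk_3", "1e-30"),
                                        hit("ay_3", "sk_7", "0.5")])
    vs_myco = best_evalue_per_query([hit("ay_2", "myco_4", "1e-20")])
    print(shared_but_absent(vs_partner, vs_myco))
```

Running the same two steps with the query and partner databases swapped gives the second dataset, which is why the two result sets can differ in size.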

5.4.3 Identification of proteins more closely related between AY-WB and S. kunkelii

Four proteins were identified from the negative log plots that were more similar between AY-WB and S. kunkelii than to mycoplasmas (open circles, Fig. 5.2). These proteins were identified as ppGpp synthetase, HAD hydrolase, AtA (AAA type ATPase), and P-type Mg2+ transport ATPase (Table 5.2). Amino acid sequence similarities between AY-WB proteins and S. kunkelii proteins were 59% (305/503) for ppGpp synthetase, 59% (449/750) for HAD hydrolase, 88% (362/407) for AtA, and 56% (512/902) for P-type Mg2+ transport ATPase. All four proteins have homologs in the genomes of OY phytoplasma and the M. mycoides subsp. mycoides SC strain, except for AtA, which is lacking from OY phytoplasma.
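The negative log transformation behind the plots in Fig. 5.2 can be illustrated with a short sketch. The cap applied to E-values reported as 0 and the `margin` used to call a protein "more similar to the partner than to mycoplasmas" are hypothetical values chosen for illustration; the selection criteria actually used are those given in Materials and Methods.

```python
import math


def neg_log10_evalue(evalue, cap=200.0):
    """Return -log10(E); E-values reported as 0 are placed at `cap` (an assumed value)."""
    if evalue <= 0.0:
        return cap
    return min(-math.log10(evalue), cap)


def plot_point(e_partner, e_myco):
    """Coordinates of one protein: x = partner database, y = mycoprotdb."""
    return neg_log10_evalue(e_partner), neg_log10_evalue(e_myco)


def closer_to_partner(e_partner, e_myco, margin=20.0):
    """True if the point lies at least `margin` log units below the diagonal.

    `margin` is a hypothetical threshold used only in this sketch.
    """
    x, y = plot_point(e_partner, e_myco)
    return x - y >= margin


if __name__ == "__main__":
    print(plot_point(1e-94, 1e-40))
    print(closer_to_partner(1e-94, 1e-40))
```

Points on the diagonal have equally good hits in both databases, so the distance below the diagonal is a simple measure of how much better the partner hit is.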

5.4.4 Phylogenetic analysis of proteins present in AY-WB and S. kunkelii but absent from mycoplasmas

Phylogenetic analyses were performed to investigate the origin of the proteins identified in this study. The PNPases from AY-WB and S. kunkelii clustered with those from the Gram-positive Bacillus and Streptococcus spp. and were clearly distinct from those of Gram-negative bacteria (Fig. 5.3B). Thus, the PNPase phylogenetic tree is consistent with the proposed evolutionary status of mollicutes as descendants of Gram-positive bacterial ancestors (Weisburg et al., 1989; Woese, 1987). Phylogenetic analysis of CBFs (Fig. 5.3C) resulted in a tree different from the phylogenetic tree based on 16S rDNA sequences (Fig. 5.3A), with the CBF sequences of AY-WB and S. kunkelii separated by CBF sequences of Gram-positive bacteria. Phylogenetic analyses of cytosine deaminases and YlxR proteins resulted in trees in which most branches had low bootstrap values (data not shown).


5.4.5 Phylogenetic analysis of proteins more closely related between AY-WB and S. kunkelii

Phylogenetic analysis was employed to investigate the possible origins of the four proteins that were more closely related between AY-WB and S. kunkelii than to mycoplasmas. Most branches of the phylogenetic trees generated using ppGpp synthetase, HAD hydrolase, and P-type Mg2+ transport ATPase had bootstrap values lower than 50% (data not shown). However, bootstrap values of the phylogenetic tree based on AtA sequences were statistically significant. Interestingly, in the AtA phylogeny, the phytoplasma AtA sequence clustered with the AtA sequence of S. kunkelii within a cluster of AtA sequences of other mycoplasmas belonging to the SEM branch (Fig. 5.4), which differs from the 16S rDNA phylogeny (Fig. 5.3A). The AtA homolog is present in M. mycoides subsp. mycoides SC, which is also a member of the SEM branch of mollicutes, but it is absent from the OY phytoplasma genome.
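The bootstrap values discussed above come from resampling alignment columns with replacement and rebuilding the tree for each pseudo-replicate. The sketch below shows only the resampling step and the final percentage calculation; tree construction (done with PAUP in this study) is not reproduced, and the toy alignment and function names are assumptions made for illustration.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement to make one pseudo-replicate.

    `alignment` maps sequence names to aligned sequences of equal length.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()
    # The same column picks are applied to every sequence so columns stay intact.
    picks = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def bootstrap_support(clade_found):
    """Percentage of replicates (an iterable of booleans) in which a clade was recovered."""
    flags = list(clade_found)
    return 100.0 * sum(flags) / len(flags)


if __name__ == "__main__":
    toy = {"seqA": "MKLVTA", "seqB": "MRLITA", "seqC": "MKIVSA"}  # hypothetical alignment
    print(bootstrap_replicate(toy, random.Random(1)))
    print(bootstrap_support([True, True, False, True]))
```

With 1,000 replicates, a branch labelled 50% is one that was recovered in 500 of the resampled trees, which is why branches below that value are treated as poorly supported.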

5.5 Discussion

In this study, we have identified several proteins that appear to be present in AY-WB and S. kunkelii but absent from Mycoplasma spp. and U. urealyticum. These proteins are PNPase, CBF, cytosine deaminase, and YlxR. They are also present in the genome of OY phytoplasma, another insect-transmitted plant pathogenic mollicute closely related to AY-WB.

PNPase is an exoribonuclease belonging to the PDX family, which also includes RNase PH (Zuo and Deutscher, 2001). Most prokaryotes have PNPase homologs; however, thus far, none have been sequenced from mycoplasmas and Ureaplasma urealyticum. PNPase genes are also present in the genomes of plants (Li et al., 1998) and Drosophila (Adams et al., 2000). PNPases are highly conserved proteins that are involved in mRNA degradation and regulation of gene expression (Carpousis, 2002). PNPase has been shown to be a global regulator of virulence factors of Salmonella enterica, because a single point mutation of the PNPase gene resulted in a significant decrease in the efficiency of invasion and intracellular replication of this bacterium (Clements et al., 2002). Both AY-WB and S. kunkelii invade and replicate in cells of insects and plants (Özbek et al., 2003) and, consequently, have to adjust their gene expression patterns continuously to different environments. In contrast, the Mycoplasma and Ureaplasma spp. are restricted to animal hosts, in which they are able to attach to, and most can invade, epithelial cell layers but do not appear to spread systemically throughout their hosts (Razin et al., 1998). Thus, PNPases in the plant pathogenic mollicutes AY-WB, S. kunkelii, and OY phytoplasma could be important for gene expression regulation allowing adaptation to multiple environmental niches, including insect gut lumen, insect cells, plant phloem, and plant cells. However, the involvement of PNPase in the regulation of virulence of the plant pathogenic mollicutes AY-WB and S. kunkelii remains to be investigated. At this time, spiroplasmas are more suitable candidates for such an investigation because, unlike phytoplasmas, they can be cultured (Saglio et al., 1973) and transformed (Lartigue et al., 2002).

CBF is a protein identified in Staphylococcus aureus. It binds to the cmp sequence, a replication enhancer identified in the pT181 plasmid of S. aureus, to stimulate plasmid replication (Zhang et al., 1997). Spiroplasmas and phytoplasmas have plasmids (Razin et al., 1998), whereas plasmids have not been reported in members of the M. pneumoniae and M. hominis groups and Ureaplasma urealyticum, which do not have CBF. Interestingly, a CBF homolog is present within the recently released complete genome of the M. mycoides subsp. mycoides SC strain (Westberg et al., 2004). Although plasmids have not been reported in the SC type strain, plasmids are common in Mycoplasma mycoides subsp. mycoides (King and Dybvig, 1994; Djordjevic et al., 2001). It is possible that CBF is required for regulation of plasmid replication in spiroplasmas and phytoplasmas. Interestingly, spiroplasma and phytoplasma plasmids appear to harbor virulence factors (Melcher et al., 1999; Oshima et al., 2002).

Cytosine deaminase is an enzyme involved in nucleotide metabolism and can affect protein synthesis if transiently expressed in human cells (Kreuzer et al., 1996). Thus, S. kunkelii, AY-WB, and OY phytoplasma apparently have an additional housekeeping gene that is absent from other mollicutes sequenced so far. YlxR protein is expressed from the nusA/infB operon in bacteria and is proposed to be an RNA-binding protein (Osipiuk et al., 2001).

We also identified four AY-WB and S. kunkelii ORFs that appear to be more closely related to each other than to their mycoplasma counterparts. Of these, the AtA sequence is the most interesting, because the phylogenetic tree suggests that phytoplasmas might have obtained the AtA sequence from spiroplasmas, possibly S. kunkelii, by horizontal gene transfer. This hypothesis is supported by additional data. First, AtA is absent from the OY phytoplasma genome (Oshima et al., 2004). OY phytoplasma is a plant pathogen in Japan, where S. kunkelii does not occur. In the American continent, however, S. kunkelii and AY-WB co-occur and share similar insect and plant host ranges. Second, the AtA sequences of both AY-WB and S. kunkelii are flanked by insertion sequences, which are often part of mobile elements (Mahillon and Chandler, 1998). AY-WB AtA is flanked by a truncated transposase gene at its 5’ end and an intact transposase gene at its 3’ end, and S. kunkelii AtA is located in an IS (insertion sequence) element-rich region.

In summary, the comparative genomics study presented herein successfully identified proteins that are common among insect-transmitted plant pathogenic mollicutes. Further studies of these proteins may elucidate their roles in insect transmission and plant pathogenicity.

5.6 Acknowledgments

The authors wish to thank Dr. Sophien Kamoun in the Department of Plant Pathology, OSU-OARDC, for constructive advice; Dr. Tea Meulia for setup of the Linux workstation and design of the ORF Extractor program; and B. A. Roe, S. P. Lin, H. G. Jia, H. M. Wu, D. Kupfer, and R. E. Davis and the Spiroplasma kunkelii Genome Sequencing Project, funded by US Department of Agriculture, Agricultural Research Service Project Number 1275-22000-144-02, for the S. kunkelii genome sequences. This research was supported by the OSU-OARDC Research Enhancement Competitive Grants Program and MCIC.


5.7 References

Adams, M.D., Celniker, S.E., Holt, R.A., Evans, C.A., Gocayne, J.D., Amanatides, P.G., Scherer, S.E., Li, P.W., Hoskins, R.A., Galle, R.F., George, R.A., Lewis, S.E., Richards, S., Ashburner, M., Henderson, S.N., Sutton, G.G., Wortman, J.R., Yandell, M.D., Zhang, Q., Chen, L.X., Brandon, R.C., Rogers, Y.H., Blazej, R.G., Champe, M., Pfeiffer, B.D., Wan, K.H., Doyle, C., Baxter, E.G., Helt, G., Nelson, C.R., Gabor, G.L., Abril, J.F., Agbayani, A., An, H.J., Andrews-Pfannkoch, C., Baldwin, D., Ballew, R.M., Basu, A., Baxendale, J., Bayraktaroglu, L., Beasley, E.M., Beeson, K.Y., Benos, P.V., Berman, B.P., Bhandari, D., Bolshakov, S., Borkova, D., Botchan, M.R., Bouck, J., Brokstein, P., Brottier, P., Burtis, K.C., Busam, D.A., Butler, H., Cadieu, E., Center, A., Chandra, I., Cherry, J.M., Cawley, S., Dahlke, C., Davenport, L.B., Davies, P., de Pablos, B., Delcher, A., Deng, Z., Mays, A.D., Dew, I., Dietz, S.M., Dodson, K., Doup, L.E., Downes, M., Dugan-Rocha, S., Dunkov, B.C., Dunn, P., Durbin, K.J., Evangelista, C.C., Ferraz, C., Ferriera, S., Fleischmann, W., Fosler, C., Gabrielian, A.E., Garg, N.S., Gelbart, W.M., Glasser, K., Glodek, A., Gong, F., Gorrell, J.H., Gu, Z., Guan, P., Harris, M., Harris, N.L., Harvey, D., Heiman, T.J., Hernandez, J.R., Houck, J., Hostin, D., Houston, K.A., Howland, T.J., Wei, M.H., Ibegwam, C., Jalali, M., Kalush, F., Karpen, G.H., Ke, Z., Kennison, J.A., Ketchum, K.A., Kimmel, B.E., Kodira, C.D., Kraft, C., Kravitz, S., Kulp, D., Lai, Z., Lasko, P., Lei, Y., Levitsky, A.A., Li, J., Li, Z., Liang, Y., Lin, X., Liu, X., Mattei, B., McIntosh, T.C., McLeod, M.P., McPherson, D., Merkulov, G., Milshina, N.V., Mobarry, C., Morris, J., Moshrefi, A., Mount, S.M., Moy, M., Murphy, B., Murphy, L., Muzny, D.M., Nelson, D.L., Nelson, D.R., Nelson, K.A., Nixon, K., Nusskern, D.R., Pacleb, J.M., Palazzolo, M., Pittman, G.S., Pan, S., Pollard, J., Puri, V., Reese, M.G., Reinert, K., Remington, K., Saunders, R.D., Scheeler, F., Shen, H., Shue, B.C., Siden-Kiamos, I., Simpson, M., Skupski, M.P., Smith, T., Spier, E., Spradling, A.C., Stapleton, M., Strong, R., Sun, E., Svirskas, R., Tector, C., Turner, R., Venter, E., Wang, A.H., Wang, X., Wang, Z.Y., Wassarman, D.A., Weinstock, G.M., Weissenbach, J., Williams, S.M., Woodage, T., Worley, K.C., Wu, D., Yang, S., Yao, Q.A., Ye, J., Yeh, R.F., Zaveri, J.S., Zhan, M., Zhang, G., Zhao, Q., Zheng, L., Zheng, X.H., Zhong, F.N., Zhong, W., Zhou, X., Zhu, S., Zhu, X., Smith, H.O., Gibbs, R.A., Myers, E.W., Rubin, G.M. and Venter, J.C. (2000) The genome sequence of Drosophila melanogaster. Science 287, 2185-2195.

Altschul, S.F., Boguski, M.S., Gish, W. and Wootton, J.C. (1994) Issues in searching molecular sequence databases. Nat. Genet. 6, 119-129.

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402.

Benson, D.A., Karsch-Mizrachi, I., Lipman, D.J., Ostell, J. and Wheeler, D.L. (2003) GenBank. Nucleic Acids Res. 31, 23-27.

Carpousis, A.J. (2002) The Escherichia coli RNA degradosome: structure, function and relationship in other ribonucleolytic multienzyme complexes. Biochem. Soc. Trans. 30, 150-155.

Casjens, S. (1998) The diverse and dynamic structure of bacterial genomes. Annu. Rev. Genet. 32, 339-377.

Chambaud, I., Heilig, R., Ferris, S., Barbe, V., Samson, D., Galisson, F., Moszer, I., Dybvig, K., Wroblewski, H., Viari, A., Rocha, E.P.C. and Blanchard, A. (2001) The complete genome sequence of the murine respiratory pathogen Mycoplasma pulmonis. Nucleic Acids Res. 29, 2145-2153.

Clements, M.O., Eriksson, S., Thompson, A., Lucchini, S., Hinton, J.C., Normark, S. and Rhen, M. (2002) Polynucleotide phosphorylase is a global regulator of virulence and persistency in Salmonella enterica. Proc. Natl. Acad. Sci. USA 99, 8784-8789.

Dandekar, T., Huynen, M., Regula, J.T., Ueberle, B., Zimmermann, C.U., Andrade, M.A., Doerks, T., Sanchez-Pulido, L., Snel, B., Suyama, M., Yuan, Y.P., Herrmann, R. and Bork, P. (2000) Re-annotating the Mycoplasma pneumoniae genome sequence: adding value, function and reading frames. Nucleic Acids Res. 28, 3278-3288.

Djordjevic, S.R., Forbes, W.A., Forbes-Faulkner, J., Kuhnert, P., Hum, S., Hornitzky, M.A., Vilei, E.M. and Frey, J. (2001) Genetic diversity among Mycoplasma species bovine group 7: clonal isolates from an outbreak of polyarthritis, mastitis, and abortion in dairy cattle. Electrophoresis 22, 3551-3561.

Fraser, C.M., Gocayne, J.D., White, O., Adams, M.D., Clayton, R.A., Fleischmann, R.D., Bult, C.J., Kerlavage, A.R., Sutton, G.G., Kelley, J.M., Fritchman, J.L., Weidman, J.F., Small, K.V., Sandusky, M., Fuhrmann, J.L., Nguyen, D.T., Utterback, T., Saudek, D.M., Phillips, C.A., Merrick, J.M., Tomb, J., Dougherty, B.A., Bott, K.F., Hu, P.C., Lucier, T.S., Peterson, S.N., Smith, H.O. and Venter, J.C. (1995) The minimal gene complement of Mycoplasma genitalium. Science 270, 397-403.

Glass, J.I., Lefkowitz, E.J., Glass, J.S., Heiner, C.R., Chen, E.Y. and Cassell, G.H. (2000) The complete sequence of the mucosal pathogen Ureaplasma urealyticum. Nature 407, 757-762.

Himmelreich, R., Hilbert, H., Plagens, H., Pirkl, E., Li, B.C. and Herrmann, R. (1996) Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res. 24, 4420-4449.

King, K.W. and Dybvig, K. (1994) Mycoplasmal cloning vectors derived from plasmid pKMK1. Plasmid 31, 49-59.

Kreuzer, J., Denger, S., Reifers, F., Beisel, C., Haack, K., Gebert, J. and Kubler, W. (1996) Adenovirus-assisted lipofection: efficient in vitro gene transfer of luciferase and cytosine deaminase to human smooth muscle cells. Atherosclerosis 124, 49-60.

Lartigue, C., Duret, S., Garnier, M. and Renaudin, J. (2002) New plasmid vectors for specific gene targeting in Spiroplasma citri. Plasmid 48, 149-159.

Li, Q.S., Gupta, J.D. and Hunt, A.G. (1998) Polynucleotide phosphorylase is a component of a novel plant poly(A) polymerase. J. Biol. Chem. 273, 17539-17543.

Mahillon, J. and Chandler, M. (1998) Insertion sequences. Microbiol. Mol. Biol. Rev. 62, 725-774.

Melcher, U., Sha, Y., Ye, F. and Fletcher, J. (1999) Mechanisms of spiroplasma genome variation associated with SpV1-like viral DNA inferred from sequence comparisons. Microbiol. Comp. Genomics 4, 29-46.

Nakai, K. and Horton, P. (1999) pSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem. Sci. 24, 34-36.

Oshima, K., Miyata, S., Sawayanagi, T., Kakizawa, S., Nishigawa, H., Jung, H.-Y., Furuki, K., Yanazaki, M., Suzuki, S., Wei, W., Kuboyama, T., Ugaki, M. and Namba, S. (2002) Minimal set of metabolic pathways suggested from the genome of onion yellows phytoplasma. J. Gen. Plant Pathol. 68, 225-236.

Oshima, K., Kakizawa, S., Nishigawa, H., Jung, H.-Y., Wei, W., Suzuki, S., Arashida, R., Nakata, D., Miyata, S., Ugaki, M. and Namba, S. (2004) Reductive evolution suggested from the complete genome sequence of a plant-pathogenic phytoplasma. Nat. Genet. 36, 27-29.

Osipiuk, J., Gornicki, P., Maj, L., Dementieva, I., Laskowski, R. and Joachimiak, A. (2001) Streptococcus pneumonia YlxR at 1.35 A shows a putative new fold. Acta Crystallogr. D Biol. Crystallogr. 57, 1747-1751.

Özbek, E., Miller, S.A., Meulia, T. and Hogenhout, S.A. (2003) Infection and replication sites of Spiroplasma kunkelii (Class: Mollicutes) in midgut and Malpighian tubules of the leafhopper Dalbulus maidis. J. Invertebr. Pathol. 82, 167-175.

Papazisi, L., Gorton, T.S., Kutish, G., Markham, P.F., Browning, G.F., Nguyen, D.K., Swartzell, S., Madan, A., Mahairas, G. and Geary, S.J. (2003) The complete genome sequence of the avian pathogen Mycoplasma gallisepticum strain R(low). Microbiology (Reading, Engl.) 149, 2307-2316.


Razin, S., Yogev, D. and Naot, Y. (1998) Molecular biology and pathogenicity of mycoplasmas. Microbiol. Mol. Biol. Rev. 62, 1094-1156.

Saglio, P., L’hospital, M., Lafleche, D., Dupont, G., Bove, J.M., Tully, J.G. and Freundt, E.A. (1973) Spiroplasma citri gen. and sp. n.: a mycoplasma-like organism associated with ‘stubborn’ disease of citrus. Int. J. Syst. Bacteriol. 23, 191-204.

Sasaki, Y., Ishikawa, J., Yamashita, A., Oshima, K., Kenri, T., Furuya, K., Yoshino, C., Horino, A., Shiba, T., Sasaki, T. and Hattori, M. (2002) The complete genomic sequence of Mycoplasma penetrans, an intracellular bacterial pathogen in humans. Nucleic Acids Res. 30, 5293-5300.

Swofford, D. (2001) PAUP* 4.0. Sinauer Associates.

Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673-4680.

Weisburg, W.G., Tully, J.G., Rose, D.L., Petzel, J.P., Oyaizu, H., Yang, D., Mandelco, L., Sechrest, J., Lawrence, T.G., Van Etten, J., Maniloff, J. and Woese, C.R. (1989) A phylogenetic analysis of the mycoplasmas: basis for their classification. J. Bacteriol. 171, 6455-6467.

Westberg, J., Persson, A., Holmberg, A., Goesmann, A., Lundeberg, J., Johansson, K.E., Pettersson, B. and Uhlen, M. (2004) The genome sequence of Mycoplasma mycoides subsp. mycoides SC type strain PG1T, the causative agent of contagious bovine pleuropneumonia (CBPP). Genome Res. 14, 221-227.

Wheeler, D.L., Church, D.M., Federhen, S., Lash, A.E., Madden, T.L., Pontius, J.U., Schuler, G.D., Schriml, L.M., Sequeira, E., Tatusova, T.A. and Wagner, L. (2003) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 31, 28-33.

Woese, C.R. (1987) Bacterial evolution. Microbiol. Rev. 51, 221-271.

Zhang, Q., Soares de Oliveira, S., Colangeli, R. and Gennaro, M.L. (1997) Binding of a novel host factor to the pT181 replication enhancer. J. Bacteriol. 179, 684-688.

Zuo, Y. and Deutscher, M.P. (2001) Exoribonuclease superfamilies: structural analysis and phylogenetic distribution. Nucleic Acids Res. 29, 1017-1026.


Columns 1-4: AY-WB and S. kunkelii homologues absent from mycoprotdb. Columns 5-8: best hit against NCBI nr database.

ID  Source       ORF ID(a)  Length(b)  Accession #  Homology                             Organism                          E value  Cellular location(c)
1   AY-WB        246_1F     716        29377522     PNPase                               Enterococcus faecalis             1e-180   cytoplasm
1   S. kunkelii  100_74F    719        15902560     PNPase                               Streptococcus pneumoniae          0        cytoplasm
2   AY-WB        247_200F   321        27468441     cmp-binding factor 1                 Staphylococcus aureus             1e-39    cytoplasm
2   S. kunkelii  109_633F   313        16078057     cmp-binding factor 1                 Bacillus subtilis                 3e-59    cytoplasm
3   AY-WB        247_187F   161        20806575     cytosine/adenosine deaminases        Thermoanaerobacter tengcongensis  4e-26    cytoplasm
3   S. kunkelii  98_127R    159        20806575     cytosine/adenosine deaminases        Thermoanaerobacter tengcongensis  1e-16    cytoplasm
4   AY-WB        247_205R   85         541414       conserved hypothetical protein YlxR  Bacillus subtilis                 2e-14    cytoplasm
4   S. kunkelii  107_113R   91         541414       conserved hypothetical protein YlxR  Bacillus subtilis                 2e-06    cytoplasm

Table 5.1 Four AY-WB and S. kunkelii homologues that were absent from mycoprotdb consisting of the whole genome sequences of M. genitalium, M. pneumoniae, U. urealyticum, M. pulmonis, M. penetrans, and M. gallisepticum.

(a) ORF ID identified by ORF Extractor program.
(b) Length of deduced amino acid sequence.
(c) Cellular location was determined by pSORT (Nakai and Horton, 1999).

Columns 1-4: proteins shared between AY-WB and S. kunkelii. Columns 5-8: best hit against NCBI nr database.

ID  Organism     ORF ID(a)  Length(b)  Accession #  Homology                                      Organism (best hit)             E-value  Cellular location(c)
1   AY-WB        235_4R     414        15613820     BH1257 unknown conserved                      Bacillus halodurans             1e-94    cytoplasm
1   S. kunkelii  94_78R     414        15613820     BH1257 unknown conserved                      Bacillus halodurans             2e-87    cytoplasm
2   AY-WB        248_157F   889        15673239     cation-transporting P-ATPase (EC 3.6.3.2)     Lactococcus lactis              2e-179   membrane
2   S. kunkelii  77_20F     910        30022224     Mg2+ transport ATPase, P type (EC 3.6.3.2)    Bacillus cereus                 0        membrane
3   AY-WB        247_48R    745        10443847     ppGpp synthetase                              Geobacillus stearothermophilus  e-154    cytoplasm
3   S. kunkelii  106_196R   749        6647842      ppGpp synthetase                              Spiroplasma citri               0        cytoplasm
4   AY-WB        246_186F   528        28378886     Hypothetical exported protein/HAD hydrolase   Lactobacillus plantarum         1e-116   membrane or outside
4   S. kunkelii  96_41R     509        401696       Hypothetical exported protein/HAD hydrolase   Mycoplasma mycoides             1e-92    membrane or outside

Table 5.2 Identities of AY-WB and S. kunkelii proteins that are more similar to each other than to proteins in mycoprotdb.

(a) ORF ID identified by ORF Extractor program.
(b) Length of deduced amino acid sequence.
(c) Cellular location was determined by pSORT (Nakai and Horton, 1999).

Fig. 5.1 Algorithms employed to extract proteins that are common between the insect-transmitted plant pathogens Aster Yellows Witches’ Broom (AY-WB) and Spiroplasma kunkelii but are absent from five Mycoplasma spp. and Ureaplasma urealyticum. See Materials and Methods for details. Similar dataset consists of proteins that are similar in AY-WB and S. kunkelii, while unique dataset consists of proteins that are similar between AY-WB and S. kunkelii but absent from five Mycoplasma spp. and Ureaplasma urealyticum. Shaded text boxes are operations with the programs indicated in parentheses. Open text boxes are datasets either as input or output of the operations. Bacterial and mold mitochondria genetic codes are from NCBI taxonomy databases (Benson et al., 2003; Wheeler et al., 2003).


Fig. 5.2 Graphical representation of comparative analysis results. A. Negative log plots of the top E values of the BLAST search using AY_Skdb as query and Skdb (x-axis) and mycoprotdb (y-axis) as databases. B. Negative log plots of the top E values of the BLAST search using Sk_AYdb as query and AYdb (x-axis) and mycoprotdb (y-axis) as databases. Based on criteria described in Materials and Methods, data points in diamonds (◆ or ◊) are proteins shared between AY-WB and S. kunkelii but absent from five Mycoplasma spp. and Ureaplasma urealyticum, and data points in open triangles (∆) are proteins present in AY-WB, S. kunkelii, the five Mycoplasma spp., and Ureaplasma urealyticum. Data points in solid diamonds (◆) are proteins having similar lengths and annotations, which are detailed in Table 5.1. Data points in open circles (○) are AY-WB or S. kunkelii proteins that are more similar to each other than to counterparts in five Mycoplasma spp. and U. urealyticum, which are detailed in Table 5.2.


Fig. 5.3 Phylogenetic analyses of proteins that are present in insect-transmitted plant pathogenic AY-WB and S. kunkelii but absent from animal and human pathogenic mycoplasmas. Phylogenetic trees were generated following the procedure described in Materials and Methods. Bars under the trees represent evolutionary distances. A. Phylogenetic tree derived from 16S rDNA sequences. B. Phylogenetic tree derived from polynucleotide phosphorylase (PNPase). C. Phylogenetic tree derived from cmp-binding factor (CBF). Protein sequences were obtained from GenBank and aligned with ClustalW (Thompson et al., 1994). The alignments were used for parsimony analysis in PAUP version 4.0 (Swofford, 2001). Trees were bootstrapped 1,000 times and the bootstrap values above 50% are indicated as a percentage at the branches. Accession numbers for the sequences follow. (A) Acholeplasma laidlawii, M23932; Anaeroplasma abactoclasticum, M25050; Asteroleplasma anaerobium, M22351; Bacillus subtilis, AB042061; Mesoplasma entomophilum, AF305693; Mycoplasma capricolum, U26048; M. gallisepticum, M22441; M. genitalium, X77334; M. hominis, AJ002268; M. mycoides, U26050; M. pulmonis, AF125582; M. sualvi, AF412988; Streptococcus pneumoniae, AY281083; Ureaplasma urealyticum, U06098. (B) Actinobacillus pleuropneumoniae, ZP_00134571; Bacillus halodurans, NP_243273; B. subtilis, NP_389551; Buchnera aphidicola, NP_777952; Deinococcus radiodurans, NP_295786; Escherichia coli, NP_312072; Haemophilus influenzae, NP_438401; Mycobacterium bovis, CAD94991; Salmonella enterica, NP_806878; S. typhimurium, AAL22154; Shigella flexneri, NP_708965; Streptococcus agalactiae, CAD45842; Str. mutans, NP_720625; Str. pyogenes, BAC64773; Thermotoga maritima, NP_229146; Thermus thermophilus, CAB06341; Vibrio vulnificus, NP_935490; Xylella fastidiosa, NP_778440; Xanthomonas axonopodis, NP_642994; Yersinia enterocolitica, CAA71697; Y. pestis, NP_668031. (C) Bacillus subtilis, CAB12833; B. cereus, NP_830807; Clostridium perfringens, NP_560939; C. tetani, NP_783025; Lactococcus lactis, NP_268079; Methanococcus jannaschii, NP_247831; Staphylococcus aureus, NP_374949; Sta. epidermidis, NP_765078; Streptococcus mutans, NP_720807; Str. pneumoniae, NP_359386; Str. pyogenes, NP_268621;


Fig. 5.4 Phylogenetic analysis for AtA (AAA type ATPase). The phylogenetic tree was generated following the procedure described in Materials and Methods. The bar under the tree represents evolutionary distance. Protein sequences were obtained from GenBank and aligned with ClustalW (Thompson et al., 1994). The alignments were used for parsimony analysis in PAUP version 4.0 (Swofford, 2001). Trees were bootstrapped 1,000 times and the bootstrap values above 50% are indicated as a percentage at the branches. Accession numbers for protein sequences follow. Bacillus halodurans, NP_242123; B. subtilis, NP_390631; Clostridium acetobutylicum, NP_348297; Enterococcus faecalis, NP_815655; E. faecium, ZP_00036045; Listeria innocua, NP_470885; L. monocytogenes, NP_465039; Mycobacterium leprae, CAA19102; Mycoplasma gallisepticum, NP_853308; M. penetrans, NP_757529; Staphylococcus aureus, NP_646394; Streptococcus mutans, NP_722348; Str. pneumoniae, NP_346223; Ureaplasma urealyticum, NP_078028.


CHAPTER 6

Functional genomics identifies phytoplasma effector proteins

Xiaodong Bai¹, Valdir Ribeiro Correa¹, Jianhua Zhang¹, Michael M. Goodin², Sophien Kamoun³, and Saskia A. Hogenhout¹

¹ Department of Entomology, The Ohio State University - OARDC, Wooster, OH 44691
² Department of Plant Pathology, University of Kentucky, Lexington, KY 40546
³ Department of Plant Pathology, The Ohio State University - OARDC, Wooster, OH 44691


6.1 Abstract

Phytoplasmas are insect-transmitted plant pathogenic bacteria that have a broad plant host range and induce a variety of symptoms that suggest interference with plant development. The recently completed genome of the Candidatus Phytoplasma asteris strain aster yellows witches' broom (AY-WB) phytoplasma was mined for genes encoding secreted proteins that are candidate effector proteins involved in interaction with plant and insect cell components. In total, 56 genes encoded putative secreted proteins, based on the presence of an N-terminal signal peptide and the lack of transmembrane domains, and were transiently expressed in Nicotiana benthamiana plants. The functional analysis in plants resulted in the identification of 17 putative phytoplasma effector proteins that may directly or indirectly interact with plant components. Five putative effector proteins contained putative nuclear localization signals (NLSs). Yellow fluorescent protein (YFP) fusions of one protein (A11) targeted the plant cell nuclei and those of another protein (A30) targeted the nucleoli. The genes encoding these two proteins were expressed during phytoplasma infection of insects and plants. Further, nuclear transport of A11 was inhibited in N. benthamiana plants in which the expression of the gene for importin α was knocked down. Finally, transcription profiling studies indicated that A11 differentially regulated 53 tomato genes, including several transcription factors involved in plant development. These results supported the hypothesis that A11 is an effector protein that manipulates plant components. This study, for the first time, employed the combination of bioinformatics and functional genomics to study phytoplasma pathogenesis.


6.2 Introduction

The aster yellows witches' broom (AY-WB) phytoplasma (Zhang et al., 2004) is a strain of Candidatus Phytoplasma asteris (previously known as the Aster Yellows 16SrI group), which is the largest group of Candidatus Phytoplasma. Phytoplasmas belong to the Class Mollicutes, whose members are characterized by the lack of a cell wall and by small genomes with low GC contents (Razin et al., 1998), and likely evolved from a Gram-positive bacterial ancestor by reductive evolution (Weisburg et al., 1989; Woese, 1987). Phytoplasmas are insect-transmitted plant pathogens and mainly reside in plant phloem tissues. They are intracellular pathogens that invade and replicate in the cells of insect vectors and plant hosts, and cause severe losses of crops, such as lettuce and carrot, and ornamental plants, such as China aster. Phytoplasma-infected plants show symptoms including phyllody (development of floral parts into leafy structures), virescence (greening of normally white tissue), and shoot proliferation. These symptoms may be due to the interference of phytoplasma proteins with plant hormone synthesis and utilization (Chang, 1998).

Indeed, proteins secreted by bacteria are able to interfere with plant metabolic or signaling pathways. For instance, characterized type III effector proteins from Gram-negative plant pathogenic bacteria, such as HopPtoF from Pseudomonas syringae pv. phaseolicola (Jackson et al., 1999), suppress the plant defense-associated hypersensitive response (HR). Some effectors were able to modify plant signal transduction pathways (Collmer et al., 2002). Unlike many Gram-negative bacteria, which are extracellular and need type III secretion systems to cross the host cell membrane for delivery of effector proteins, phytoplasmas are intracellular pathogens, and hence their secreted proteins can directly interact with cellular components of plant cells.

Phytoplasmas cannot be cultured in cell-free media, and there are no available genetic tools, making it difficult to study the biology, physiology, and pathogenicity mechanisms of phytoplasmas. However, the recent accumulation of phytoplasma genome sequence data provided an excellent basis for functional genomic screens to identify phytoplasma proteins involved in pathogenesis. The genome-sequencing project of AY-WB phytoplasma, initiated by the Department of Entomology and the Department of Plant Pathology at The Ohio State University in collaboration with Integrated Genomics, Inc., has been completed (Bai et al., in preparation; refer to Chapter 3 of this dissertation for details), and the genome sequence was used for the study described herein. Further, the complete genome sequence of Onion Yellows (OY) phytoplasma, another strain of Candidatus Phytoplasma asteris, was published (Oshima et al., 2004).

Here we report, for the first time, the successful identification of phytoplasma candidate effector proteins by a combination of bioinformatics and high-throughput functional analysis. Candidate effector proteins were predicted using computer-assisted algorithms and thereafter analyzed in plants using the well-established Potato virus X (PVX)-based transient expression system (Qutob et al., 2002) and a plant localization system using fluorescent protein fusions (Goodin et al., 2002). Seventeen proteins were found to induce necrosis on Nicotiana benthamiana leaves. Two proteins targeted plant cell nuclei, and the corresponding transcripts of these two proteins were detected in phytoplasma-infected insects and plants. One protein was dependent on plant importin α for transport into plant nuclei. It was also able to change transcription profiles of several tomato genes, including transcription factors.

6.3 Materials and Methods

6.3.1 Bacteria and plants

Agrobacterium tumefaciens GV3101 (Holsters et al., 1980) and Escherichia coli XL1-Blue were routinely grown at 28 °C and 37 °C, respectively, in Luria-Bertani (LB) media supplemented with appropriate antibiotics (Sambrook et al., 1989). Nicotiana benthamiana plants were used for the in planta functional assays. Lycopersicon esculentum OH7814 was used for the microarray study. Inoculated or agroinfiltrated N. benthamiana and L. esculentum plants were maintained in Biosafety Level 2 greenhouses. AY-WB phytoplasma was maintained by serial transmission to China aster (Callistephus chinensis) plants by the aster leafhopper (Macrosteles quadrilineatus L.) in greenhouses and growth chambers.

6.3.2 Data mining

The AY-WB phytoplasma gapped genome sequence was obtained from the AY-WB phytoplasma genome sequencing project website (http://www.oardc.ohiostate.edu/phytoplasma). Open reading frames (ORFs) were predicted using the ORF Extractor program (Bai et al., 2004a). The longest ORF among each set of ORFs sharing the same stop codon was extracted using a Perl script. Nucleotide sequences were translated into amino acid sequences, and the presence of signal peptides in the AY-WB phytoplasma amino acid sequences was examined with the SignalP 2.0 program (Nielsen et al., 1997). The proteins containing signal peptides were then selected based on ORF length and the absence of transmembrane domains as predicted by the TMHMM 2.0 program (Krogh et al., 2001), resulting in the phytoplasma candidate effector proteins. The candidate effector proteins were analyzed by web-based BLAST (Basic Local Alignment Search Tool) (Altschul et al., 1997) against the NCBI (National Center for Biotechnology Information) nr (non-redundant) database. The presence of nuclear localization signals (NLSs) in the AY-WB candidate effector proteins was predicted using the ScanProsite (Gattiker et al., 2002) and PredictNLS (Cokol et al., 2000) programs.
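The ORF selection step described in this section can be illustrated with a minimal Python sketch. It is not the ORF Extractor program or the Perl script used in this study; the function names, the example sequence, and the choice of TAA, TAG and TGA as stop codons are assumptions made for illustration. The sketch scans one strand for ATG-initiated ORFs and keeps, for each stop codon position, only the ORF with the most upstream start codon.

```python
# Hypothetical helper names; assumes the standard stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq):
    """Return (start, end) for every ATG...stop ORF on the given strand.

    Coordinates are 0-based, end-exclusive, and the stop codon is included.
    """
    seq = seq.upper()
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                orfs.append((start, pos + 3))
                break
    return orfs


def longest_orf_per_stop(orfs):
    """Among ORFs ending at the same stop codon, keep the longest one."""
    best = {}
    for start, end in orfs:
        if end not in best or start < best[end][0]:
            best[end] = (start, end)
    return sorted(best.values())


if __name__ == "__main__":
    example = "ATGAAAATGCCCTAA"  # two in-frame ATGs sharing one stop codon
    print(longest_orf_per_stop(find_orfs(example)))
```

The reverse strand can be covered by applying the same two functions to reverse_complement(seq).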

6.3.3 Construction of recombinant A. tumefaciens binary PVX vectors

ORFs encoding the candidate effector proteins without their signal peptides were PCR-amplified and cloned into the PVX vector pGR106 (Jones et al., 1999). Gene-specific primers complementary to the 5' and 3' ends of each respective ORF were designed to include restriction site overhangs for cloning into the pGR106 vector. Amplification products were digested with appropriate restriction enzymes, size-fractionated and purified from 1% agarose gels using the QIAprep gel extraction kit (Qiagen, Valencia, CA). Purified products were ligated into pGR106. The resulting binary expression constructs were electro-transformed into A. tumefaciens GV3101. The cells were allowed to grow for 2 days at 28 °C on LB agar plates supplemented with 50 µg ml-1 kanamycin. The sequences of the cloned inserts were verified by DNA sequence analysis. Individual colonies were toothpick-inoculated onto the lower leaves of N. benthamiana plants (Takken et al., 2000). The development of disease symptoms was recorded from 1 dpi (days post-inoculation) up to 21 dpi.

6.3.4 Construction of recombinant A. tumefaciens binary pGDY vectors

NLS-containing candidate effector proteins of AY-WB were fused at the N-terminus to yellow fluorescent protein (YFP) to determine the subcellular localization of the phytoplasma proteins in plant cells. To this end, ORFs encoding the NLS-containing candidate effector proteins without their signal peptides were amplified using gene-specific primers complementary to the 5' and 3' ends of each respective ORF and including restriction site overhangs for cloning into the pGDY vector (Goodin et al., 2002). Ligations were directly electro-transformed into A. tumefaciens GV3101 as described above. Agroinfiltration of individual GV3101 colonies was conducted as described (Goodin et al., 2002). Leaves were harvested 48-72 h after infiltration and examined by laser-scanning confocal microscopy using a Leica TCS SP2 filter-free spectral confocal and multiphoton microscope.

6.3.5 Tobacco rattle virus (TRV)-mediated virus-induced gene silencing (VIGS)

To determine the dependence of YFP:A11 on N. benthamiana importin α genes for transport into plant nuclei, the importin α genes were silenced using TRV-mediated VIGS (Ratcliff et al., 2001; Liu et al., 2002). The importin α genes were amplified from the N. benthamiana cDNA library (Kanneganti et al., in preparation) using gene-specific primers designed to contain overhangs with appropriate restriction recognition sites for cloning into the pTV00 vector, which allows production of the RNA2 portion of the TRV genome. The pTV00 constructs were introduced into A. tumefaciens GV3101 by electro-transformation, and the transformed cells were propagated at 28 °C. The A. tumefaciens GV3101 transformants were infiltrated into N. benthamiana leaves simultaneously with A. tumefaciens GV3101 containing the pBINTRA6 vector, which allows production of the RNA1 portion of the TRV genome (Ratcliff et al., 2001). To confirm the silencing of the N. benthamiana importin α genes, reverse transcriptase-polymerase chain reaction (RT-PCR) was performed on total RNA samples collected from the upper leaves of N. benthamiana plants three weeks after agroinfiltration. At the same time, the matching upper leaves of N. benthamiana plants were infiltrated with pGD constructs and, two days later, examined with a laser-scanning confocal microscope as described above. The green fluorescent protein (GFP)-tagged AtFib1 protein, which localizes to the nucleolus (Barneche et al., 2000), was used as a negative control. The GFP:AtFib1 construct was kindly provided by Dr. Michael Goodin at the University of Kentucky.

6.3.6 RNA isolation and RT-PCR

Total RNA was isolated from leaves of healthy and AY-WB phytoplasma-infected China aster plants, and from healthy and AY-WB phytoplasma-infected aster leafhoppers, following the instructions of the ToTALLY RNA kit (Ambion, Austin, TX). Total RNA samples were treated with the DNA-free kit (Ambion) to degrade residual genomic DNA contamination.

RT-PCR was performed to investigate whether the genes encoding the candidate effector proteins were expressed in AY-WB phytoplasma during the infection of insects and plants. RT-PCR was conducted with the OneStep RT-PCR kit (Qiagen) following the manufacturer's protocol using the gene-specific primers designed for pGDY cloning. The thermal cycler conditions were: 1 cycle of 50 °C for 30 min, 1 cycle of 95 °C for 15 min, 30 cycles of (94 °C for 1 min, 50 °C for 1 min, and 72 °C for 1 min), and 1 cycle of 72 °C for 10 min. RT-PCR products were examined in 1% agarose gels following standard electrophoresis procedures (Sambrook et al., 1989).
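As a small worked example, the thermal cycler program above can be written down as data and its total programmed hold time summed. The variable and function names are hypothetical, and the calculation ignores temperature ramping between steps, so the real run time would be somewhat longer.

```python
# Each stage is (number_of_cycles, [(temperature_in_C, minutes), ...]),
# transcribed from the thermal cycler conditions stated in the text.
RT_PCR_PROGRAM = [
    (1, [(50, 30)]),
    (1, [(95, 15)]),
    (30, [(94, 1), (50, 1), (72, 1)]),
    (1, [(72, 10)]),
]


def total_hold_minutes(program):
    """Sum the programmed hold times, ignoring ramp times between steps."""
    return sum(
        cycles * sum(minutes for _temp, minutes in steps)
        for cycles, steps in program
    )


if __name__ == "__main__":
    print(total_hold_minutes(RT_PCR_PROGRAM))  # 30 + 15 + 30*3 + 10 = 145
```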

6.3.9 Microarray study and data analysis

Transcription profiling of tomato genes was conducted using an oligo-based microarray representing 15,925 tomato unigenes (http://www.tigr.org) with 12 matched and 12 mismatched 24-mers per gene (NimbleGen Systems Inc., Madison, WI). Cotyledons of 9-day-old tomato OH7814 plants were toothpick-inoculated with the PVX:A11 construct (A) or the PVX vector construct (P). Mock-inoculated (M) plants served as controls. Plants were organized randomly in two different trays. Leaves from one plant in each tray were collected for total RNA isolation using Trizol reagent (Invitrogen, Carlsbad, CA) following the manufacturer's instructions. For each treatment, two replicates were prepared. The quality of the total RNA was assessed by running RNA gels and measuring the ratio of absorbance at 260 nm and 280 nm. The concentrations of the total RNA were measured with the RiboGreen kit (Invitrogen). About 20 µg of good-quality total RNA for each replicate of each treatment was sent to NimbleGen Systems Inc. for labeling, hybridization and data retrieval.

The hybridization data for each chip were collected from each hybridization file into a spreadsheet file. Genes whose hybridization values were negative or zero in any treatment were excluded from the analysis. In order to apply a linear regression algorithm to the analysis, a base 2 log transformation was conducted to bring the hybridization data closer to a normal distribution. The variation between the two replicates of each treatment (M, P, and A) was assessed in a quality control (QC) step by employing the univariate linear regression algorithm with a confidence interval of 95% using SAS. Genes with expression values falling outside the 95% confidence interval were considered too variable in hybridization intensity and were collected for validation of the results of the pairwise comparison. A loop design was applied for the pairwise comparison of the means of the two replicates of each treatment (M, P, and A) using the univariate linear regression algorithm with a confidence interval of 99%. The genes falling outside the 99% confidence interval were considered differentially expressed, and this set was validated by removing the genes that had been collected in the QC step.
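The QC step can be sketched in Python as follows. This is an illustration only and not the SAS procedure used in this study: the function names are hypothetical, the interval is computed as a prediction interval around an ordinary least-squares line, and a normal quantile is used in place of a t quantile, which is an assumption that is reasonable only when thousands of genes are fitted. The same function with level=0.99 applied to treatment means would mimic the pairwise comparison.

```python
import math
from statistics import NormalDist, mean


def log2_pairs(rep1, rep2):
    """Keep genes with positive values in both replicates; return log2 pairs."""
    return {
        gene: (math.log2(rep1[gene]), math.log2(rep2[gene]))
        for gene in rep1
        if gene in rep2 and rep1[gene] > 0 and rep2[gene] > 0
    }


def flag_variable_genes(pairs, level=0.95):
    """Return genes whose second value falls outside the regression interval.

    Fits y = intercept + slope * x by least squares, then flags genes whose
    residual exceeds the half-width of an approximate prediction interval.
    Requires more than two genes with non-identical x values.
    """
    genes = list(pairs)
    xs = [pairs[g][0] for g in genes]
    ys = [pairs[g][1] for g in genes]
    n = len(genes)
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal approximation
    flagged = []
    for gene, x, r in zip(genes, xs, residuals):
        half_width = z * s * math.sqrt(1 + 1 / n + (x - mx) ** 2 / sxx)
        if abs(r) > half_width:
            flagged.append(gene)
    return flagged
```

For example, flag_variable_genes(log2_pairs(replicate_1, replicate_2)) would return the gene identifiers to set aside in the QC step, where replicate_1 and replicate_2 are dictionaries mapping gene identifiers to raw hybridization values.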

6.4 Results

6.4.1 The AY-WB genome contained 56 candidate effector proteins.

ORFs starting with the standard start codon ATG were predicted from the gapped genome sequence of AY-WB using the ORF Extractor program, and subsequently the longest ORF among each set of ORFs with the same stop codon was extracted using a Perl script, resulting in 1986 ORFs. The SignalP 2.0 software (Nielsen et al., 1997) predicted the presence of N-terminal signal peptide sequences for 144 deduced proteins (Fig. 6.1). These proteins were considered candidate effector proteins because they may be secreted by the phytoplasma and interact with host factors. In total, 56 candidate effector proteins were selected for PVX-based in planta functional analysis based on the criteria of (i) absence of transmembrane domains, to exclude membrane-bound proteins from this study, and (ii) ORF lengths between 70 bp and 2,000 bp (i.e. the minimum and maximum size limitations of PVX). The 56 proteins (Fig. 6.1) were named A1 to A56. The 56 candidate effector proteins were annotated by sequence similarity searches against the NCBI nr database. Fifteen proteins had no significant hits (E > 10^-4) to any proteins in the database. Most of the others have homologs in the closely related OY phytoplasma genome. However, 32 of these proteins were annotated as “hypothetical protein” or “conserved hypothetical protein” in the OY phytoplasma genome.
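The selection criteria above amount to a simple filter over per-ORF predictions, sketched below in Python. The record type, field names and example values are hypothetical; in practice the fields would be filled from SignalP and TMHMM output. How predicted helices that overlap the signal peptide were treated is not stated in the text, so the sketch simply requires zero predicted transmembrane helices.

```python
from dataclasses import dataclass


@dataclass
class OrfPrediction:
    """Per-ORF summary of the predictions used for selection (hypothetical)."""
    name: str
    length_bp: int
    has_signal_peptide: bool
    tm_helices: int


def select_candidates(records, min_bp=70, max_bp=2000):
    """Keep ORFs with a signal peptide, no TM helices, and an allowed length."""
    return [
        r for r in records
        if r.has_signal_peptide
        and r.tm_helices == 0
        and min_bp <= r.length_bp <= max_bp
    ]


if __name__ == "__main__":
    demo = [
        OrfPrediction("orf_a", 369, True, 0),    # kept
        OrfPrediction("orf_b", 900, True, 2),    # membrane-bound, dropped
        OrfPrediction("orf_c", 2400, True, 0),   # too long for PVX, dropped
        OrfPrediction("orf_d", 450, False, 0),   # no signal peptide, dropped
    ]
    print([r.name for r in select_candidates(demo)])
```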

6.4.2 Transiently expressed AY-WB candidate effector proteins induced necrosis in N. benthamiana leaves.

To test the effect of AY-WB candidate effector proteins, gene fragments corresponding to the mature proteins were cloned into binary PVX vectors (Qutob et al., 2002) and transiently expressed in N. benthamiana leaves via toothpick inoculation of Agrobacterium carrying the binary PVX constructs (Takken et al., 2000). Agrobacterium introduces the recombinant PVX vectors into plant cells, and subsequent replication of PVX allows intracellular production and systemic spread of virions and phytoplasma proteins in the plant. At 21 dpi, the PVX:A11 construct induced necrosis in addition to the mosaic symptom, while the PVX:A42 construct induced the accumulation of as yet unknown substances observable under a fluorescence microscope (Fig. 6.2). As expected, N. benthamiana leaves inoculated with the PVX vector only, the negative control, showed typical mosaic symptoms of PVX, whereas the positive control, the inf1 gene from Phytophthora infestans, a eukaryotic pathogen causing potato and tomato late blight, induced localized HR (Fig. 6.2) (Kamoun et al., 1997). Overall, among the 37 phytoplasma genes successfully cloned and tested, 3 induced necrosis together with the same mosaic symptoms as PVX alone and 14 induced necrosis and delayed mosaic symptoms (Table 6.1). Thus, 17 AY-WB phytoplasma proteins altered PVX symptoms and therefore may directly or indirectly interact with plant components.

6.4.3 AY-WB proteins A11 and A30 targeted plant cell nuclei.

Of the 56 candidate effector proteins selected for the PVX-based in planta expression studies, 5 proteins were predicted to contain plant nuclear localization signals (NLSs) by the PredictNLS (Cokol et al., 2000) and ScanProsite (Gattiker et al., 2002) programs. Given that prokaryotic phytoplasmas do not have nuclei, these proteins (A03, A11, A22, A30, and A42) might target plant cell nuclei and affect the transcription of certain plant genes. To determine the subcellular localization of the five phytoplasma proteins in plants, gene fragments corresponding to the mature portions of the ORFs were cloned into the pGDY vectors and transiently expressed by agroinfiltration of N. benthamiana leaves (Goodin et al., 2002). Cloning of the A22 gene was not successful, probably due to toxicity of A22 to E. coli. Two days after agroinfiltration, plant leaves were detached and examined under a confocal microscope. Similar to the negative control, YFP only (Fig. 6.3A), YFP:A03 and YFP:A42 were distributed equally between the cytoplasm and the nucleus (data not shown). In contrast, YFP:A11 localized to the nuclei and YFP:A30 localized to the nucleoli of plant cells (Fig. 6.3A). To confirm the localization of YFP:A30 in nucleoli, YFP:A30 was co-infiltrated with GFP-tagged AtFib1, a protein from Arabidopsis thaliana whose localization in nucleoli was experimentally verified (Barneche et al., 2000). Fluorescence intensity scans (Fig. 6.3B) showed identical intensity patterns of GFP:AtFib1 and YFP:A30 across the nucleus and nucleolus, thus confirming the localization of the YFP:A30 protein in plant cell nucleoli.

6.4.4 Nuclear import of YFP:A11 in N. benthamiana was importin α dependent.

One of the pathways for import of proteins into plant nuclei depends on importin α and importin β, in which importin α is responsible for recognition of the NLS and binding of the protein (Macara, 2001). Two N. benthamiana importin α gene homologs (importin α1 and importin α2) were identified from N. benthamiana EST (expressed sequence tag) sequences (Kanneganti et al., in preparation). To investigate whether the phytoplasma effector proteins A11 and A30 depend on the importin system for transport into plant nuclei, the N. benthamiana importin α genes were silenced by TRV-mediated VIGS (Ratcliff et al., 2001; Liu et al., 2002). Importin α-silenced plants grew normally or had minor symptoms (data not shown). At two weeks after inoculation, RT-PCR results demonstrated the complete silencing of importin α1 (NbImp1, Fig. 6.4A) and partial silencing of importin α2 (NbImp2, Fig. 6.4A) relative to the constitutively expressed plant tubulin genes. Subsequent infiltration of the N. benthamiana plants with the pGD constructs showed that the YFP:A11 protein was distributed equally between the cytoplasm and the nucleus in importin α-silenced plants, which differed from the nuclear localization of the YFP:A11 protein in the non-silenced control plants (Fig. 6.4B). In contrast, the GFP:AtFib1 protein was localized in plant nucleoli (Barneche et al., 2000) in both healthy and importin α-silenced plants (Fig. 6.4B). These results suggested that YFP:A11 was dependent on N. benthamiana importin α gene products for transport into plant nuclei.

6.4.5 A11 and A30 were expressed during AY-WB phytoplasma infection of plants and insects.

Expression of the genes encoding A11 and A30 during AY-WB phytoplasma infection of plants and insects was tested by RT-PCR. Transcripts of A11 and A30 of the expected sizes were detected in total RNA isolated from AY-WB phytoplasma-infected plants using gene-specific primers (Fig. 6.5). The transcripts of two other NLS-containing genes, A03 and A22, were also detected, whereas a transcript of A42 was not detectable in total RNA from AY-WB phytoplasma-infected plants (data not shown). RT-PCR of AY-WB-infected leafhoppers (M. quadrilineatus) showed that A11 and A30 gene transcripts of the expected sizes were present in total RNA samples of insects that had acquired AY-WB from plants 1, 2, and 3 weeks earlier (Fig. 6.6). Thus, the A11 and A30 genes were expressed during AY-WB phytoplasma infection of plants and insects.

6.4.7 A11 affected the expression of tomato plant genes.

Microarray experiments were conducted to evaluate the effect of the A11 protein on plant gene expression profiles. NimbleGen chips representing 15,925 unigenes from L. esculentum were hybridized with total RNA isolated from mock-inoculated plants and from plants inoculated with PVX only or with the PVX:A11 construct. Statistical analysis of the results of 6 hybridizations, i.e. two replicates of three treatments, revealed that 26 tomato unigenes were up-regulated by A11 compared to the PVX-only control (Table 6.2) and 27 tomato unigenes were down-regulated by A11 compared to the PVX-only control (Table 6.3). Up-regulated genes included those of protein kinases involved in signal transduction and plant response to pathogen infection, such as receptor-like protein kinase (LE04258), putative receptor-like serine-threonine protein kinase (LE08855), leucine-rich repeat transmembrane protein kinase (LE10338), putative protein kinase (LE11683), and putative S-receptor kinase homolog 2 precursor (LE12428). These genes were up-regulated 3- to 7-fold in PVX:A11-inoculated plants compared to plants inoculated with the empty PVX vector. The tomato gene that was up-regulated 15.8-fold had weak similarity to an mraW methylase family protein (LE06170). Among the down-regulated genes were those for a CONSTANS-like protein (LE05848) and a putative MADS-box protein (LE11148), which are both transcription factors (An et al., 2004; Becker and Theissen, 2003), and for the blue copper-binding protein (LE10147), which has various functions in the nucleus (Gruenbaum et al., 2003). Thus, these results showed that A11 can affect the expression levels of several tomato genes.
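The text does not state how the fold values were calculated. One common convention, shown here purely as an assumption, is to take the difference between the mean log2 intensities of the two treatments and convert it back to a ratio. The function name and the example intensities are hypothetical and are not data from this study.

```python
from statistics import mean


def fold_change(treated_log2, control_log2):
    """Ratio of treated to control expression from log2 replicate values.

    Values above 1 indicate up-regulation; values below 1 indicate
    down-regulation (for example, 0.5 is a 2-fold decrease).
    """
    return 2 ** (mean(treated_log2) - mean(control_log2))


if __name__ == "__main__":
    # Two replicates per treatment, already log2-transformed.
    print(fold_change([10.0, 12.0], [8.0, 10.0]))  # up-regulated
    print(fold_change([8.0, 8.0], [9.0, 9.0]))     # down-regulated
```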

6.5 Discussion

In this research, we identified 17 AY-WB candidate effector proteins that induced necrosis when expressed in plant cells. Unlike many Gram-negative bacterial pathogens, which are typically located extracellularly and have type III secretion systems (TTSS) to deliver effector proteins into plant cells, Gram-positive bacteria apparently use the Sec-dependent pathway for delivery of virulence proteins, as has been shown for Streptococcus pyogenes (Rosch and Caparon, 2004). Hence, the phytoplasmas, which are related to Gram-positive bacteria and are predominantly located intracellularly in plant cells, probably use the Sec-dependent pathway to deliver their effector proteins. Secreted phytoplasma proteins can then immediately interact with cell components or be transported to cell nuclei or other cell organelles.

Several candidate effector proteins of phytoplasmas, including A11 and A30, induce cell death, as evidenced by local necrotic spots when transiently expressed in plants using the PVX-based expression system. Whereas phytoplasmas induce PR (pathogenesis-related) proteins (Zhong and Shen, 2004), it remains to be investigated whether phytoplasmas, and A11 and A30 in particular, induce HR. Effectors of Gram-negative bacteria are involved in the HR. For instance, AvrPto and AvrRpt2 of P. syringae pv. tomato (Tang et al., 1996; Leister et al., 1996) and AvrBs3 of Xanthomonas campestris pv. vesicatoria (van den Ackerveken et al., 1996) can induce HR in resistant plants. Alternative explanations are that phytoplasma proteins are toxic to plant cells and induce cell death through direct interactions with plant cell components, or that they induce cell death indirectly by, for example, suppressing the plant defense system and thereby increasing the virulence of PVX. The latter was found for the Pseudomonas effector proteins HopPtoF and AvrPphF, which were shown to suppress the defense-associated HR elicited by another bacterial effector protein (Jackson et al., 1999; Tsiamis et al., 2000).

A11 and A30 contained plant NLSs and targeted plant cell nuclei when transiently expressed in plant cells. Because phytoplasmas are bacteria and do not have nuclei, it seems likely that A11 and A30 have adapted to functioning in eukaryotic cells. Indeed, A11 and A30 transcripts were detected in AY-WB-infected plants, and A11 is dependent on importin α for nuclear import in N. benthamiana cells. Further, A11 differentially regulated 53 tomato genes. Several of the down-regulated genes encode transcription factors, including the nuclear zinc finger protein CONSTANS, which acts in the phloem tissue of Arabidopsis thaliana and is involved in the regulation of Arabidopsis flowering (An et al., 2004), and the MADS-box proteins, which belong to a family of transcription factors involved in multiple plant development processes (Becker and Theissen, 2003). A11 also down-regulated by more than 6-fold the gene of the blue copper-binding protein, a protein similar to components of the nuclear lamina that is directly or indirectly involved in various nuclear activities, including DNA replication and transcription, cell cycle regulation, and cell development and differentiation (Gruenbaum et al., 2003). Thus, the nuclear localization of A11 and the differential regulation of nuclear plant genes by A11 support the hypothesis that A11 is an effector protein that manipulates plant components for efficient infection by AY-WB phytoplasma. Effector proteins of other bacteria also target plant cell nuclei. For instance, the AvrBs3 protein of Xanthomonas contains functional plant NLSs and targets plant cell nuclei (Yang and Gabriel, 1995).

Further, phytoplasmas induce various interesting symptoms, including phyllody, virescence and shoot proliferation (Zhang et al., 2004), which are indicative of phytoplasma interference with plant development and may explain why A11 interacts with plant developmental proteins. It is not completely understood why phytoplasmas induce radical developmental symptoms in plants. However, it has been shown that phytoplasma-infected plants enhance the fitness of the leafhopper vectors of AY-WB by increasing the aster leafhopper (M. quadrilineatus) lifespan and number of offspring (Beanland et al., 2000). Since leafhoppers lay eggs on plant leaves, one expects that an increase in the number of leaves per plant, as occurs in phyllody (development of floral parts into leafy structures) and shoot proliferation, would result in an increase in leafhopper offspring. High numbers of leafhoppers are also important for phytoplasma survival, as phytoplasmas are not seed-transmitted or passed on to next-generation leafhoppers and are therefore completely dependent on leafhopper transmission from plant to plant.

This study, for the first time, employed a combination of bioinformatics and functional genomics to study phytoplasma pathogenesis, and resulted in the successful identification of candidate phytoplasma effector proteins. Because phytoplasmas cannot be cultured, genome sequencing and subsequent mining of genome sequence data for high-throughput functional analysis of phytoplasma proteins are particularly powerful for understanding phytoplasma virulence.

6.6 Acknowledgments

The authors thank Diane M. Hartzler and Angela D. Strock in the Department of Entomology, and Diane M. Kinney, Miaoying Tian, Edgar Huitema and Jorunn Bos in the Department of Plant Pathology at The Ohio State University – OARDC for technical support and constructive discussion. This research was supported by the OSU-OARDC Research Enhancement Competitive Grant Program, Graduate Research Competition (2003-170) and Interdisciplinary Competition (2001-052).

6.7 References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402.
An, H., Roussot, C., Suarez-Lopez, P., Corbesier, L., Vincent, C., Pineiro, M., Hepworth, S., Mouradov, A., Justin, S., Turnbull, C. and Coupland, G. (2004) CONSTANS acts in the phloem to regulate a systemic signal that induces photoperiodic flowering of Arabidopsis. Development 131, 3615-3626.
Bai, X., Fazzolari, T. and Hogenhout, S.A. (2004a) Identification and characterization of traE genes of Spiroplasma kunkelii. Gene 336, 81-91.
Barneche, F., Steinmetz, F. and Echeverria, M. (2000) Fibrillarin genes encode both a conserved nucleolar protein and a novel small nucleolar RNA involved in ribosomal RNA methylation in Arabidopsis thaliana. J. Biol. Chem. 275, 27212-27220.
Beanland, L., Hoy, C.W., Miller, S.A. and Nault, L.R. (2000) Influence of aster yellows phytoplasma on the fitness of aster leafhopper (Homoptera: Cicadellidae). Ann. Entomol. Soc. Am. 93, 271-276.
Becker, A. and Theissen, G. (2003) The major clades of MADS-box genes and their role in the development and evolution of flowering plants. Mol. Phylogenet. Evol. 23, 464-489.
Chang, C.-J. (1998) Pathogenicity of aster yellows phytoplasma and Spiroplasma citri in periwinkle. Phytopathology 88, 1347-1350.
Cokol, M., Nair, R. and Rost, B. (2000) Finding nuclear localization signals. EMBO Rep. 1, 411-415.
Collmer, A., Lindeberg, M., Petnicki-Ocwieja, T., Schneider, D.J. and Alfano, J.R. (2002) Genomic mining type III secretion system effectors in Pseudomonas syringae yields new picks for all TTSS prospectors. Trends Microbiol. 10, 462-469.
Edman, M., Jarhede, T., Sjostrom, M. and Wieslander, A. (1999) Different sequence patterns in signal peptides from mycoplasmas, other gram-positive bacteria, and Escherichia coli: a multivariate data analysis. Proteins 35, 195-205.
Gattiker, A., Gasteiger, E. and Bairoch, A. (2002) ScanProsite: a reference implementation of a PROSITE scanning tool. Appl. Bioinformatics 1, 107-108.
Goodin, M.M., Dietzgen, R.G., Schichnes, D., Ruzin, S. and Jackson, A.O. (2002) pGD vectors: versatile tools for the expression of green and red fluorescent protein fusions in agroinfiltrated plant leaves. Plant J. 31, 375-383.
Gruenbaum, Y., Goldman, R.D., Meyuhas, R., Mills, E., Margalit, A., Fridkin, A., Dayani, Y., Prokocimer, M. and Enosh, A. (2003) The nuclear lamina and its functions in the nucleus. Int. Rev. Cytol. 226, 1-62.
Holsters, M., Silva, B., van Vliet, F., Genetello, C., De Block, M., Dhaese, P., Depicker, A., Inze, D., Engler, G., Villarroel, R., Van Montagu, M. and Schell, J. (1980) The functional organization of the nopaline A. tumefaciens plasmid pTiC58. Plasmid 3, 212-230.
Jackson, R.W., Athanassopoulos, E., Tsiamis, G., Mansfield, J.W., Sesma, A., Arnold, D.L., Gibbon, M.J., Murillo, J., Taylor, J.D. and Vivian, A. (1999) Identification of a pathogenicity island, which contains genes for virulence and avirulence, on a large native plasmid in the bean pathogen Pseudomonas syringae pathovar phaseolicola. Proc. Natl. Acad. Sci. USA 96, 10875-10880.
Jones, L., Hamilton, A.J., Voinnet, O., Thomas, C.L., Maule, A.J. and Baulcombe, D.C. (1999) RNA-DNA interactions and DNA methylation in post-transcriptional gene silencing. Plant Cell 11, 2291-2301.
Kamoun, S., van West, P., de Jong, A.J., de Groot, K.E., Vleeshouwers, V.G. and Govers, F. (1997) A gene encoding a protein elicitor of Phytophthora infestans is down-regulated during infection of potato. Mol. Plant Microbe Interact. 10, 13-20.
Krogh, A., Larsson, B., von Heijne, G. and Sonnhammer, E.L. (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567-580.
Leister, R.T., Ausubel, F.M. and Katagiri, F. (1996) Molecular recognition of pathogen attack occurs inside of plant cells in plant disease resistance specified by the Arabidopsis genes RPS2 and RPM1. Proc. Natl. Acad. Sci. USA 93, 15497-15502.
Liu, Y., Schiff, M. and Dinesh-Kumar, S.P. (2002) Virus-induced gene silencing in tomato. Plant J. 31, 777-786.
Macara, I.G. (2001) Transport into and out of the nucleus. Microbiol. Mol. Biol. Rev. 65, 570-594.
Nielsen, H., Engelbrecht, J., Brunak, S. and von Heijne, G. (1997) Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage site. Protein Eng. 10, 1-6.
Oshima, K., Kakizawa, S., Nishigawa, H., Jung, H., Wei, W., Suzuki, S., Arashida, R., Nakata, D., Miyata, S., Ugaki, M. and Namba, S. (2004) Reductive evolution suggested from the complete genome sequence of a plant-pathogenic phytoplasma. Nature Genet. 36, 27-29.
Qutob, D., Kamoun, S. and Gijzen, M. (2002) Expression of a Phytophthora sojae necrosis-inducing protein occurs during transition from biotrophy to necrotrophy. Plant J. 32, 361-373.
Ratcliff, F., Martin-Hernandez, A.M. and Baulcombe, D.C. (2001) Tobacco rattle virus as a vector for analysis of gene function by silencing. Plant J. 25, 237-245.
Razin, S., Yogev, D. and Naot, Y. (1998) Molecular biology and pathogenicity of mycoplasmas. Microbiol. Mol. Biol. Rev. 62, 1094-1156.
Rosch, J. and Caparon, M. (2004) A microdomain for protein secretion in Gram-positive bacteria. Science 304, 1513-1515.
Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) Molecular cloning: A laboratory manual, 2nd ed. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.
Takken, F.L., Luderer, R., Gabriels, S.H., Westerink, N., Lu, R., de Wit, P.J. and Joosten, M.H. (2000) A functional cloning strategy, based on a binary PVX-expression vector, to isolate HR-inducing cDNAs of plant pathogens. Plant J. 24, 275-283.
Tang, X., Frederick, R.D., Zhou, J., Halterman, D.A., Jia, Y. and Martin, G.B. (1996) Initiation of plant disease resistance by physical interaction of AvrPto and Pto kinase. Science 274, 2060-2062.
Tsiamis, G., Mansfield, J.W., Hockenhull, R., Jackson, R.W., Sesma, A., Athanassopoulos, E., Bennett, M.A., Stevens, C., Vivian, A., Taylor, J.D. and Murillo, J. (2000) Cultivar-specific avirulence and virulence functions assigned to avrPphF in Pseudomonas syringae pv. phaseolicola, the cause of bean halo-blight disease. EMBO J. 19, 3204-3214.
Van den Ackerveken, G., Marois, E. and Bonas, U. (1996) Recognition of the bacterial avirulence protein AvrBs3 occurs inside the host plant cell. Cell 87, 1307-1316.
Weisburg, W.G., Tully, J.G., Rose, D.L., Petzel, J.P., Oyaizu, H., Yang, D., Mandelco, L., Sechrest, J., Lawrence, T.G., van Etten, J., Maniloff, J. and Woese, C.R. (1989) A phylogenetic analysis of the mycoplasmas: basis for their classification. J. Bacteriol. 171, 6455-6467.
Woese, C.R. (1987) Bacterial evolution. Microbiol. Rev. 51, 221-271.
Yang, Y. and Gabriel, D.W. (1995) Xanthomonas avirulence/pathogenicity gene family encodes functional plant nuclear targeting signals. Mol. Plant Microbe Interact. 8, 627-631.
Zhang, J., Hogenhout, S.A., Nault, L.R., Hoy, C.W. and Miller, S.A. (2004) Molecular and symptom analyses of phytoplasma strains from lettuce reveal a diverse population. Phytopathology 94, 842-849.
Zhong, B.X. and Shen, Y.W. (2004) Accumulation of pathogenesis-related type-5 like proteins in phytoplasma-infected garland chrysanthemum Chrysanthemum coronarium. Acta. Biochim. Biophys. Sin. (Shanghai) 36, 773-779.

Table 6.1 Summary of PVX assays of AY-WB phytoplasma candidate effector proteins

Gene | Length (aa) | GC content | SP score a | Cleavage site b | NLS c | Blast accession # | Blast organism | Blast E-value | Blast description | PVX assay results
A01 | 117 | 0.22 | 1 | 37 | N | 39938944 | Onion yellows phytoplasma | 7e-19 | hypothetical protein | Delayed mosaic with necrosis
A02 | 161 | 0.26 | 1 | 41 | N | 39938945 | Onion yellows phytoplasma | 2e-12 | hypothetical protein | Delayed mosaic with necrosis
A03 | 95 | 0.21 | 0.985 | 31 | Y | 39938944 | Onion yellows phytoplasma | 7e-08 | hypothetical protein | Not cloned
A04 | 162 | 0.27 | 1 | 41 | N | 39938945 | Onion yellows phytoplasma | 1e-07 | hypothetical protein | Delayed mosaic
A05 | 136 | 0.27 | 0.992 | 32 | N | 39939004 | Onion yellows phytoplasma | 4e-49 | hypothetical protein | Delayed mosaic with necrosis
A06 | 118 | 0.29 | 0.999 | 31 | N | 39938858 | Onion yellows phytoplasma | 5e-13 | hypothetical protein | No symptoms
A07 | 71 | 0.34 | 0.523 | 31 | N | no significant hits | - | - | - | Mosaic same as negative control
A08 | 150 | 0.20 | 0.977 | 30 | N | 39938878 | Onion yellows phytoplasma | 4e-47 | hypothetical protein | Not cloned
A09 | 203 | 0.29 | 0.999 | 30 | N | 39939048 | Onion yellows phytoplasma | 1e-69 | hypothetical protein | Delayed mosaic
A10 | 202 | 0.29 | 0.95 | 30 | N | 9621765 | Peanut witches'-broom phytoplasma | 7e-04 | RNA polymerase sigma factor | Delayed mosaic with necrosis
A11 | 122 | 0.21 | 1 | 31 | Y | 39939063 | Onion yellows phytoplasma | 6e-05 | hypothetical protein | Delayed mosaic with necrosis
A12 | 252 | 0.22 | 0.998 | 42 | N | 39938647 | Onion yellows phytoplasma | 2e-17 | hypothetical protein | Delayed mosaic
A13 | 93 | 0.17 | 0.995 | 32 | N | 39939176 | Onion yellows phytoplasma | 3e-12 | hypothetical protein | No symptoms
A14 | 143 | 0.21 | 0.775 | 32 | N | 39939013 | Onion yellows phytoplasma | 1e-48 | ATP-dependent DNA helicase (partial, 13.7%) | Mosaic same as negative control
A15 | 381 | 0.30 | 0.999 | 34 | N | 39938578 | Onion yellows phytoplasma | e-168 | ABC-type Mn/Zn transport system, periplasmic Mn/Zn-binding protein | Not cloned
A16 | 148 | 0.23 | 0.925 | 26 | N | 39939193 | Onion yellows phytoplasma | 1e-42 | hypothetical protein | Delayed mosaic with necrosis
A17 | 209 | 0.22 | 0.999 | 32 | N | 39938539 | Onion yellows phytoplasma | 1e-71 | hypothetical protein | Delayed mosaic
A18 | 335 | 0.22 | 0.518 | 32 | N | 39938800 | Onion yellows phytoplasma | e-128 | hypothetical protein | Not cloned
A19 | 187 | 0.25 | 0.979 | 32 | N | 39938857 | Onion yellows phytoplasma | 4e-27 | hypothetical protein | Not cloned
A20 | 269 | 0.22 | 0.903 | 39 | N | no significant hits | - | - | - | Mosaic same as negative control
A21 | 126 | 0.16 | 0.99 | 32 | N | no significant hits | - | - | - | Delayed mosaic with necrosis
A22 | 212 | 0.32 | 0.964 | 21 | Y | 39939223 | Onion yellows phytoplasma | 6e-98 | guanylate kinase | Mosaic same as negative control with necrosis
A23 | 58 | 0.22 | 0.848 | 30 | N | no significant hits | - | - | - | Mosaic same as negative control with necrosis
A24 | 48 | 0.36 | 0.744 | 28 | N | no significant hits | - | - | - | Delayed mosaic
A25 | 231 | 0.22 | 0.999 | 41 | N | 39938886 | Onion yellows phytoplasma | 2e-19 | hypothetical protein | Delayed mosaic
A26 | 199 | 0.24 | 0.984 | 41 | N | 39938886 | Onion yellows phytoplasma | 3e-13 | hypothetical protein | Delayed mosaic
A27 | 193 | 0.22 | 0.544 | 22 | N | no significant hits | - | - | - | Delayed mosaic
A28 | 75 | 0.18 | 0.999 | 36 | N | no significant hits | - | - | - | Not cloned
A29 | 362 | 0.28 | 0.927 | 33 | N | 39938756 | Onion yellows phytoplasma | e-172 | uncharacterized BCR, containing RmuC domain | Not cloned
A30 | 106 | 0.21 | 0.983 | 34 | Y | 39939176 | Onion yellows phytoplasma | 9e-11 | hypothetical protein | Delayed mosaic with necrosis
A31 | 115 | 0.18 | 0.862 | 30 | N | 39938783 | Onion yellows phytoplasma | 2e-45 | hypothetical protein | Delayed mosaic with necrosis

(Continued)

Table 6.1 (continued)

Gene | Length (aa) | GC content | SP score (a) | Cleavage site (b) | NLS (c) | Blast accession # | Blast organism | Blast E-value | Blast description | PVX assay results
A32 | 514 | 0.25 | 0.855 | 27 | N | 39938677 | Onion yellows phytoplasma | 6e-67 | ABC-type dipeptide/oligopeptide transport system, periplasmic component | Not cloned
A33 | 236 | 0.27 | 0.999 | 36 | N | 39938647 | Onion yellows phytoplasma | e-107 | hypothetical protein | Not cloned
A34 | 312 | 0.23 | 0.994 | 38 | N | 39938643 | Onion yellows phytoplasma | e-136 | hypothetical protein | Delayed mosaic with necrosis
A35 | 349 | 0.27 | 0.975 | 42 | N | 39938619 | Onion yellows phytoplasma | e-136 | ABC-type uncharacterized transport system, periplasmic component | No symptoms
A36 | 265 | 0.20 | 0.976 | 28 | N | 39938905 | Onion yellows phytoplasma | 5e-91 | hypothetical protein | Not cloned
A37 | 287 | 0.25 | 0.609 | 24 | N | 39938904 | Onion yellows phytoplasma | e-135 | hypothetical protein | Cloned but not tested
A38 | 91 | 0.19 | 0.599 | 31 | N | no significant hits | - | - | - | Delayed mosaic with necrosis
A39 | 203 | 0.28 | 0.996 | 30 | N | 39939048 | Onion yellows phytoplasma | 3e-73 | hypothetical protein | Not cloned
A40 | 114 | 0.23 | 0.939 | 31 | N | 39939028 | Onion yellows phytoplasma | 1e-12 | hypothetical protein | Delayed mosaic
A41 | 132 | 0.21 | 0.635 | 32 | N | no significant hits | - | - | - | Not cloned
A42 | 78 | 0.12 | 0.978 | 31 | Y | no significant hits | - | - | - | Delayed mosaic, dark substance accumulation
A43 | 260 | 0.20 | 0.895 | 38 | N | 39939027 | Onion yellows phytoplasma | 2e-22 | hypothetical protein | Not cloned
A44 | 88 | 0.19 | 0.993 | 34 | N | 39939176 | Onion yellows phytoplasma | 3e-42 | hypothetical protein | Delayed mosaic
A45 | 166 | 0.16 | 0.69 | 31 | N | no significant hits | - | - | - | Not cloned
A46 | 57 | 0.30 | 0.999 | 31 | N | no significant hits | - | - | - | Mosaic same as negative control with necrosis
A47 | 60 | 0.37 | 0.998 | 35 | N | no significant hits | - | - | - | Mosaic same as negative control
A48 | 66 | 0.23 | 0.796 | 32 | N | 39938858 | Onion yellows phytoplasma | 4e-19 | hypothetical protein | Delayed mosaic with necrosis
A49 | 282 | 0.17 | 0.979 | 31 | N | 39938832 | Onion yellows phytoplasma | e-101 | hypothetical protein | Not cloned
A50 | 229 | 0.22 | 0.505 | 49 | N | 39938970 | Onion yellows phytoplasma | 7e-90 | hypothetical protein | Not cloned
A51 | 107 | 0.17 | 0.767 | 32 | N | 39939027 | Onion yellows phytoplasma | 1e-09 | hypothetical protein | Not cloned
A52 | 292 | 0.27 | 0.982 | 28 | N | 39938975 | Onion yellows phytoplasma | 5e-60 | ABC-type amino acid transport system, periplasmic component | Not cloned
A53 | 221 | 0.32 | 0.998 | 35 | N | 39939136 | Onion yellows phytoplasma | 2e-48 | hypothetical protein | Delayed mosaic with necrosis
A54 | 125 | 0.18 | 0.99 | 31 | N | 39938535 | Onion yellows phytoplasma | 3e-29 | hypothetical protein | Delayed mosaic with necrosis
A55 | 270 | 0.24 | 0.767 | 31 | N | 39939062 | Onion yellows phytoplasma | 1e-43 | hypothetical protein | Delayed mosaic
A56 | 101 | 0.26 | 0.78 | 32 | N | no significant hits | - | - | - | Delayed mosaic

a Signal peptide (SP) scores were predicted by the hidden Markov model (HMM) in the SignalP 2.0 program (Nielsen et al., 1997).
b Signal peptide cleavage sites were predicted by the neural network (NN) in the SignalP 2.0 program (Nielsen et al., 1997).
c Nuclear localization signals (NLS) were predicted by the ScanProsite (Gattiker et al., 2002) and PredictNLS (Cokol et al., 2000) programs.
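The E-value column of Table 6.1 uses the shorthand of older BLAST reports, in which a mantissa of 1 is omitted (for example, e-168). The following is a minimal Python sketch, assuming that convention, of how such strings could be converted to numbers for sorting or thresholding. The function name parse_evalue and the 1e-10 cutoff are illustrative assumptions and are not part of the analysis reported here.

```python
def parse_evalue(text):
    """Convert a BLAST E-value string such as '7e-19', 'e-168' or '0.0' to a float.

    Assumes the legacy shorthand in which 'e-168' stands for 1e-168.
    """
    text = text.strip().lower()
    if text.startswith("e"):
        # Restore the implied mantissa of 1.
        text = "1" + text
    return float(text)


# A few (gene, E-value) pairs copied from Table 6.1.
hits = {"A01": "7e-19", "A10": "7e-04", "A15": "e-168", "A22": "6e-98"}

# Hypothetical cutoff, chosen only for illustration.
CUTOFF = 1e-10
strong = sorted(gene for gene, ev in hits.items() if parse_evalue(ev) <= CUTOFF)
print(strong)
```

With the assumed cutoff, the weak match of A10 (7e-04) is excluded while the other three genes are retained.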

# | Gene | Fold difference | Blastx acc. # | Organism | E-value | Description
1 | LE00171 | 3.09 | 5669636 | L. esculentum | 0.0 | ethylene-responsive elongation factor EF-Ts precursor
2 | LE00473 | 3.46 | 19256 | L. esculentum | 0.0 | heat shock protein cognate 70
3 | LE02916 | 5.96 | 20334373 | L. pimpinellifolium | 2e-92 | cysteine protease
4 | LE03181 | 3.80 | 4220541 | A. thaliana | 1e-78 | Rab geranylgeranyl transferase like protein
5 | LE03731 | 10.71 | 15218215 | A. thaliana | 5e-83 | coatomer protein complex, subunit beta 2 (beta prime)
6 | LE03884 | 7.41 | 12324199 | A. thaliana | 2e-60 | putative calmodulin-binding protein
7 | LE04258 | 3.07 | 8777368 | A. thaliana | e-117 | receptor-like protein kinase
8 | LE04471 | 3.20 | 2264373 | A. thaliana | 9e-83 | NAM (no apical meristem)-like protein
9 | LE04723 | 3.99 | 18402561 | A. thaliana | 4e-86 | calcium-binding EF hand family protein
10 | LE04949 | 10.32 | 34908910 | O. sativa cv. japonica | 6e-63 | P0691E06.5, unknown protein
11 | LE06170 | 15.81 | 15238896 | A. thaliana | 2e-20 | mraW methylase family protein (weak similarity)
12 | LE06677 | 3.28 | 51989474 | N. benthamiana | e-123 | putative RNA-dependent RNA polymerase RdRP2
13 | LE06739 | 4.77 | 7270437 | A. thaliana | 1e-32 | invertase-like protein (weak similarity)
14 | LE06800 | 5.45 | 31126772 | O. sativa cv. japonica | 2e-22 | unknown protein, weak similarity
15 | LE07434 | 7.23 | 1076611 | N. sylvestris | e-109 | peroxidase (EC 1.11.1.7), anionic, precursor
16 | LE08855 | 4.53 | 18076583 | S. tuberosum | e-110 | putative receptor-like serine-threonine protein kinase
17 | LE10338 | 4.75 | 18402209 | A. thaliana | 5e-66 | leucine-rich repeat transmembrane protein kinase, putative
18 | LE11457 | 7.98 | 10176726 | A. thaliana | e-107 | diacylglycerol kinase ATDGK1 homolog
19 | LE11683 | 7.41 | 6681335 | A. thaliana | 9e-16 | putative protein kinase (weak similarity)
20 | LE12193 | 5.63 | 6143896 | A. thaliana | 2e-06 | putative translation initiation factor IF-2 (very weak similarity)
21 | LE12428 | 6.39 | 50942589 | O. sativa cv. japonica | 5e-66 | putative S-receptor kinase (EC 2.7.1.-) homolog 2 precursor
22 | LE12485 | 3.80 | no significant hits | - | - | -
23 | LE13106 | 6.16 | 30913024 | A. thaliana | 2e-11 | transcriptional activator DEMETER (DNA glycosylase-related protein DME) (very weak similarity)
24 | LE13392 | 3.32 | 17065936 | S. tuberosum | 1e-23 | transmembrane protein (weak similarity)
25 | LE13792 | 5.84 | 18418072 | A. thaliana | 4e-66 | RNA recognition motif (RRM)-containing protein
26 | LE14225 | 6.55 | no significant hits | - | - | -

Table 6.2 Genes that are up-regulated in PVX:A11-treated tomato plants compared with tomato plants treated with PVX only

# | Gene | Fold difference | Blastx acc. # | Organism | E-value | Description
1 | LE00058 | -2.81 | 33413550 | L. esculentum | 5e-73 | proteinase inhibitor II
2 | LE02043 | -2.01 | 2119600 | Flaveria pringlei | 1e-72 | glycine cleavage system protein H precursor
3 | LE05848 | -3.14 | 41323976 | Populus deltoides | 4e-45 | CONSTANS-like protein
4 | LE06226 | -2.30 | 458547 | Manihot esculenta | 8e-10 | UTP-glucose glucosyltransferase (partial)
5 | LE06991 | -2.39 | 50937023 | O. sativa cv. japonica | 4e-31 | SET-domain transcriptional regulator family-like protein (partial)
6 | LE07685 | -2.12 | 1362095 | L. esculentum | 3e-91 | oxidase like protein
7 | LE08007 | -2.34 | 124119 | L. esculentum | 7e-39 | wound-induced proteinase inhibitor I precursor
8 | LE08458 | -2.58 | 50920507 | O. sativa cv. japonica | e-168 | putative squalene monooxygenase
9 | LE08676 | -2.40 | 21593012 | A. thaliana | e-114 | putative C-4 sterol methyl oxidase
10 | LE08883 | -2.99 | 42559164 | N. tabacum | e-121 | 50S ribosomal protein L3, chloroplast precursor
11 | LE09000 | -2.52 | 7486957 | A. thaliana | 4e-33 | putative protein T13J8.190
12 | LE09282 | -2.15 | 17065916 | Catharanthus roseus | 3e-85 | geraniol 10-hydroxylase
13 | LE10105 | -2.22 | 33301670 | N. tabacum | e-147 | Delta-7-sterol-C5(6)-desaturase
14 | LE10147 | -6.73 | 2129630 | A. thaliana | 7e-41 | blue copper-binding protein, 19K
15 | LE10477 | -2.52 | 40287494 | Capsicum annuum | 2e-58 | putative lesion-inducing protein
16 | LE10529 | -3.61 | 13377782 | A. thaliana | 8e-62 | fasciclin-like arabinogalactan protein 7
17 | LE10618 | -3.17 | 15010628 | A. thaliana | 3e-57 | At1g68660/F24J5_4, unknown protein
18 | LE10958 | -10.41 | 30694193 | A. thaliana | e-115 | pectate lyase family protein
19 | LE11148 | -2.21 | 50947193 | O. sativa cv. japonica | 4e-15 | putative MADS-box protein (weak similarity)
20 | LE11151 | -2.99 | 21726980 | Solanum phureja | 2e-75 | pathogenesis related protein isoform b1
21 | LE11513 | -2.49 | 15218042 | A. thaliana | 3e-31 | zinc finger (C3HC4-type RING finger) family protein
22 | LE11576 | -4.19 | 51968886 | A. thaliana | 6e-36 | unknown protein
23 | LE12876 | -2.03 | 15222119 | A. thaliana | 6e-57 | MATE efflux family protein
24 | LE13274 | -3.31 | 32815941 | A. thaliana | 4e-42 | At5g67620, unknown protein
25 | LE13812 | -3.02 | 15239451 | A. thaliana | 2e-53 | serine/threonine protein phosphatase 2A (PP2A) regulatory subunit B', putative
26 | LE14900 | -3.55 | no significant hits | - | - | -
27 | LE15146 | -3.43 | no significant hits | - | - | -

Table 6.3 Genes that are down-regulated in PVX:A11-treated tomato plants compared with tomato plants treated with PVX only
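Tables 6.2 and 6.3 report up-regulation as positive and down-regulation as negative fold differences. The calculation is not spelled out alongside the tables. One common convention, assumed in the Python sketch below, reports the ratio r = treated/control as r when r is at least 1 and as -1/r otherwise. The function name signed_fold_difference and the example intensities are hypothetical.

```python
def signed_fold_difference(treated, control):
    """Return a signed fold difference between two positive expression values.

    Assumed convention: a ratio >= 1 is reported as-is (up-regulation);
    a ratio < 1 is reported as -1/ratio (down-regulation).
    """
    if treated <= 0 or control <= 0:
        raise ValueError("expression values must be positive")
    ratio = treated / control
    return ratio if ratio >= 1 else -1.0 / ratio


# Hypothetical signal intensities, for illustration only.
up = signed_fold_difference(400.0, 100.0)    # treated is 4 times control
down = signed_fold_difference(100.0, 400.0)  # treated is one quarter of control
print(up, down)
```

Under this convention no value falls strictly between -1 and 1, which matches the tables, where every entry has an absolute value of at least 2.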

Fig. 6.1 Mining of AY-WB phytoplasma genome sequences for candidate effector proteins. Of 1,986 predicted ORFs, 144 were predicted to encode AY-WB phytoplasma candidate effector proteins. These include the 56 candidate effector proteins selected for PVX studies (black closed diamonds), 5 AY-WB phytoplasma proteins containing an NLS (blue closed squares), and the rest (blue open circles).


Fig. 6.2 Representative plant symptoms after toothpick inoculations of Nicotiana benthamiana leaves with transformed Agrobacterium tumefaciens GV3101. Plant leaves (upper panel) were examined under the light mode (middle panel) and the fluorescent mode (lower panel) of a fluorescence microscope. PVX:A11 induced necrotic spots (black arrows) and accumulation of fluorescent compounds (white arrows) in N. benthamiana leaves. Plant leaves inoculated with PVX:A42 appeared healthy under the light mode. However, fluorescent image of the leaves showed dark areas (arrowheads) that were not visible under light mode, suggesting the accumulation of yet unknown substances. PVX and PVX:inf1 were included as controls, and showed the typical mosaic symptom of PVX and the clear hypersensitive response induced by Inf1.


Fig. 6.3 Confocal laser-scanning microscopy images demonstrating the subcellular localization of YFP fusions of NLS-containing AY-WB phytoplasma proteins upon agroinfiltration into N. benthamiana leaves. (A) YFP:A11 targeted the plant cell nuclei (arrow) and YFP:A30 targeted the nucleoli (arrowhead) of plant cells. The negative (neg.) control, YFP only, was equally distributed between the cytoplasm and the nucleus of plant cells. Bars represent 50 µm. (B) Fluorescence intensity scans showed co-localizations of GFP:AtFib1, a functional fibrillarin homolog that targets the nucleolus (Barneche et al., 2000), and YFP:A30. Linear fluorescence intensity scans across the nucleus (bars in confocal micrographs at left) showed that AtFib1 and A30 had similar intensity patterns (graphs at right).
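The caption to Fig. 6.3B compares the two line-scan intensity profiles visually. One way such similarity could be quantified is a Pearson correlation between the two profiles sampled at the same positions; this is offered as an assumption for illustration and not as the method used for the figure. The profile values and the function name pearson are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical intensity profiles along one scan line across a nucleus.
gfp_profile = [5, 8, 40, 90, 42, 9, 5]   # e.g. a GFP fusion channel
yfp_profile = [6, 9, 35, 80, 38, 10, 6]  # e.g. a YFP fusion channel

r = pearson(gfp_profile, yfp_profile)
print(round(r, 3))
```

A coefficient close to 1 would indicate that the two signals rise and fall together along the scan line, which is what a shared nucleolar peak would produce.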


Fig. 6.4 The transport of YFP:A11 was dependent on N. benthamiana (Nb) importin α gene products. (A) RT-PCR with N. benthamiana importin α1- and α2-specific primers confirmed the silencing of these two genes in N. benthamiana plants treated with TRV constructs containing Nb importin α inserts. Noninoculated (healthy) and TRV-treated plants served as negative controls. RT-PCR with N. benthamiana tubulin gene-specific primers was included to control for equal amounts of total RNA and for RT-PCR reaction quality. (B) Confocal microscopy showed that YFP:A11 was not transported into plant nuclei in importin α1- and importin α2-silenced N. benthamiana plants, unlike in healthy N. benthamiana plants. The disrupted transport of YFP:A11 resulted in an equal distribution of YFP signal between the plant cell cytoplasm and nuclei, similar to the YFP-only control (C). In contrast, the nuclear transport of the GFP fusion of AtFib1 (Barneche et al., 2000) was not inhibited, because GFP:AtFib1 was localized in plant nucleoli in both healthy and importin α-silenced plants. Therefore, AtFib1 is transported into plant nucleoli in an importin-independent manner. Bars = 50 µm.


Fig. 6.5 The genes of AY-WB phytoplasma candidate effector proteins A11 and A30 were expressed during AY-WB phytoplasma infection of China aster plants. Transcripts of both genes were detected by RT-PCR in total RNA samples from AY-WB-infected plants (iRNA) but not in those from healthy plants (hRNA). RNase-treated iRNA served as a control to test for genomic DNA contamination in the RNA samples.

Fig. 6.6 The genes of AY-WB phytoplasma candidate effector proteins A11 and A30 were expressed in AY-WB-infected aster leafhoppers (Macrosteles quadrilineatus L.). Transcripts of both genes were detected by RT-PCR in total RNA samples from AY-WB phytoplasma-infected aster leafhoppers that were reared on infected aster plants for 1 week (lanes 2), 2 weeks (lanes 3) and 3 weeks (lanes 4), but not in those from leafhoppers that did not acquire AY-WB (lanes 1). AY-WB phytoplasma 16S rDNA primers served as a control to test for AY-WB genomic DNA contamination in the total RNA samples.


BIBLIOGRAPHY

1.

Achaz, G., Rocha, E.P.C., Netter, P. and Coissac, E. (2002) Origin and fate of repeats in bacteria. Nucleic Acids Res. 30, 2987-2994.

2.

Adams, M.D., Celniker, S.E., Holt, R.A., Evans, C.A., Gocayne, J.D., Amanatides, P.G., Scherer, S.E., Li, P.W., Hoskins, R.A., Galle, R.F., George, R.A., Lewis, S.E., Richards, S., Ashburner, M., Henderson, S.N., Sutton, G.G., Wortman, J.R., Yandell, M.D., Zhang, Q., Chen, L.X., Brandon, R.C., Rogers, Y.H., Blazej, R.G., Champe, M., Pfeiffer, B.D., Wan, K.H., Doyle, C., Baxter, E.G., Helt, G., Nelson, C.R., Gabor, G.L., Abril, J.F., Agbayani, A., An, H.J., Andrews-Pfannkoch, C., Baldwin, D., Ballew, R.M., Basu, A., Baxendale, J., Bayraktaroglu, L., Beasley, E.M., Beeson, K.Y., Benos, P.V., Berman, B.P., Bhandari, D., Bolshakov, S., Borkova, D., Botchan, M.R., Bouck, J., Brokstein, P., Brottier, P., Burtis, K.C., Busam, D.A., Butler, H., Cadieu, E., Center, A., Chandra, I., Cherry, J.M., Cawley, S., Dahlke, C., Davenport, L.B., Davies, P., de Pablos, B., Delcher, A., Deng, Z., Mays, A.D., Dew, I., Dietz, S.M., Dodson, K., Doup, L.E., Downes, M., Dugan-Rocha, S., Dunkov, B.C., Dunn, P., Durbin, K.J., Evangelista, C.C., Ferraz, C., Ferriera, S., Fleischmann, W., Fosler, C., Gabrielian, A.E., Garg, N.S., Gelbart, W.M., Glasser, K., Glodek, A., Gong, F., Gorrell, J.H., Gu, Z., Guan, P., Harris, M., Harris, N.L., Harvey, D., Heiman, T.J., Hernandez, J.R., Houck, J., Hostin, D., Houston, K.A., Howland, T.J., Wei, M.H., Ibegwam, C., Jalali, M., Kalush, F., Karpen, G.H., Ke, Z., Kennison, J.A., Ketchum, K.A., Kimmel, B.E., Kodira, C.D., Kraft, C., Kravitz, S., Kulp, D., Lai, Z., Lasko, P., Lei, Y., Levitsky, A.A., Li, J., Li, Z., Liang, Y., Lin, X., Liu, X., Mattei, B., McIntosh, T.C., McLeod, M.P., McPherson, D., Merkulov, G., Milshina, N.V., Mobarry, C., Morris, J., Moshrefi, A., Mount, S.M., Moy, M., Murphy, B., Murphy, L., Muzny, D.M., Nelson, D.L., Nelson, D.R., Nelson, K.A., Nixon, K., Nusskern, D.R., Pacleb, J.M., Palazzolo, M., Pittman, G.S., Pan, S., Pollard, J., Puri, V., Reese, M.G., Reinert, K., Remington, K., Saunders, R.D., Scheeler, F., Shen, H., Shue, B.C., Siden-Kiamos, I., Simpson, M., Skupski, M.P., Smith, T., Spier, E., Spradling, A.C., Stapleton, M., Strong, R., Sun, E., Svirskas, R., Tector, C., Turner, R., Venter, E., Wang, A.H., Wang, X., Wang, Z.Y., Wassarman, D.A., Weinstock, G.M., Weissenbach, J., Williams, S.M., Woodage, T., Worley, K.C., Wu, D., Yang, S., Yao, Q.A., Ye, J., Yeh, R.F., Zaveri, J.S., Zhan, M., Zhang, G., Zhao, Q., Zheng, L., Zheng, X.H., Zhong, F.N., Zhong, W., Zhou, X., Zhu, S., Zhu, X., Smith, H.O., Gibbs, R.A., Myers, E.W., Rubin, G.M. and Venter, J.C. (2000) The genome sequence of Drosophila melanogaster. Science 287, 2185-2195.

3.

Agar, J.N., Yuvaniyama, P., Jack, R.F., Cash, V.L., Smith, A.D., Dean, D.R. and Johnson, M.K. (2000) Modular organization and identification of a mononuclear iron-binding site within the NifU protein. J. Biol. Inorg. Chem. 5, 167-177.

4.

Akerley, B.J., Rubin, E.J., Camilli, A., Lampe, D.J., Robertson, H.M. and Mekalanos, J.J. (1998) Systematic identification of essential genes by in vitro mariner mutagenesis. Proc. Natl. Acad. Sci. USA 95, 8927-8932.

5.

Alma, A., Bosco, D., Danielli, A., Bertaccini, A., Vibio, M. and Arzone, A. (1997) Identification of phytoplasmas in eggs, nymphs and adults of Scaphoideus titanus Ball reared on healthy plants. Insect Mol. Biol. 6, 115-121.

6.

Altschul, S.F., Boguski, M.S., Gish, W. and Wootton, J.C. (1994) Issues in searching molecular sequence databases. Nat. Genet. 6, 119-129.

7.

Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403-410.

8.

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402.

9.

Ammar, El-D., Fulton, D., Bai, X., Meulia, T. and Hogenhout, S.A. (2004) An attachment tip and pili-like structures in insect- and plant-pathogenic spiroplasmas of the class Mollicutes. Arch. Microbiol. 181, 97-105.

10.

An, H., Roussot, C., Suarez-Lopez, P., Corbesier, L., Vincent, C., Pineiro, M., Hepworth, S., Mouradov, A., Justin, S., Turnbull, C. and Coupland, G. (2004) CONSTANS acts in the phloem to regulate a systemic signal that induces photoperiodic flowering of Arabidopsis. Development 131, 3615-3626.

11.

Anbutsu, H. and Fukatsu, T. (2003) Population dynamics of male-killing and non-male-killing spiroplasmas in Drosophila melanogaster. Appl. Environ. Microbiol. 69, 1428-1434.

12.

André, A., Maccheroni, W., Doignon, F., Garnier, M. and Renaudin, J. (2003) Glucose and trehalose PTS permeases of Spiroplasma citri probably share a single IIA domain, enabling the spiroplasma to adapt quickly to carbohydrate changes in its environment. Microbiology 149, 2687-2696.


13.

Bai, X. and Hogenhout, S.A. (2002) A genome sequence survey of the mollicute corn stunt spiroplasma Spiroplasma kunkelii. FEMS Microbiol. Lett. 210, 7-17.

14.

Bai, X., Fazzolari, T. and Hogenhout, S.A. (2004a) Identification and characterization of traE genes of Spiroplasma kunkelii. Gene 336, 81-91.

15.

Bai, X., Zhang, J., Holford, I.R. and Hogenhout, S.A. (2004b) Comparative genomics identifies genes shared by distantly related insect-transmitted plant pathogenic mollicutes. FEMS Microbiol. Lett. 235, 249-258.

16.

Badger, J.H. and Olsen, G.J. (1999) CRITICA: coding region identification tool invoking comparative analysis. Mol. Biol. Evol. 16, 512-524.

17.

Barneche, F., Steinmetz, F. and Echeverria, M. (2000) Fibrillarin genes encode both a conserved nucleolar protein and a novel small nucleolar RNA involved in ribosomal RNA methylation in Arabidopsis thaliana. J. Biol. Chem. 275, 27212-27220.

18.

Bateman, A., Coin, L., Durbin, R., Finn, R.D., Hollich, V., Griffiths-Jones, S., Khana, A., Marshall, M., Moxon, S., Sonnhammer, E.L.L., Holme, D.J.S., Yeats, C. and Eddy, S.R. (2004) The pfam protein families database. Nucleic Acids Res. 32, D138-D141.

19.

Bath, J., Wu, L.J., Errington, J. and Wang, J.C. (2000) Role of Bacillus subtilis SpoIIIE in DNA transport across the mother cell prespore division septum. Science 290, 995-997.

20.

Beanland, L., Hoy, C.W., Miller, S.A. and Nault, L.R. (2000) Influence of aster yellows phytoplasma on the fitness of aster leafhopper (Homoptera: Cicadellidae). Ann. Entomol. Soc. Am. 93, 271-276.

21.

Bébéar, C.-M., Aullo, P., Bové, J.M. and Renaudin, J. (1996) Spiroplasma citri virus SpV1: characterization of viral sequences present in the spiroplasmal host chromosome. Curr. Microbiol. 32, 134-140.

22.

Becker, A. and Theissen, G. (2003) The major clades of MADS-box genes and their role in the development and evolution of flowering plants. Mol. Phylogenet. Evol. 23, 464-489.

23.

Bendtsen, J.D., Nielsen, H., von Heijne, G. and Brunak, S. (2004) Improved prediction of signal peptides: SignalP 3.0. J. Mol. Biol. 340, 783-795.

24.

Benson, D.A., Karsch-Mizrachi, I., Lipman, D.J., Ostell, J. and Wheeler, D.L. (2003) GenBank. Nucleic Acids Res. 31, 23-27.

25.

Berg, H.C. (2002) How Spiroplasma might swim. J. Bacteriol. 184, 2063-2064.

26.

Berg, M., Melcher, U. and Fletcher, J. (2001) Characterization of Spiroplasma citri adhesion related protein SARP1, which contains a domain of a novel family designated sarpin. Gene 275, 57-64.

27.

Beven, L. and Wroblewski, H. (1997) Effect of natural amphipathic peptides on viability, membrane potential, cell shape and motility of mollicutes. Res. Microbiol. 148, 163-175.

28.

Bingle, L.E. and Thomas, C.M. (2001) Regulatory circuits for plasmid survival. Curr. Opin. Microbiol. 4, 194-200.

29.

Bork, P., Ouzounis, C., Casari, G., Schneider, R., Sander, C., Dolan, M., Gilbert, W. and Gillevet, P.M. (1995) Exploring the Mycoplasma capricolum genome: a minimal cell reveals its physiology. Mol. Microbiol. 16, 955-967.

30.

Boutareaud, A., Danet, J.L., Garnier, M. and Saillard, C. (2004) Disruption of a gene predicted to encode a solute binding protein of an ABC transporter reduces transmission of Spiroplasma citri by the leafhopper Circulifer haematoceps. Appl. Environ. Microbiol. 70, 3960-3967.

31.

Bové, J.M. (1997) Spiroplasmas: infectious agents of plants, arthropods and vertebrates. Wien. Klin. Wochenschr. 109, 604-612.

32.

Bové, J.M. and Garnier, M. (1997) In: Developments in Plant Pathology, Pathogen and Microbial Contamination Management in Micropropagation, Vol. 12 (Cassels, A.C., Ed.), pp. 45-60. Kluwer Academic Publishers, Dordrecht.

33.

Bové, J.M., Carle, P., Garnier, M., Laigret, F., Renaudin, J. and Saillard, C. (1989) Molecular and cellular biology of spiroplasmas, pp. 243-364. In R. F. Whitcomb and J. G. Tully (ed.), The mycoplasmas, vol. 5. Academic Press, New York.

34.

Bové, J.M., Renaudin, J., Saillard, C., Foissac, X. and Garnier, M. (2003) Spiroplasma citri, a plant pathogenic mollicute: relationships with its two hosts, the plant and the leafhopper vector. Annu. Rev. Phytopathol. 41, 482-500.

35.

Braun, E.J. and Sinclair, W.A. (1978) Translocation in phloem necrosis-diseased American elm seedlings. Phytopathology 68, 1733-1737.

36.

Bryant, C. and DeLuca, M. (1991) Purification and characterization of an oxygen-insensitive NAD(P)H nitroreductase from Enterobacter cloacae. J. Biol. Chem. 266, 4119-4125.


37.

Cabezon, E., Sastre, J.I. and de la Cruz, F. (1997) Genetic evidence of a coupling role for the TraG protein family in bacterial conjugation. Mol. Gen. Genet. 254, 400-406.

38.

Calcutt, M.J., Lewis, M.S. and Wise, K.S. (2002) Molecular genetic analysis of ICEF, an integrative conjugal element that is present as a repetitive sequence in the chromosome of Mycoplasma fermentans PG18. J. Bacteriol. 184, 6929-6941.

39.

Campos, N., Rodriguez-Concepcion, M., Seemann, M., Rohmer, M. and Boronat, A. (2001) Identification of gcpE as a novel gene of the 2-C-methyl-D-erythritol 4-phosphate pathway for isoprenoid biosynthesis in Escherichia coli. FEBS Lett. 488, 170-173.

40.

Carpousis, A.J. (2002) The Escherichia coli RNA degradosome: structure, function and relationship in other ribonucleolytic multienzyme complexes. Biochem. Soc. Trans. 30, 150-155.

41.

Casjens, S. (1998) The diverse and dynamic structure of bacterial genomes. Annu. Rev. Genet. 32, 339-377.

42.

Castano, S., Blaudez, D., Desbat, B., Dufourcq, J. and Wroblewski, H. (2002) Secondary structure of spiralin in solution, at the air/water interface, and in interaction with lipid monolayers. Biochim. Biophys. Acta 1562, 45-56.

43.

Catlin, P.B., Olson, E.A. and Beutel, J.A. (1975) Reduced translocation of carbon and nitrogen from leaves with symptoms of pear curl. J. Am. Soc. Hortic. Sci. 100, 184-187.

44.

Censini, S., Lange, C., Xiang, Z., Crabtree, J.E., Ghiara, P., Borodovsky, M., Rappuoli, R. and Covacci, A. (1996) cag, a pathogenicity island of Helicobacter pylori, encodes type I-specific and disease-associated virulence factors. Proc. Natl. Acad. Sci. USA 93, 14648-14653.

45.

Chambaud, I., Heilig, R., Ferris, S., Barbe, V., Samson, D., Galisson, F., Moszer, I., Dybvig, K., Wroblewski, H., Viari, A., Rocha, E.P.C. and Blanchard, A. (2001) The complete genome sequence of the murine respiratory pathogen Mycoplasma pulmonis. Nucleic Acids Res. 29, 2145-2153.

46.

Chang, C.J. (1989) Nutrition and cultivation of spiroplasmas. In: The Mycoplasmas (Whitcomb, R.F. and Tully, J.G. Ed.). Vol. 5, pp. 201-241. Academic Press, New York, NY.

47.

Chang, C.-J. (1998) Pathogenicity of aster yellows phytoplasma and Spiroplasma citri in periwinkle. Phytopathology 88, 1347-1350.

48.

Charbonneau, D.L. and Ghiorse, W.C. (1984) Ultrastructure and location of cytoplasmic fibrils in Spiroplasma floricola. Curr. Microbiol. 10, 65-72.

49.

Chastel, C. and Humphery-Smith, I. (1991) Mosquito spiroplasma. In: Advances in disease vector research (Harris, K.F. Ed.). vol. 7, pp. 149-206. Springer-Verlag Inc., New York, NY.

50.

Chevalier, C., Saillard, C. and Bové, J.M. (1990) Organization and nucleotide sequences of the Spiroplasma citri genes of ribosomal protein S2, elongation factor Ts, spiralin, phosphofructokinase, pyruvate kinase, and an unidentified protein. J. Bacteriol. 172, 2693-2703.

51.

Chiang, S.L., Mekalanos, J.J. and Holden, D.W. (1999) In vivo genetic analysis of bacterial virulence. Annu. Rev. Microbiol. 53, 129-154.

52.

Chiykowski, L.N. and Sinha, R.C. (1990) Differentiation of MLO diseases by means of symptomology and vector transmission. Zbl. Bakt. Suppl. 20, 280-287.

53.

Christie, P.J. and Vogel, J.P. (2000) Bacterial type IV secretion: conjugation systems adapted to deliver effector molecules to host cells. Trends Microbiol. 8, 354-360.

54.

Clements, M.O., Eriksson, S., Thompson, A., Lucchini, S., Hinton, J.C., Normark, S. and Rhen, M. (2002) Polynucleotide phosphorylase is a global regulator of virulence and persistency in Salmonella enterica. Proc. Natl. Acad. Sci. USA 99, 8784-8789.

55.

Cokol, M., Nair, R. and Rost, B. (2000) Finding nuclear localization signals. EMBO Rep. 1, 411-415.

56.

Collmer, A., Lindeberg, M., Petnicki-Ocwieja, T., Schneider, D.J. and Alfano, J.R. (2002) Genomic mining type III secretion system effectors in Pseudomonas syringae yields new picks for all TTSS prospectors. Trends Microbiol. 10, 462-469.

57.

Crowley, D.J. and Hanawalt, P.C. (2001) The SOS-dependent upregulation of uvrD is not required for efficient nucleotide excision repair of ultraviolet light induced DNA photoproducts in Escherichia coli. Mutat. Res. 485, 319-329.

58.

Dandekar, T., Huynen, M., Regula, J.T., Ueberle, B., Zimmermann, C.U., Andrade, M.A., Doerks, T., Sanchez-Pulido, L., Snel, B., Suyama, M., Yuan, Y.P., Herrmann, R. and Bork, P. (2000) Re-annotating the Mycoplasma pneumoniae genome sequence: adding value, function and reading frames. Nucleic Acids Res. 28, 3278-3288.

59.

Daniels, M.J. (1979) A simple technique for assaying certain microbial phytotoxins and its application to the study of toxins by Spiroplasma citri. J. Gen. Microbiol. 114, 323-328.

60.

Daniels, M.J. (1983) Mechanisms of spiroplasma pathogenicity. Annu. Rev. Phytopathol. 21, 29-43.

61.

Daniels, M.J., Longland, J.M. and Gilbart, J. (1980) Aspects of motility and chemotaxis in spiroplasmas. J. Gen. Microbiol. 118, 429-436.

62.

Davidson, A.L. and Chen, J. (2004) ATP-binding cassette transporters in bacteria. Annu. Rev. Biochem. 73, 241-268.

63.

de Oliveira, E., Magalhães, P.C., Gomide, R.L., Vasconcelos, C.A., Souza, I.R.P., Oliveira, C.M., Cruz, I. and Schaffert, R.E. (2002) Growth and nutrition of mollicute-infected maize. Plant Dis. 86, 945-949.

64.

Detmers, F.J.M., Lanfermeijer, F.C. and Poolman, B. (2001) Peptides and ATP binding cassette peptide transporters. Res. Microbiol. 152, 245-258.

65.

Dinesh-Kumar, S.P., Anandalakshmi, R., Marathe, R., Schiff, M. and Liu, Y. (2003) Virus-induced gene silencing. Methods Mol. Biol. 236, 287-294.

66.

Djordjevic, S.R., Forbes, W.A., Forbes-Faulkner, J., Kuhnert, P., Hum, S., Hornitzky, M.A., Vilei, E.M. and Frey, J. (2001) Genetic diversity among Mycoplasma species bovine group 7: clonal isolates from an outbreak of polyarthritis, mastitis, and abortion in dairy cattle. Electrophoresis 22, 3551-3561.

67.

Doi, M., Wachi, M., Ishino, F., Tomioka, S., Ito, M., Sakagami, Y., Suzuki, A. and Matsuhashi, M. (1988) Determination of the DNA sequence of the mreB gene and of the gene products of the mre region that function in formation of the rod shape of Escherichia coli cells. J. Bacteriol. 170, 4619-4624.

68.

Doi, Y., Teranaka, M., Yora, K. and Asuyama, H. (1967) Mycoplasma- or PLT group-like microorganisms found in the phloem elements of plants infected with mulberry dwarf, potato witches' broom, aster yellows, or paulownia witches' broom. Ann. Phytopathol. Soc. Jpn. 33, 259-266.

69.

Donovan, W.P. and Kushner, S.R. (1986) Polynucleotide phosphorylase and ribonuclease II are required for cell viability and mRNA turnover in Escherichia coli K-12. Proc. Natl. Acad. Sci. USA 83, 120-124.


70.

Duret, S., Berho, N., Danet, J.L., Garnier, M. and Renaudin, J. (2003) Spiralin is not essential for helicity, motility, or pathogenicity but is required for efficient transmission of Spiroplasma citri by its leafhopper vector Circulifer haematoceps. Appl. Environ. Microbiol. 69, 6225-6234.

71.

Ebbert, M.A. and Nault, L.R. (1994) Improved overwintering ability in Dalbulus maidis (Homoptera: Cicadellidae) vectors infected with Spiroplasma kunkelii (Mycoplasmatales: Spiroplasmataceae). Environ. Entomol. 23, 634-644.

72.

Ebbert, M.A. and Nault, L.R. (2001) Survival in Dalbulus leafhopper vectors improves after exposure to maize stunting pathogens. Entomol. Exp. Appl. 100, 311-324.

73.

Edman, M., Jarhede, T., Sjostrom, M. and Wieslander, A. (1999) Different sequence patterns in signal peptides from mycoplasmas, other gram-positive bacteria, and Escherichia coli: a multivariate data analysis. Proteins 35, 195-205.

74.

Emody, L., Kerenyi, M. and Nagy, G. (2003) Virulence factors of uropathogenic Escherichia coli. Int. J. Antimicrob. Agents S2, 29-33.

75.

Errington, J., Bath, J. and Wu, L.J. (2001) DNA transport in bacteria. Nat. Rev. Mol. Cell. Biol. 2, 538-545.

76.

Ewing, B. and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8, 186-194.

77.

Ewing, B., Hillier, L., Wendl, M.C. and Green, P. (1998) Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8, 175-185.

78.

Falah, M. and Gupta, R.S. (1997) Phylogenetic analysis of mycoplasmas based on Hsp70 sequences: cloning of the dnaK (hsp70) gene region of Mycoplasma capricolum. Int. J. Syst. Bacteriol. 47, 38-45.

79.

Fekkes, P. and Driessen, A.J. (1999) Protein targeting to the bacterial cytoplasmic membrane. Mol. Biol. Rev. 63, 161-173.

80.

Firrao, G., Smart, C.D. and Kirkpatrick, B.C. (1996) Physical map of the Western X-disease phytoplasma chromosome. J. Bacteriol. 178, 3985-3988.

81.

Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, A.R., Bult, C.J., Tomb, J.-F., Dougherty, B.A., Merrick, J.M., McKenney, K., Sutton, G.G., FitzHugh, W., Fields, C.A., Gocayne, J.D., Scott, J.D., Shirley, R., Liu, L.I., Glodek, A., Kelley, J.M., Weidman, J.F., Phillips, C.A., Spriggs, T., Hedblom, E., Cotton, M.D., Utterback, T., Hanna, M.C., Nguyen, D.T., Saudek, D.M., Brandon, R.C., Fine, L.D., Fritchman, J.L., Fuhrmann, J.L., Geoghagen, N.S., Gnehm, C.L., McDonald, L.A., Small, K.V., Fraser, C.M., Smith, H.O. and Venter, J.C. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269, 496-512.

82.

Fletcher, J., Wayadande, A.C., Melcher, U. and Ye, F. (1998) The phytopathogenic mollicute-insect vector interface: A closer look. Phytopathology 88, 1351-1358.

83.

Foissac, X., Bové, J.M. and Saillard, C. (1997) Sequence analysis of Spiroplasma phoeniceum and Spiroplasma kunkelii spiralin genes and comparison with other spiralin genes. Curr. Microbiol. 35, 240-243.

84.

Foissac, X., Danet, J.L., Saillard, C., Gaurivaud, P., Laigret, F., Pare, C. and Bové, J.M. (1997) Mutagenesis by insertion of Tn4001 into the genome of Spiroplasma citri: Characterization of mutants affected in plant pathogenicity and transmission to the plant by the leafhopper vector Circulifer haematoceps. Mol. Plant-Microbe Interact. 10, 454-461.

85.

Fraser, C.M., Gocayne, J.D., White, O., Adams, M.D., Clayton, R.A., Fleischmann, R.D., Bult, C.J., Kerlavage, A.R., Sutton, G.G., Kelley, J.M., Fritchman, J.L., Weidman, J.F., Small, K.V., Sandusky, M., Fuhrmann, J.L., Nguyen, D.T., Utterback, T., Saudek, D.M., Phillips, C.A., Merrick, J.M., Tomb, J., Dougherty, B.A., Bott, K.F., Hu, P.C., Lucier, T.S., Peterson, S.N., Smith, H.O. and Venter, J.C. (1995) The minimal gene complement of Mycoplasma genitalium. Science 270, 397-403.

86.

Fukatsu, T., Tsuchida, T., Nikoh, N. and Koga, R. (2001) Spiroplasma symbiont of the pea aphid, Acyrthosiphon pisum (Insecta: Homoptera). Appl. Environ. Microbiol. 67, 1284-1291.

87.

Gabridge, M.G., Chandler, K.F. and Daniels, M.J. (1985) Pathogenicity factors in mycoplasmas and spiroplasmas. In: The Mycoplasmas (Razin, S. and Barile, M.F. Ed.). Vol. 4, pp. 313-351. Academic Press, New York, NY.

88.

Gasparich, G.E. (2002) Spiroplasmas: evolution, adaptation and diversity. Front. Biosci. 7, 619-640.

89.

Gasparich, G.E., Whitcomb, R.F., Dodge, D., French, F.E., Glass, J. and Williamson, D.L. (2004) The genus Spiroplasma and its non-helical descendants: phylogenetic classification, correlation with phenotype and roots of the Mycoplasma mycoides clade. Int. J. Syst. Evol. Microbiol. 54, 893-918.

90.

Gattiker, A., Gasteiger, E. and Bairoch, A. (2002) ScanProsite: a reference implementation of a PROSITE scanning tool. Appl. Bioinformatics 1, 107-108.

91.

Gaur, N.K., Oppenheim, J. and Smith, I. (1991) The Bacillus subtilis sin gene, a regulator of alternate developmental processes, codes for a DNA-binding protein. J. Bacteriol. 173, 678-686.

92.

Gaurivaud, P., Laigret, F., Garnier, M. and Bové, J.M. (2000b) Fructose utilization and pathogenicity of Spiroplasma citri: characterization of the fructose operon. Gene 252, 61-69.

93.

Gaurivaud, P., Laigret, F., Garnier, M. and Bové, J.M. (2001) Characterization of FruR as a putative activator of the fructose operon of Spiroplasma citri. FEMS Microbiol. Lett. 198, 73-78.

94.

Gaurivaud, P., Danet, J.L., Laigret, F., Garnier, M. and Bové, J.M. (2000) Fructose utilization and phytopathogenicity of Spiroplasma citri. Mol. Plant-Microbe Interact. 13, 1145-1155.

95.

Geigenberger, P., Lerchl, J., Stitt, M. and Sonnewald, U. (1996) Phloem-specific expression of pyrophosphatase inhibits long-distance transport of carbohydrate and amino acids in tobacco plants. Plant Cell Environ. 19, 43-55.

96.

Gilad, R., Porat, A. and Trachtenberg, S. (2003) Motility modes of Spiroplasma melliferum BC3: a helical, wall-less bacterium driven by a linear motor. Mol. Microbiol. 47, 657-669.

97.

Glass, J.I., Lefkowitz, E.J., Glass, J.S., Heiner, C.R., Chen, E.Y. and Cassell, G.H. (2000) The complete sequence of the mucosal pathogen Ureaplasma urealyticum. Nature 407, 757-762.

98.

Goodin, M.M., Dietzgen, R.G., Schichnes, D., Ruzin, S. and Jackson, A.O. (2002) pGD vectors: versatile tools for the expression of green and red fluorescent protein fusions in agroinfiltrated plant leaves. Plant J. 31, 375-383.

99.

Gruenbaum, Y., Goldman, R.D., Meyuhas, R., Mills, E., Margalit, A., Fridkin, A., Dayani, Y., Prokocimer, M. and Enosh, A. (2003) The nuclear lamina and its functions in the nucleus. Int. Rev. Cytol. 226, 1-62.

100. Gundersen, D.E., Lee, I.-M., Schaff, D.A., Harrison, N.A., Chang, C.J., Davis, R.E. and Kingsbury, D.T. (1996) Genomic diversity and differentiation among phytoplasma strains in 16S rRNA groups I (aster yellows and related phytoplasmas) and III (X-disease and related phytoplasmas). Int. J. Syst. Bacteriol. 46, 64-75.

101. Gussie, J.S., Fletcher, J. and Claypool, P.L. (1995) Movement and multiplication of Spiroplasma kunkelii in corn. Phytopathology 85, 1093-1098.

102. Hackett, K.J. and Clark, T.B. (1989) Ecology of spiroplasmas. In: The Mycoplasmas (Whitcomb, R.F. and Tully, J.G. Ed.), Vol. V, Spiroplasmas, Acholeplasmas, and Mycoplasmas of Plants and Arthropods. pp. 113-200. Academic Press, San Diego, CA.

103. Harris, K.F. (1979) Leafhoppers and aphids as biological vectors: Vector-virus relationships. In: Leafhopper Vectors and Plant Disease Agents (Maramorosch, K. and Harris, K.F. Ed.). pp. 217-308. Academic Press, New York, NY.

104. Heger, A. and Holm, L. (2000) Rapid automatic detection and alignment of repeats in protein sequences. Proteins 41, 224-237.

105. Henriquez, P., Jeffers, D. and Seal, S. (1996) Detection of corn stunt mixed infections in Central America using ELISA and PCR techniques. Phytopathology 86, S58.

106. Himmelreich, R., Hilbert, H., Plagens, H., Pirkl, E., Li, B.C. and Herrmann, R. (1996) Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res. 24, 4420-4449.

107. Himmelreich, R., Plagens, H., Hilbert, H., Reiner, B. and Herrmann, R. (1997) Comparative analysis of the genomes of the bacteria Mycoplasma pneumoniae and Mycoplasma genitalium. Nucleic Acids Res. 25, 701-712.

108. Hirotsune, S., Yoshida, N., Chen, A., Garrett, L., Sugiyama, F., Takahashi, S., Yagami, K., Wynshaw-Boris, A. and Yoshiki, A. (2003) An expressed pseudogene regulates the messenger-RNA stability of its homologous coding gene. Nature 423, 91-96.

109. Hofreuter, D., Odenbreit, S. and Haas, R. (2001) Natural transformation competence in Helicobacter pylori is mediated by the basic components of a type IV secretion system. Mol. Microbiol. 41, 379-391.

110. Holsters, M., Silva, B., van Vliet, F., Genetello, C., De Block, M., Dhaese, P., Depicker, A., Inze, D., Engler, G., Villarroel, R., Van Montagu, M. and Schell, J. (1980) The functional organization of the nopaline A. tumefaciens plasmid pTiC58. Plasmid 3, 212-230.

111. Honzatko, R.B. and Fromm, H.J. (1999) Structure-function studies of adenylosuccinate synthase from Escherichia coli. Arch. Biochem. Biophys. 370, 1-8.

112. Hoy, C.W., Heady, S.E. and Koch, T.A. (1992) Species composition, phenology, and possible origins of leafhoppers (Cicadellidae) in Ohio vegetable crops. J. Econ. Entomol. 85, 2336-2343.

113. Hruska, A.J. and Gomez Peralta, M. (1997) Maize response to corn leafhopper (Homoptera: Cicadellidae) infestation and achaparramiento disease. J. Econ. Entomol. 90, 604-610.

114. IRPCM Phytoplasma/Spiroplasma Working Team – Phytoplasma Taxonomy Group. (2004) 'Candidatus Phytoplasma', a taxon for the wall-less, non-helical prokaryotes that colonize plant phloem and insects. Int. J. Syst. Evol. Microbiol. 54, 1243-1255.

115. Jackson, R.W., Athanassopoulos, E., Tsiamis, G., Mansfield, J.W., Sesma, A., Arnold, D.L., Gibbon, M.J., Murillo, J., Taylor, J.D. and Vivian, A. (1999) Identification of a pathogenicity island, which contains genes for virulence and avirulence, on a large native plasmid in the bean pathogen Pseudomonas syringae pathovar phaseolicola. Proc. Natl. Acad. Sci. USA. 96, 10875-10880.

116. Jacob, C., Nouzieres, F., Duret, S., Bové, J.M. and Renaudin, J. (1997) Isolation, characterization, and complementation of a motility mutant of Spiroplasma citri. J. Bacteriol. 179, 4802-4810.

117. Jaffe, J.D., Stange-Thomann, N., Smith, C., DeCaprio, D., Fisher, S., Butler, J., Calvo, S., Elkins, T., FitzGerald, M.G., Hafez, N., Kodira, C.D., Major, J., Wang, S., Wilkinson, J., Nicol, R., Nusbaum, C., Birren, B., Berg, H.C. and Church, G.M. (2004) The complete genome and proteome of Mycoplasma mobile. Genome Res. 14, 1447-1461.

118. Jarausch, W., Saillard, C., Helliot, B., Garnier, M. and Dosba, F. (2000) Genetic variability of apple proliferation phytoplasmas as determined by PCR-RFLP and sequencing of a non-ribosomal fragment. Mol. Cell. Probes 14, 17-24.

119. Jensen, D.D. (1959) A plant virus lethal to its vector. Virology 8, 164-175.

120. Jones, L., Hamilton, A.J., Voinnet, O., Thomas, C.L., Maule, A.J. and Baulcombe, D.C. (1999) RNA-DNA interactions and DNA methylation in post-transcriptional gene silencing. Plant Cell 11, 2291-2301.

121. Jones, L.J., Carballido-Lopez, R. and Errington, J. (2001) Control of cell shape in bacteria: helical, actin-like filaments in Bacillus subtilis. Cell 104, 913-922.

122. Kamla, V., Henrich, B. and Hadding, U. (1996) Phylogeny based on elongation factor Tu reflects the phenotypic features of mycoplasmas better than that based on 16S rRNA. Gene 171, 83-87.

123. Kamoun, S., Dong, S., Hamada, W., Huitema, E., Kinney, D., Morgan, W.R., Styer, A., Testa, A. and Torto, T.A. (2002) From sequence to phenotype: functional genomics of Phytophthora. Can. J. Plant Pathol. 24, 6-9.

124. Kamoun, S., Hamada, W. and Huitema, E. (2003) Agrosuppression: a bioassay for the hypersensitive response suited to high-throughput screening. Mol. Plant-Microbe Interact. 16, 7-13.

125. Kamoun, S., van West, P., de Jong, A.J., de Groot, K.E., Vleeshouwers, V.G. and Govers, F. (1997) A gene encoding a protein elicitor of Phytophthora infestans is down-regulated during infection of potato. Mol. Plant-Microbe Interact. 10, 13-20.

126. Kanamaru, K., Kashiwagi, S. and Mizuno, T. (1993) The cyanobacterium, Synechococcus sp. PCC7942, possesses two distinct genes encoding cation-transporting P-type ATPase. FEBS Lett. 330, 99-104.

127. Karplus, K., Karchin, R., Barrett, C., Tu, S., Cline, M., Diekhans, M., Grate, L., Casper, J. and Hughey, R. (2001) What is the value added by human intervention in protein structure prediction? Proteins 45, 86-91.

128. Kawakita, H., Saiki, T., Wei, W., Mitsuhashi, W., Watanabe, K. and Sato, M. (2000) Identification of mulberry dwarf phytoplasmas in the genital organs and eggs of leafhopper Hishimonoides sellatiformis. Phytopathology 90, 909-914.

129. King, K.W. and Dybvig, K. (1994) Mycoplasmal cloning vectors derived from plasmid pKMK1. Plasmid 31, 49-59.

130. Kirchhoff, H. (1992) Motility. In: Mycoplasmas: molecular biology and pathogenesis (Maniloff, J., McElhaney, R.N., Finch, L.R. and Baseman, J.B. Ed.). pp. 289-306. Am. Soc. Microbiol. Washington, DC.

131. Kirkpatrick, B.C. (1989) In Plant-Microbe Interactions: Molecular and Genetics Perspectives (Nester, E.W. Ed.), vol. 3, pp. 241-293. McGraw-Hill, New York, NY.

132. Kolenbrander, P.E., Andersen, R.N. and Ganeshkumar, N. (1994) Nucleotide sequence of the Streptococcus gordonii PK488 coaggregation adhesin gene, scaA, and ATP-binding cassette. Infect. Immun. 62, 4469-4480.

133. Kollar, A. and Seemüller, E. (1989) Base composition of the DNA of mycoplasmalike organisms associated with various plant diseases. J. Phytopathol. 127, 177-186.

134. Kreuzer, J., Denger, S., Reifers, F., Beisel, C., Haack, K., Gebert, J. and Kubler, W. (1996) Adenovirus-assisted lipofection: efficient in vitro gene transfer of luciferase and cytosine deaminase to human smooth muscle cells. Atherosclerosis 124, 49-60.

135. Krogh, A., Larsson, B., von Heijne, G. and Sonnhammer, E.L. (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567-580.

136. Krom, B.P., Warner, J.B., Konings, W.N. and Lolkema, J.S. (2003) Transporters involved in uptake of di- and tricarboxylates in Bacillus subtilis. Antonie Van Leeuwenhoek 84, 69-80.

137. Kurtz, S. and Schleiermacher, C. (1999) REPuter: fast computation of maximal repeats in complete genomes. Bioinformatics 15, 426-427.

138. Kwon, M.-O., Wayadande, A.C. and Fletcher, J. (1999) Spiroplasma citri movement into the intestines and salivary glands of its leafhopper vector, Circulifer tenellus. Phytopathology 89, 1144-1151.

139. Lai, C.-M. and Kado, C.I. (2000) The T-pilus of Agrobacterium tumefaciens. Trends Microbiol. 8, 361-369.

140. Laigret, F., Carle, P., Carrere, N., Garnier, M. and Bové, J.M. (2000) 13th International Congress of International Organization of Mycoplasmologists, abstract. 48.

141. Lartigue, C., Duret, S., Garnier, M. and Renaudin, J. (2002) New plasmid vectors for specific gene targeting in Spiroplasma citri. Plasmid 48, 149-159.

142. Lauer, U. and Seemüller, E. (2000) Physical map of the chromosome of the apple proliferation phytoplasma. J. Bacteriol. 182, 1415-1418.

143. Lee, I.-M. and Davis, R.E. (1989) Serum-free media for cultivation of spiroplasmas. Can. J. Microbiol. 35, 1092-1099.

144. Lee, I.-M., Gundersen-Rindal, D.E., Davis, R.E., Bottner, K.D., Marcone, C. and Seemüller, E. (2004) 'Candidatus Phytoplasma asteris', a novel phytoplasma taxon associated with aster yellows and related diseases. Int. J. Syst. Evol. Microbiol. 54, 1037-1048.

145. Lee, I.-M., Davis, R.E. and Gundersen-Rindal, D.E. (2000) Phytoplasma: phytopathogenic mollicutes. Annu. Rev. Microbiol. 54, 221-255.

146. Leister, R.T., Ausubel, F.M. and Katagiri, F. (1996) Molecular recognition of pathogen attack occurs inside of plant cells in plant disease resistance specified by the Arabidopsis genes RPS2 and RPM1. Proc. Natl. Acad. Sci. USA. 93, 15497-15502.

147. Lepka, P., Stitt, M., Moll, E. and Seemüller, E. (1999) Effect of phytoplasmal infection on concentration and translocation of carbohydrates and amino acids in periwinkle and tobacco. Physiol. Mol. Plant Pathol. 55, 59-68.

148. Li, P.L., Hwang, I., Miyagi, H., True, H. and Farrand, S.K. (1999) Essential components of the Ti plasmid trb system, a type IV macromolecular transporter. J. Bacteriol. 181, 5033-5041.

149. Li, Q.S., Gupta, J.D. and Hunt, A.G. (1998) Polynucleotide phosphorylase is a component of a novel plant poly(A) polymerase. J. Biol. Chem. 273, 17539-17543.

150. Liefting, L.W. and Kirkpatrick, B.C. (2003) Cosmid cloning and sample sequencing of the genome of the uncultivable mollicute, Western X-disease phytoplasma, using DNA purified by pulsed-field gel electrophoresis. FEMS Microbiol. Lett. 221, 203-211.

151. Lim, P.-O. and Sears, B.B. (1992) Evolutionary relationships of plant-pathogenic mycoplasmalike organism and Acholeplasma laidlawii deduced from two ribosomal protein gene sequences. J. Bacteriol. 174, 2606-2611.

152. Liu, Y., Schiff, M. and Dinesh-Kumar, S.P. (2002) Virus-induced gene silencing in tomato. Plant J. 31, 777-786.

153. Lopez, P., Martinez, S., Diaz, A., Espinosa, M. and Lacks, S.A. (1989) Characterization of the polA gene of Streptococcus pneumoniae and comparison of the DNA polymerase I it encodes to homologous enzymes from Escherichia coli and phage T7. J. Biol. Chem. 264, 4255-4263.

154. Luttinger, A., Hahn, J. and Dubnau, D. (1996) Polynucleotide phosphorylase is necessary for competence development in Bacillus subtilis. Mol. Microbiol. 19, 343-356.

155. Ma, G.T., Hong, Y.S. and Ives, D.H. (1995) Cloning and expression of the heterodimeric deoxyguanosine kinase/deoxyadenosine kinase of Lactobacillus acidophilus R-26. J. Biol. Chem. 270, 6595-6601.

156. Madden, L.V. and Nault, L.R. (1983) Differential pathogenicity of corn stunting mollicutes to leafhopper vectors in Dalbulus and Baldulus species. Phytopathology 73, 1608-1614.

157. Mahan, M.J., Slauch, J.M. and Mekalanos, J.J. (1993) Selection of bacterial virulence genes that are specifically induced in host tissues. Science 259, 686-688.

158. Mahillon, J. and Chandler, M. (1998) Insertion sequences. Microbiol. Mol. Biol. Rev. 62, 725-774.

159. Maniloff, J. (1996) The minimal cell genome: "on being the right size." Proc. Natl. Acad. Sci. USA 93, 10004-10006.

160. Mantsala, P. and Zalkin, H. (1992) Cloning and sequence of Bacillus subtilis purA and guaA, involved in the conversion of IMP to AMP and GMP. J. Bacteriol. 174, 1883-1890.

161. Macara, I.G. (2001) Transport into and out of the nucleus. Microbiol. Mol. Biol. Rev. 65, 570-594.

162. Marchler-Bauer, A., Anderson, J.B., DeWeese-Scott, C., Fedorova, N.D., Geer, L.Y., He, S., Hurwitz, D.I., Jackson, J.D., Jacobs, A.R., Lanczycki, C.J., Liebert, C.A., Liu, C., Madej, T., Marchler, G.H., Mazumder, R., Nikolskaya, A.N., Panchenko, A.R., Rao, B.S., Shoemaker, B.A., Simonyan, V., Song, J.S., Thiessen, P.A., Vasudevan, S., Wang, Y., Yamashita, R.A., Yin, J.J. and Bryant, S.H. (2003) CDD: a curated Entrez database of conserved domain alignments. Nucleic Acids Res. 31, 383-387.

163. Marcone, C. and Seemüller, E. (2001) A chromosome map of the European stone fruit yellows phytoplasma. Microbiology 147, 1213-1221.

164. Marcone, C., Neimark, A., Ragozzino, A., Lauer, U. and Seemüller, E. (1999) Chromosome sizes of phytoplasmas composing major phylogenetic groups and subgroups. Phytopathology 89, 805-810.

165. Markham, P.G. (1983) Spiroplasmas in leafhoppers: a review. Yale J. Biol. Med. 56, 745-751.

166. Masepohl, B., Angermuller, S., Hennecke, S., Hubner, P., Moreno-Vivian, C. and Klipp, W. (1993) Nucleotide sequence and genetic analysis of the Rhodobacter capsulatus ORF6-nifUI SVW gene region: possible role of NifW in homocitrate processing. Mol. Gen. Genet. 238, 369-382.

167. Melcher, U., Sha, Y., Ye, F. and Fletcher, J. (1999) Mechanisms of spiroplasma genome variation associated with SpV1-like viral DNA inferred from sequence comparisons. Microbiol. Comp. Genomics 4, 29-46.

168. Menestrina, G., Dalla Serra, M., Comai, M., Coraiola, M., Viero, G., Werner, S., Colin, D.A., Monteil, H. and Prevost, G. (2003) Ion channels and bacterial infection: the case of beta-barrel pore-forming protein toxin of Staphylococcus aureus. FEBS Lett. 552, 54-60.

169. Minion, F.C., Lefkowitz, E.J., Madsen, M.L., Cleary, B., Swartzell, S. and Mahairas, G.G. (2004) The genome sequence of Mycoplasma hyopneumoniae strain 232, the agent of swine mycoplasmosis. J. Bacteriol. In press.

170. Miyata, M. and Petersen, J.D. (2004) Spike structure at the interface between gliding Mycoplasma mobile cells and glass surfaces visualized by rapid-freeze-and-fracture electron microscopy. J. Bacteriol. 186, 4382-4386.

171. Miyata, M., Ryu, W. and Berg, H.C. (2002) Force and velocity of Mycoplasma mobile. J. Bacteriol. 184, 1827-1832.

172. Morowitz, H.J. and Tourtellotte, M.E. (1962) The smallest living cells. Sci. Am. 206, 117-126.

173. Morton, T.M., Eaton, D.M., Johnston, J.L. and Archer, G.L. (1993) DNA sequence and units of transcription of the conjugative transfer gene complex (trs) of Staphylococcus aureus plasmid pG01. J. Bacteriol. 175, 4436-4447.

174. Murral, D.J. (1994) M. S. thesis, The Ohio State University.

175. Murray, N.E. (2000) Type I restriction systems: sophisticated molecular machines (a legacy of Bertani and Weigle). Microbiol. Mol. Biol. Rev. 64, 412-434.

176. Mushegian, A.R. and Koonin, E.V. (1996) A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Proc. Natl. Acad. Sci. USA. 93, 10268-10273.

177. Nakai, K. and Horton, P. (1999) PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem. Sci. 24, 34-36.

178. Narita, S., Tanaka, K., Matsuyama, S. and Tokuda, H. (2002) Disruption of lolCDE, encoding an ATP-Binding Cassette transporter, is lethal for Escherichia coli and prevents release of lipoprotein from the inner membrane. J. Bacteriol. 184, 1417-1422.

179. Nault, L.R. (1980) Maize bushy stunt and corn stunt: a comparison of disease symptoms, pathogen host ranges and vectors. Phytopathology 70, 709-712.

180. Nault, L.R. (1997) Arthropod transmission of plant viruses: A new synthesis. Ann. Entomol. Soc. Am. 90, 521-541.

181. Nault, L.R., Madden, L.V., Styer, W.E., Triplehorn, B.W., Shambaugh, G.F. and Heady, S.E. (1984) Pathogenicity of corn stunt spiroplasma and maize bushy stunt mycoplasma to their vector, Dalbulus longulus. Phytopathology 74, 977-979.

182. Neimark, H. and Kirkpatrick, B.C. (1993) Isolation and characterization of full-length chromosomes from non-culturable plant-pathogenic Mycoplasma-like organisms. Mol. Microbiol. 7, 21-28.

183. Nielsen, H., Engelbrecht, J., Brunak, S. and von Heijne, G. (1997a) Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10, 1-6.

184. Nielsen, H., Engelbrecht, J., Brunak, S. and von Heijne, G. (1997b) A neural network method for identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Int. J. Neural. Sys. 8, 581-599.

185. Nies, D.H. (2003) Efflux-mediated heavy metal resistance in prokaryotes. FEMS Microbiol. Rev. 27, 313-339.

186. Niittylä, T., Messerli, G., Trevisan, M., Chen, J., Smith, A.M. and Zeeman, S.C. (2004) A previously unknown maltose transporter essential for starch degradation in leaves. Science 303, 87-89.

187. Nishio, K. and Nakai, M. (2000) Transfer of iron-sulfur cluster from NifU to apoferredoxin. J. Biol. Chem. 275, 22615-22618.

188. Novak, R., Braun, J.S., Charpentier, E. and Tuomanen, E. (1998) Penicillin tolerance genes of Streptococcus pneumoniae: the ABC-type manganese permease complex Psa. Mol. Microbiol. 29, 1285-1296.

189. Okinaka, R., Cloud, K., Hampton, O., Hoffmaster, A., Hill, K., Keim, P., Koehler, T., Lamke, G., Kumano, S., Manter, D., Martinez, Y., Ricke, D., Svensson, R. and Jackson, P. (1999) Sequence, assembly and analysis of pX01 and pX02. J. Appl. Microbiol. 87, 261-262.

190. Oshima, K., Kakizawa, S., Nishigawa, H., Jung, H.-Y., Wei, W., Suzuki, S., Arashida, R., Nakata, D., Miyata, S., Ugaki, M. and Namba, S. (2004) Reductive evolution suggested from the complete genome sequence of a plant-pathogenic phytoplasma. Nat. Genet. 36, 27-29.

191. Oshima, K., Miyata, S., Sawayanagi, T., Kakizawa, S., Nishigawa, H., Jung, H.-Y., Furuki, K., Yanazaki, M., Suzuki, S., Wei, W., Kuboyama, T., Ugaki, M. and Namba, S. (2002) Minimal set of metabolic pathways suggested from the genome of Onion Yellows phytoplasma. J. Gen. Plant Pathol. 68, 225-236.

192. Osipiuk, J., Gornicki, P., Maj, L., Dementieva, I., Laskowski, R. and Joachimiak, A. (2001) Streptococcus pneumonia YlxR at 1.35 A shows a putative new fold. Acta Crystallogr. D Biol. Crystallogr. 57, 1747-1751.

193. Oussenko, I.A. and Bechhofer, D.H. (2000) The yvaJ gene of Bacillus subtilis encodes a 3′-to-5′ exoribonuclease and is not essential in a strain lacking polynucleotide phosphorylase. J. Bacteriol. 182, 2639-2642.

194. Ouzounis, C., Bork, P. and Sander, C. (1994) The modular structure of NifU proteins. Trends Biochem. Sci. 19, 199-200.

195. Özbek, E., Miller, S.A., Meulia, T. and Hogenhout, S.A. (2003) Infection and replication sites of Spiroplasma kunkelii (Class: Mollicutes) in midgut and Malpighian tubules of the leafhopper Dalbulus maidis. J. Invertebr. Pathol. 82, 167-175.

196. Padovan, A.C., Firrao, G., Schneider, B. and Gibb, K.S. (2000) Chromosome mapping of the sweet potato little leaf phytoplasma reveals genome heterogeneity within the phytoplasmas. Microbiology 146, 893-902.

197. Papazisi, L., Gorton, T.S., Kutish, G., Markham, P.F., Browning, G.F., Nguyen, D.K., Swartzell, S., Madan, A., Mahairas, G. and Geary, S.J. (2003) The complete genome sequence of the avian pathogen Mycoplasma gallisepticum strain R(low). Microbiology (Reading, Engl.) 149, 2307-2316.

198. Piper, B., Rosengarten, R. and Kirchhoff, H. (1987) The influence of various substances on the gliding motility of Mycoplasma mobile 163K. J. Gen. Microbiol. 133, 3193-3198.

199. Postma, P.W., Lengeler, J.W. and Jacobson, G.R. (1993) Phosphoenolpyruvate:carbohydrate phosphotransferase systems of bacteria. Microbiol. Rev. 57, 543-594.

200. Purcell, A.H. (1982) Insect vector relationship with procaryotic plant pathogens. Annu. Rev. Phytopathol. 20, 397-417.

201. Purcell, A.H. (1988) Increased survival of Dalbulus maidis Delong & Wolcott, a specialist on maize, on non-host plants infected with mollicute plant pathogens. Entomol. Exp. Appl. 46, 187-196.

202. Purcell, A.H. and Nault, L.R. (1991) Interactions among plant pathogenic prokaryotes, plants, and insect vectors. In: Microbial Mediation of Plant-Herbivore Interactions (Barbosa, P., Krischik, V.A. and Jones, C.G. Ed.) pp. 383-405. John Wiley & Sons, Inc. Indianapolis, IN.

203. Qutob, D., Kamoun, S. and Gijzen, M. (2002) Expression of a Phytophthora sojae necrosis-inducing protein occurs during transition from biotrophy to necrotrophy. Plant J. 32, 361-373.

204. Ratcliff, F., Martin-Hernandez, A.M. and Baulcombe, D.C. (2001) Tobacco rattle virus as a vector for analysis of gene function by silencing. Plant J. 25, 237-245.

205. Rather, P.N., Solinsky, K.A., Paradise, M.R. and Parojcic, M.M. (1997) aarC, an essential gene involved in density-dependent regulation of the 2′-N-acetyltransferase in Providencia stuartii. J. Bacteriol. 179, 2267-2273.

206. Razin, S. (1978) The mycoplasmas. Microbiol. Rev. 42, 414-470.

207. Razin, S. (1994) DNA probes and PCR in diagnosis of mycoplasma infections. Mol. Cell. Probes 8, 497-511.

208. Razin, S., Yogev, D. and Naot, Y. (1998) Molecular biology and pathogenicity of mycoplasmas. Microbiol. Mol. Biol. Rev. 62, 1094-1156.

209. Renaudin, J. (2002) Extrachromosomal elements and gene transfer. In: Molecular Biology and Pathogenicity of Mycoplasmas (Razin, S. and Herrmann, R. Ed.). pp. 347-370. Kluwer Academic/Plenum, New York, NY.

210. Renaudin, J., Marais, A., Verdin, E., Duret, S., Foissac, X., Laigret, F. and Bové, J.M. (1995) Integrative and free Spiroplasma citri oriC plasmids: expression of the Spiroplasma phoeniceum spiralin in Spiroplasma citri. J. Bacteriol. 177, 2870-2877.

211. Rice, P.A. and Baker, T.A. (2001) Comparative architecture of transposase and integrase complexes. Nat. Struct. Biol. 8, 302-307.

212. Riley, M. (1993) Functions of the gene products of Escherichia coli. Microbiol. Rev. 57, 862-952.

213. Rosch, J. and Caparon, M. (2004) A microdomain for protein secretion in Gram-positive bacteria. Science 304, 1513-1515.

214. Ruiz, M.T., Voinnet, O. and Baulcombe, D.C. (1998) Initiation and maintenance of virus-induced gene silencing. Plant Cell 10, 937-946.

215. Saglio, P., L'hospital, M., Lafleche, D., Dupont, G., Bové, J.M., Tully, J.G. and Freundt, E.A. (1973) Spiroplasma citri gen. and sp. n.: a mycoplasma-like organism associated with 'stubborn' disease of citrus. Int. J. Syst. Bacteriol. 23, 191-204.

216. Saillard, C., Vignault, J.C., Bové, J.M., Raie, A., Tully, J.G., Williamson, D.L., Fos, A., Garnier, M., Gadeau, A., Carle, P. and Whitcomb, R.F. (1987) Spiroplasma phoeniceum sp. nov., a new plant-pathogenic species from Syria. Int. J. Syst. Bacteriol. 37, 106-115.

217. Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) Molecular Cloning: a Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

218. Sasaki, Y., Ishikawa, J., Yamashita, A., Oshima, K., Kenri, T., Furuya, K., Yoshino, C., Horino, A., Shiba, T., Sasaki, T. and Hattori, M. (2002) The complete genomic sequence of Mycoplasma penetrans, an intracellular bacterial pathogen in humans. Nucleic Acids Res. 30, 5293-5300.

219. Schneider, B., Gibb, K.S. and Seemüller, E. (1997) Sequence and RFLP analysis of the elongation factor Tu gene used in differentiation and classification of phytoplasmas. Microbiology 143, 3381-3389.

220. Schomburg, I., Chang, A., Ebeling, C., Gremse, M., Heldt, C., Huhn, G. and Schomburg, D. (2004) BRENDA, the enzyme database: updates and major new developments. Nucleic Acids Res. 32, D431-D433.

221. Sears, B.B., Klomparens, K.L., Wood, J.I. and Schewe, G. (1997) Effect of altered levels of oxygen and carbon dioxide on phytoplasma abundance in Oenothera leaftip cultures. Physiol. Mol. Plant Pathol. 50, 275-287.

222. Seemüller, E., Garnier, M. and Schneider, B. (2002) Mycoplasmas of plants and insects. In: Molecular biology and pathogenicity of mycoplasmas (Razin, S. and Herrmann, R. Ed.). vol. 1, pp. 91-115. Kluwer Academic/Plenum Publishers, New York, NY.

223. Selkov, E., Overbeek, R., Kogen, Y., Chu, L., Vonstein, V., Holmes, D., Silver, S., Haselkorn, R. and Fonstein, M. (2000) Functional analysis of gapped microbial genomes: Amino acid metabolism of Thiobacillus ferrooxidans. Proc. Natl. Acad. Sci. USA 97, 3509-3514.

224. Simpson, A.J., Reinach, F.C., Arruda, P., Abreu, F.A., Acencio, M., Alvarenga, R., Alves, L.M., Araya, J.E., Baia, G.S., Baptista, C.S., Barros, M.H., Bonaccorsi, E.D., Bordin, S., Bové, J.M., Briones, M.R., Bueno, M.R., Camargo, A.A., Camargo, L.E., Carraro, D.M., Carrer, H., Colauto, N.B., Colombo, C., Costa, F.F., Costa, M.C., Costa-Neto, C.M., Coutinho, L.L., Cristofani, M., Dias-Neto, E., Docena, C., El-Dorry, H., Facincani, A.P., Ferreira, A.J., Ferreira, V.C., Ferro, J.A., Fraga, J.S., Franca, S.C., Franco, M.C., Frohme, M., Furlan, L.R., Garnier, M., Goldman, G.H., Goldman, M.H., Gomes, S.L., Gruber, A., Ho, P.L., Hoheisel, J.D., Junqueira, M.L., Kemper, E.L., Kitajima, J.P., Krieger, J.E., Kuramae, E.E., Laigret, F., Lambais, M.R., Leite, L.C., Lemos, E.G., Lemos, M.V., Lopes, S.A., Lopes, C.R., Machado, J.A., Machado, M.A., Madeira, A.M., Madeira, H.M., Marino, C.L., Marques, M.V., Martins, E.A., Martins, E.M., Matsukuma, A.Y., Menck, C.F., Miracca, E.C., Miyaki, C.Y., Monteiro-Vitorello, C.B., Moon, D.H., Nagai, M.A., Nascimento, A.L., Netto, L.E., Nhani Jr., A., Nobrega, F.G., Nunes, L.R., Oliveira, M.A., de Oliveira, M.C., de Oliveira, R.C., Palmieri, D.A., Paris, A., Peixoto, B.R., Pereira, G.A., Pereira Jr., H.A., Pesquero, J.B., Quaggio, R.B., Roberto, P.G., Rodrigues, V., de, M.R.A.J., de Rosa Jr., V.E., de Sa, R.G., Santelli, R.V., Sawasaki, H.E., da Silva, A.C., da Silva, A.M., da Silva, F.R., da Silva Jr., W.A., da Silveira, J.F., Silvestri, M.L., Siqueira, W.J., de Souza, A.A., de Souza, A.P., Terenzi, M.F., Truffi, D., Tsai, S.M., Tsuhako, M.H., Vallada, H., Van Sluys, M.A., Verjovski-Almeida, S., Vettore, A.L., Zago, M.A., Zatz, M., Meidanis, J. and Setubal, J.C. (2000) The genome sequence of the plant pathogen Xylella fastidiosa. Nature 406, 151-157.

225. Singer, S., Ferone, R., Walton, L. and Elwell, L. (1985) Isolation of a dihydrofolate reductase-deficient mutant of Escherichia coli. J. Bacteriol. 164, 470-472.

226. Smart, C.D., Schneider, B., Blomquist, C.L., Guerra, L.J., Harrison, N.A., Ahrens, U., Lorenz, K.-H., Seemüller, E. and Kirkpatrick, B.C. (1996) Phytoplasma-specific PCR primers based on sequences of the 16S-23S rRNA spacer region. Appl. Environ. Microbiol. 62, 2988-2993.

227. Sonnhammer, E.L.L., von Heijne, G. and Krogh, A. (1998) A hidden Markov model for predicting transmembrane helices in protein sequences, pp. 175-182. In E. J. Glasgow, T. Littlejohn, F. Major, R. Lathrop, D. Sankoff, and C. Sensen (ed.), Proceedings of Sixth International Conference on Intelligent Systems for Molecular Biology. Menlo Park, CA: AAAI Press.

228. Swofford, D. (2001) PAUP* 4.0. Sinauer Associates.

229. Tabor, C.W. and Tabor, H. (1984) Polyamines. Annu. Rev. Biochem. 53, 749-790.

230. Takken, F.L., Luderer, R., Gabriels, S.H., Westerink, N., Lu, R., de Wit, P.J. and Joosten, M.H. (2000) A functional cloning strategy, based on a binary PVX-expression vector, to isolate HR-inducing cDNAs of plant pathogens. Plant J. 24, 275-283.

231. Tang, X., Frederick, R.D., Zhou, J., Halterman, D.A., Jia, Y. and Martin, G.B. (1996) Initiation of plant disease resistance by physical interaction of AvrPto and Pto kinase. Science 274, 2060-2062.

232. Thiaucourt, F., Lorenzon, S., David, A. and Breard, A. (2000) Phylogeny of the Mycoplasma mycoides cluster as shown by sequencing of a putative membrane protein gene. Vet. Microbiol. 72, 251-268.

233. Thompson, D.V., Melchers, L.S., Idler, K.B., Schilperoort, R.A. and Hooykaas, P.J. (1988) Analysis of the complete nucleotide sequence of the Agrobacterium tumefaciens virB operon. Nucleic Acids Res. 16, 4621-4636.

234. Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F. and Higgins, D.G. (1997) The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876-4882.

235. Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673-4680.

236. Torto, T.A., Li, S., Styer, A., Huitema, E., Testa, A., Gow, N.A., van West, P. and Kamoun, S. (2003) EST mining and functional expression assays identify extracellular effector proteins from the plant pathogen Phytophthora. Genome Res. 13, 1675-1685.

237. Torto, T.A., Rauser, L. and Kamoun, S. (2002) The pipg1 gene of the oomycete Phytophthora infestans encodes a fungal-like endopolygalacturonase. Curr. Genet. 40, 385-390.

238. Tourancheau, A.B., Morin, L., Yang, T. and Perasso, R. (1999) Messenger RNA in dormant cells of Sterkiella histriomuscorum (Oxytrichidae): identification of putative regulatory gene transcripts. Protist 150, 137-147.


239. Trachtenberg, S. (1998) Mollicutes - wall-less bacteria with internal cytoskeletons. J. Struct. Biol. 124, 244-256.
240. Trachtenberg, S. (2004) Shaping and moving a spiroplasma. J. Mol. Microbiol. Biotechnol. 7, 78-87.
241. Trachtenberg, S. and Gilad, R. (2001) A bacterial linear motor: cellular and molecular organization of the contractile cytoskeleton of the helical bacterium Spiroplasma melliferum BC3. Mol. Microbiol. 41, 827-848.
242. Trachtenberg, S., Gilad, R. and Geffen, N. (2003) The bacterial linear motor of Spiroplasma melliferum BC3: from single molecules to swimming cells. Mol. Microbiol. 47, 671-697.
243. Tsai, J.H. (1979) Vector transmission of mycoplasmal agents of plant diseases. In: The Mycoplasmas (Whitcomb, R.F. and Tully, J.G. Eds.), Vol. III, Plant and Insect Mycoplasmas. pp. 265-307. Academic Press, New York, NY.
244. Tsiamis, G., Mansfield, J.W., Hockenhull, R., Jackson, R.W., Sesma, A., Athanassopoulos, E., Bennett, M.A., Stevens, C., Vivian, A., Taylor, J.D. and Murillo, J. (2000) Cultivar-specific avirulence and virulence functions assigned to avrPphF in Pseudomonas syringae pv. phaseolicola, the cause of bean halo-blight disease. EMBO J. 19, 3204-3214.
245. Uenoyama, A., Kusumoto, A. and Miyata, M. (2004) Identification of a 349-kilodalton protein (Gli349) responsible for cytadherence and glass binding during gliding of Mycoplasma mobile. J. Bacteriol. 186, 1537-1545.
246. Van den Ackerveken, G., Marois, E. and Bonas, U. (1996) Recognition of the bacterial avirulence protein AvrBs3 occurs inside the host plant cell. Cell 87, 1307-1316.
247. van den Ent, F., Amos, L.A. and Lowe, J. (2001) Prokaryotic origin of the actin cytoskeleton. Nature 413, 39-44.
248. Vanin, E.F. (1985) Processed pseudogene: Characteristics and evolution. Annu. Rev. Genet. 19, 253-272.
249. von Heijne, G. (1985) Signal sequences: The limits of variation. J. Mol. Biol. 184, 99-105.
250. Walsh, C. and Cepko, C.L. (1992) Widespread dispersion of neuronal clones across functional regions of the cerebral cortex. Science 255, 434-440.


251. Walters, D.R. (2003) Polyamines and plant disease. Phytochemistry 64, 97-107.
252. Wang, W. and Bechhofer, D.H. (1996) Properties of a Bacillus subtilis polynucleotide phosphorylase deletion strain. J. Bacteriol. 178, 2375-2382.
253. Wayadande, A.C., Shaw, M.E. and Fletcher, J. (1993) Tests of differential transmission of three Spiroplasma citri lines by the leafhopper, Circulifer tenellus. Phytopathology 83, 468.
254. Weiner III, J., Herrmann, R. and Browning, G.F. (2000) Transcription in Mycoplasma pneumoniae. Nucleic Acids Res. 28, 4488-4496.
255. Weisburg, W.G., Tully, J.G., Rose, D.L., Petzel, J.P., Oyaizu, H., Yang, D., Mandelco, L., Sechrest, J., Lawrence, T.G., van Etten, J., Maniloff, J. and Woese, C.R. (1989) A phylogenetic analysis of the mycoplasmas: basis for their classification. J. Bacteriol. 171, 6455-6467.
256. Westberg, J., Persson, A., Holmberg, A., Goesmann, A., Lundeberg, J., Johansson, K.E., Pettersson, B. and Uhlen, M. (2004) The genome sequence of Mycoplasma mycoides subsp. mycoides SC type strain PG1T, the causative agent of contagious bovine pleuropneumonia (CBPP). Genome Res. 14, 221-227.
257. Wheeler, D.L., Church, D.M., Federhen, S., Lash, A.E., Madden, T.L., Pontius, J.U., Schuler, G.D., Schriml, L.M., Sequeira, E., Tatusova, T.A. and Wagner, L. (2003) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 31, 28-33.
258. Whitcomb, R.F., Chen, T.A., Williamson, D.L., Liao, C., Tully, J.G., Bové, J.M., Mouches, C., Rose, D.L., Coan, M.E. and Clark, T.B. (1986) Spiroplasma kunkelii sp. nov.: characterization of the etiological agent of corn stunt disease. Int. J. Syst. Bacteriol. 36, 170-178.
259. Williamson, D.L., Brink, P.R. and Zieve, G.W. (1984) Spiroplasma fibrils. Isr. J. Med. Sci. 20, 830-835.
260. Williamson, D.L., Renaudin, J. and Bové, J.M. (1991) Nucleotide sequence of the Spiroplasma citri fibril protein gene. J. Bacteriol. 173, 4353-4362.
261. Woese, C.R. (1987) Bacterial evolution. Microbiol. Rev. 51, 221-271.
262. Wolf, M., Muller, T., Dandekar, T. and Pollack, J.D. (2004) Phylogeny of Firmicutes with special reference to Mycoplasma (Mollicutes) as inferred from phosphoglycerate kinase amino acid sequence data. Int. J. Syst. Evol. Microbiol. 54, 871-875.

263. Wolgemuth, C.W., Igoshin, O. and Oster, G. (2003) The motility of mollicutes. Biophys. J. 85, 828-842.
264. Wright, F. (1990) The 'effective number of codons' used in a gene. Gene 87, 23-29.
265. Yang, Y. and Gabriel, D.W. (1995) Xanthomonas avirulence/pathogenicity gene family encodes functional plant nuclear targeting signals. Mol. Plant Microbe Interact. 8, 627-631.
266. Ye, F., Laigret, F. and Bové, J.M. (1994) A physical and genomic map of the prokaryote Spiroplasma melliferum and its comparison with the Spiroplasma citri map. C. R. Acad. Sci. 317, 392-398.
267. Ye, F., Laigret, F., Carle, P. and Bové, J.M. (1995) Chromosomal heterogeneity among various strains of Spiroplasma citri. Int. J. Syst. Bacteriol. 45, 729-734.
268. Ye, F., Melcher, U., Rascoe, J.E. and Fletcher, J. (1996) Extensive chromosome aberrations in Spiroplasma citri strain BR3. Biochem. Genet. 34, 269-286.
269. Ye, F., Renaudin, J., Bové, J.M. and Laigret, F. (1994) Cloning and sequencing of the replication origin (oriC) of the Spiroplasma citri chromosome and construction of autonomously replicating artificial plasmids. Curr. Microbiol. 29, 23-29.
270. Yu, J., Wayadande, A.C. and Fletcher, J. (2000) Spiroplasma citri surface protein P89 implicated in adhesion to cells of the vector Circulifer tenellus. Phytopathology 90, 716-722.
271. Zatyka, M. and Thomas, C.M. (1998) Control of genes for conjugative transfer of plasmids and other mobile elements. FEMS Microbiol. Rev. 21, 291-319.
272. Zhang, J., Hogenhout, S.A., Nault, L.R., Hoy, C.W. and Miller, S.A. (2004) Molecular and symptom analyses of phytoplasma strains from lettuce reveal a diverse population. Phytopathology 94, 842-849.
273. Zhang, Q., Soares de Oliveira, S., Colangeli, R. and Gennaro, M.L. (1997) Binding of a novel host factor to the pT181 replication enhancer. J. Bacteriol. 23, 191-204.
274. Zhao, Y., Hammond, R.W., Jomantiene, R., Dally, E.L., Lee, I.M., Jia, H., Wu, H., Lin, S., Zhang, P., Kenton, S., Najar, F.Z., Hua, A., Roe, B.A., Fletcher, J. and Davis, R.E. (2003) Gene content and organization of an 85-kb DNA segment from the genome of the phytopathogenic mollicute Spiroplasma kunkelii. Mol. Genet. Genomics 269, 592-602.


275. Zhao, Y., Hammond, R.W., Lee, I.M., Roe, B.A., Lin, S. and Davis, R.E. (2004) Cell division gene cluster in Spiroplasma kunkelii: functional characterization of ftsZ and the first report of ftsA in mollicutes. DNA Cell Biol. 23, 127-134.
276. Zhao, Y., Wang, H., Hammond, R.W., Jomantiene, R., Liu, Q., Lin, S., Roe, B.A. and Davis, R.E. (2004) Predicted ATP-binding cassette systems in the phytopathogenic mollicute Spiroplasma kunkelii. Mol. Genet. Genomics 271, 325-338.
277. Zhong, B.X. and Shen, Y.W. (2004) Accumulation of pathogenesis-related type-5 like proteins in phytoplasma-infected garland chrysanthemum Chrysanthemum coronarium. Acta Biochim. Biophys. Sin. (Shanghai) 36, 773-779.
278. Zuo, Y. and Deutscher, M.P. (2001) Exoribonuclease superfamilies: structural analysis and phylogenetic distribution. Nucleic Acids Res. 29, 1017-1026.
