RNA RECOGNITION: CONTROLLING RNA-PROTEIN COMPLEXES WITH SMALL MOLECULES


BY SREENIVASA RAO RAMISETTY

DISSERTATION Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Chemistry in the Graduate College of the University of Illinois at Urbana-Champaign, 2010

Urbana, Illinois

Doctoral Committee:
Associate Professor Anne M. Baranger, Chair
Professor Steven C. Zimmerman
Professor John A. Katzenellenbogen
Associate Professor Paul J. Hergenrother


ABSTRACT

PART I. Investigation of Small Molecule Binding to an RNA Hairpin Loop Containing a Dangling End
PART II. Unraveling the Interaction of Pathogenic RNAs with the MBNL1 Protein and Complex Inhibition by Small Molecules

PART I. RNA plays important and versatile roles in gene expression by both carrying and regulating the information used to direct protein synthesis. Therefore, small molecules able to bind to RNA and alter these biological processes would be of great utility. This part of my thesis describes the virtual screening and identification of a quinoline derivative that binds cooperatively to a GCAA RNA tetraloop containing a 3’ dangling end (tGCAA). The compound NSC5485 (QD2) was identified by a similarity search of the NCI database of 250,000 compounds and docking with the program AutoDock 3. Fluorescence and ITC experiments revealed that QD2 binds cooperatively to four identical binding sites on the tGCAA RNA hairpin. The equilibrium dissociation constant for each of the four identical binding sites is 8.2 (±0.4) µM. CD spectroscopy and UV titration experiments suggested that binding of QD2 changes the conformation of the RNA and perturbs the QD2 chromophore.

PART II. Trinucleotide repeat expansions are the genetic cause of numerous human diseases, including Huntington’s disease, Fragile X mental retardation, and myotonic dystrophy type 1. Myotonic dystrophy (DM1 and DM2) is an autosomal dominant neuromuscular disorder associated with a (CTG)n expansion in the 3’-untranslated region of the dystrophia myotonica protein kinase (DMPK) gene (DM1) or a (CCTG)n expansion in intron 1 of the zinc finger protein 9 (ZNF9) gene (DM2). The disease is characterized by wasting of the muscles (muscular dystrophy), eye-lens opacity, and myotonia. The pathogenic poly(CUG)RNA and poly(CCUG)RNA bind to and sequester key proteins, such as MBNL1 (muscleblind-like


protein 1), preventing them from regulating proper splicing of different pre-mRNAs. The severity of disease correlates with the length of the repeat tract in peripheral blood. The first part of this project investigates the interaction of the MBNL1N protein with poly(CUG)RNA. We are interested in identifying the amino acids or zinc finger domains of the MBNL1N protein that are important for recognition of poly(CUG)RNA. To address this question, we performed alanine scanning on six amino acids, expressed truncated versions of the protein, and studied their interaction with poly(CUG)RNA by gel-shift assays. In the second part, gel-shift assays show that small molecules inhibit the complexes formed between the toxic poly(CUG)RNA or poly(CCUG)RNA and the MBNL1 protein. We identified triaminotriazine-acridine and triaminopyrimidine-acridine conjugates that specifically inhibit the (CUG)12 and (CCUG)6 complexes with the MBNL1N protein, respectively. Thus the triaminotriazine-acridine and triaminopyrimidine-acridine conjugates are potential lead compounds for targeting DM1 and DM2, respectively.


To my parents and wife


ACKNOWLEDGMENTS

It is with immense pleasure that I take this opportunity to express my sincere thanks to my research advisor, Professor Anne M. Baranger, for her encouragement, proficient guidance, suggestions and kind co-operation during my graduate career. I would also like to thank my committee members Prof. Paul Hergenrother, Prof. Steven C. Zimmerman and Prof. John A. Katzenellenbogen for their support. I would like to thank all the members of the Baranger group for their support and encouragement. Thanks to Douglas Warui, Divina Anunciado and Dr. Yan Fan for their friendship and support. I want to express my sincere thanks to Dr. Zhaohui Yan for introducing me to the Part I research project and for showing me computational docking and fluorescence experiments. My special thanks go to Dr. Douglas Warui, who encouraged me whenever I had a hard time running the project, and for useful discussions about most of the experiments. I would also like to thank Dr. Jonathan F. Arambula, Chun-Ho Wong and Yuan Fu for their contributions to the success of the MBNL project and for helpful discussions about research. I am grateful to my family members for their love and support during the entire course of my studies. Finally, thanks to my love, my wife Mansu, who supported me even though she was going through a tough time in her life. Her support enabled me to finish my project on time.


TABLE OF CONTENTS

PART I. Investigation of Small Molecule Binding to an RNA Hairpin Loop Containing a Dangling End

Chapter 1 RNA as Target for Small Molecules
1.1 Introduction
1.2 RNA structural motifs
1.3 RNA binding pockets for small molecule binding
1.4 Molecular recognition of RNA by small molecules
1.5 RNA-small molecule binding
1.5.1 Duplex binders
1.5.2 External loop or Hairpin binders
1.5.3 Internal loop binders
1.5.4 Bulge binders
1.6 Affinity vs specificity
1.7 Conclusion
1.8 References

Chapter 2 Cooperative Binding of a Quinoline Derivative to an RNA Hairpin Loop Containing a Dangling End
2.1 Introduction
2.2 Results and discussion
2.2.1 Selective recognition of 3’-dangling end of a hairpin by QD1
2.2.2 Isothermal titration calorimetry assays
2.2.3 Virtual screening of NCI database
2.2.4 Screening of molecules by fluorescence assay
2.2.5 Stoichiometry of binding
2.2.6 Isothermal titration calorimetry experiment
2.2.7 CD spectroscopy
2.2.8 UV reverse titration experiment
2.2.9 Binding specificity of QD2 molecule
2.3 Conclusion
2.4 Experimental section
2.4.1 Computational studies using AutoDock
2.4.2 Materials and methods
2.4.3 RNA purification by denaturing PAGE
2.4.4 Fluorescence experiments
2.4.5 Isothermal titration calorimetry experiments
2.4.6 CD spectroscopy
2.4.7 UV reverse titration experiment
2.5 References

PART II. Unraveling the Interaction of Pathogenic RNAs with the MBNL1 Protein and Complex Inhibition by Small Molecules

Chapter 3 Unraveling the Interaction of Poly(CUG)RNA with the MBNL1 Protein
3.1 Significance of RNA-protein interactions
3.2 RNA-protein recognition
3.3 RNA binding proteins in human diseases
3.4 RNA dominant diseases
3.5 Myotonic dystrophies
3.6 Molecular genetics of DM1 and DM2
3.7 Clinical features of the myotonic dystrophies
3.8 RNA pathogenesis in myotonic dystrophy
3.9 Misregulation of alternative splicing
3.10 Research aims
3.11 Introduction
3.12 Results and discussion
3.12.1 MBNL1N interaction with poly(CUG)RNA repeats and natural target RNAs
3.12.2 Identification of amino acids in MBNL1N that are important for binding to RNA
3.12.3 Investigation of essential zinc finger domains for recognizing RNA
3.13 Conclusions
3.14 Materials and methods
3.14.1 Recombinant protein expression and purification
3.14.2 SDS gel electrophoresis
3.14.3 Radio labeling of 5’ end of RNA
3.14.4 Equilibrium binding assays
3.14.5 Site directed mutagenesis (Alanine scanning)
3.14.6 Molecular cloning
3.14.7 Transcription of p(CTG)54 and p(CTG)90 plasmids
3.15 References

Chapter 4 Inhibition of Pathogenic RNAs-MBNL1N Complexes with Small Molecules
4.1 Introduction
4.2 Results and discussion
4.3 Inhibition of the poly(CUG)RNA-MBNL1N complex
4.3.1 Rational design of ligands
4.3.2 Screening small molecules for inhibition of the poly(CUG)RNA-MBNL1N complex
4.3.3 Binding of melamine-acridine conjugate with RNA mismatches and tRNA
4.3.4 Inhibition of poly(CUG)RNA-MBNL1N complexes
4.3.5 Specificity of melamine-acridine conjugate (compound 1)
4.4 Inhibition of poly(CCUG)RNA-MBNL1N complex
4.4.1 MBNL1N binding with (CCUG)6 RNA
4.4.2 Ligand design
4.4.3 Screening of inhibition of the poly(CCUG)RNA-MBNL1N complex by small molecules
4.4.4 Inhibition of the (CCUG)6-MBNL1N complex
4.4.5 Specificity of compounds 5 and 6
4.5 Conclusion
4.6 Experimental section
4.6.1 Materials and methods
4.6.2 Inhibition assay
4.6.3 Isothermal titration calorimetry experiments
4.7 References


CHAPTER 1 RNA as Target for Small Molecules

1.1 Introduction

RNA plays essential roles in gene replication, transcription and translation, which direct protein synthesis in living organisms.1-3 RNA also catalyzes the maturation of mRNAs via ribozymes, and these processes are highly regulated by various ribonucleoprotein (RNP)-RNA interactions.4-5 RNA has several advantages over traditional protein targets as a drug target: sequence-specific binding, selective inhibition, more sites available for interaction, multivalent drug targeting, and difficulty in mutating RNA.5 RNAs fold into various structures that form binding sites for proteins and other RNAs and that are important for the correct functioning of RNA.6-8 Recent advances in RNA synthesis, structure determination, and therapeutic target identification have created new opportunities in drug discovery. The structural flexibility of RNA gives rise to many possible secondary and tertiary structures and allows structure-specific as well as sequence-specific recognition of RNA by small molecules.9 Small molecules that bind to bacterial ribosomal RNA and inhibit protein synthesis are available as potential antibiotics.10-11 The structural diversity of RNA and its importance in cellular function may make it a therapeutic target for treating many diseases. Therefore, small molecules that bind to RNA and affect any of these biological processes would be of great utility.12-14

1.2 RNA structural motifs

In general, RNA folds so that complementary sequences form double helices; the possible secondary structures are shown in Figure 1-1. These hairpin loop, bulge, and internal loop secondary structures are useful for the formation of tertiary structures via interactions of preformed structures that precisely present chemical moieties that are essential for the function of RNA as a biological catalyst, translator of genetic information, and structural scaffold.15 Secondary structure motifs can be divided into two classes: local structural motifs and global structural motifs. The local motifs influence RNA structure only in their immediate vicinity but may be involved in tertiary interactions. Conversely, RNA global structural motifs distort the relationship between helices by unwinding and/or bending/kinking or by involvement in tertiary interactions. These tertiary interactions, which are formed between distinct secondary structural elements, play a dominant role in establishing the global fold of the structure.16 The hydrogen bonding interactions of the 2’-hydroxyl group, base stacking, binding of divalent metal cations, noncanonical base pairing, and backbone topology all serve to stabilize the global structure of RNA. The growth in RNA structural analysis has led to the discovery of many secondary and tertiary motifs, including pseudoknots, ribose zippers, tetraloop motifs, adenine platforms, G-ribo motifs, A minor motifs, base triplexes, and metal core motifs.15, 17-18 These diverse RNA structural motifs may be targets for small molecules.

Figure 1-1. Common RNA secondary structural elements (panels: duplex, bulge, internal loop, stem loop).

1.3 RNA binding pockets for small molecule binding

In designing small molecules to bind RNA, it is useful to consider the similarities and differences between recognition of RNA and recognition of DNA and proteins. RNA has an A-form helical conformation with a shallow, wide minor groove and a narrow, deep major groove. This structure makes it more difficult for small molecules to bind in the major and minor grooves of RNA than in those of DNA. Studies performed by Crothers suggested that the major groove of the fully base-paired A-form helix is only 4 Å wide, which precludes much small molecule binding.19 The 2’-hydroxyl group present in the minor groove of RNA results in a puckering of the ribose group, leading to a change in the helix pitch and a tilt of the bases. Phosphate groups are displayed mostly in the major groove. Unmatched or mismatched bases in RNA duplexes widen the major groove and provide surface-exposed pockets favorable for ligand binding. The binding pockets that are formed by various RNA secondary and tertiary structures provide the basis for selective targeting of RNA with protein and small molecule ligands.13 These structured sites bring appropriate groups into the correct topology to selectively recognize and interact with complementary groups on the small molecule, in a manner very similar to the shape-dependent interactions of small molecules with proteins.

1.4 Molecular recognition of RNA by small molecules

There are several different ways for a molecule to interact with RNA. Non-covalent or reversible interactions play a substantial role in determining binding specificity. Electrostatic interactions with the phosphate backbone are largely nonspecific and are commonly observed in most small molecule-RNA complexes. These interactions are important for enhancing the binding of small molecules and generally occur along the exterior of the helix.

A second mode of binding is groove-bound association. This interaction involves direct hydrogen bonding or van der Waals interaction with the nucleic acid bases in the deep major groove or shallow minor groove of the RNA helix. The hydroxyl and amine groups of aminoglycosides form hydrogen bonds with the RNA backbone and play important roles in RNA binding of aminoglycosides. The electrostatic and groove-binding modes do not require a change in conformation, but alteration of the RNA structure is possible upon binding.20 The third type of binding mode is intercalation, which is observed between aromatic ligands and RNA bases. This binding mode requires a distortion of the RNA helix in order to accommodate the binding ligand. Stacking interactions are relatively non-specific, but diverse functional groups can be introduced into the stacked scaffolds in order to confer binding specificity. Stacking interactions have been observed in several aminoglycoside-RNA complexes as binding between a sugar ring and RNA bases.

1.5 RNA-small molecule binding

Research on targeting RNA has increased since the success of aminoglycoside antibiotics, which target ribosomes, and specifically ribosomal RNA, in a number of microorganisms. The aminoglycosides continue to provide very important paradigms for understanding RNA recognition. A number of highly specific RNA-small molecule complexes have been recently reviewed by Jason Thomas and Paul Hergenrother.13 In recent years, three new classes of RNA molecules have been identified as drug targets. The first class is catalytic RNA molecules, or ribozymes, which include the hammerhead ribozyme and self-splicing group I introns. The second class, called aptamers, comprises sequences that have been selected in vitro for specific and high affinity binding. The third class comprises RNA targets containing protein binding sites, including TAR (trans-activating region) and RRE (Rev responsive element) of HIV.

RNA is also an attractive target for treating a number of emerging human diseases that are caused by RNA viruses, including HIV and hemorrhagic fever viruses such as dengue and Ebola,21 and for controlling various biological functions. There has been some success in the development of agents that target the TAR and RRE hairpins that control HIV replication. Recently determined structures of the RNA modules involved in unique RNA functions, such as ribozymes in splicing, have revealed targets for the development of drugs that inhibit cellular functions at very important steps.22 Small molecules have been shown to recognize diverse RNA structural motifs such as duplexes, loops, bulges, and pseudoknots. The main objective of the work described in this thesis is the targeting of RNA hairpins and mismatches, including tetraloops containing dangling ends and U-U mismatches in RNA duplexes. Small molecules targeted to various RNA structural motifs are briefly reviewed in the following discussion.

1.5.1 Duplex binders

Small molecules may bind to the duplex region of RNA through intercalation or groove binding. Intercalation occurs between the base pairs and is governed by stacking interactions. Groove binding is governed by shape complementarity and hydrogen bonding, although charge and steric effects also play roles. Groove binders can be further divided into two subclasses: minor groove binders and major groove binders. Neomycin (1, Figure 1-2), the most potent aminoglycoside antibiotic, was shown to bind to the lower stem of the HIV-1 TAR RNA.23 This was confirmed by NMR data obtained for the HIV-1 TAR RNA-neomycin B complex, which showed that neomycin binds to the minor groove of the TAR lower stem. This binding site is quite unusual because aminoglycosides generally bind in the major groove of other RNA targets.24 Another aminoglycoside derivative, tobramycin (2, Figure 1-2), was found to bind to the major groove of the polymeric RNA duplex poly(rI)·poly(rC), which exhibits a characteristic A-type conformation. The binding of tobramycin to duplex RNA was characterized by a combination of spectroscopic, calorimetric, viscometric, and computer modeling techniques.25 2-Phenylquinoline derivatives (3, Figure 1-2) were shown to bind to the poly A-U RNA duplex by intercalation. The substituent at position X controls the binding affinity. The highest binding affinity was observed when the piperazyl substituent is at the para position.26 Kinetic and modeling results suggest that the para-substituted phenylquinoline binds to duplex RNA by threading intercalation, while the ortho- and meta-substituted phenylquinolines bind to duplex RNA by classical intercalation. Threading intercalation is a useful strategy for enhancing affinity and specificity.27 Wilson and coworkers showed that most of their diphenylfuran derivatives (4, Figure 1-2) bind to a poly A-U RNA duplex by intercalation instead of by association with the minor groove, even though diphenylfurans recognize the poly A-T DNA duplex by association with the minor groove.28

Figure 1-2. Small molecules binding to RNA duplexes.

1.5.2 External loop or Hairpin binders

Hairpin loops are the most abundant secondary structures after the duplex region.29 Hairpin loops are formed when a sequence folds back on itself to form a duplex that is linked through a single strand of nucleotides. These are also called stem loops. The loop size varies with the number of unpaired bases present in the single strand. The stability is controlled by the loop size; hexa- and heptaloops are the favored loop sizes because six to seven nucleotides is the ideal length to span an A-form helix. These loops mainly provide sites of nucleation for RNA folding and participate in RNA-protein and RNA-RNA interactions.30

Figure 1-3. RNA hairpin loop binders.

Despite the importance of loops, only a few small molecules have been shown to bind stem loops of any size or sequence. Mei and co-workers were the first to discover a small molecule (5, Figure 1-3) with demonstrated affinity for hairpin loops. This compound was shown to bind to the loop region of TAR RNA (A, Figure 1-4), which inhibited the interaction between Tat and TAR of HIV.31 Our laboratory investigated the binding of a small molecule (6, Figure 1-3) to stem loop 2 of U1 snRNA (B, Figure 1-4), which interacts with the U1A protein in the spliceosome. This compound was shown to inhibit the U1A-RNA complex with an IC50 of 1.0 µM.32 Our laboratory also investigated a series of commercially available RNA-binding compounds that were identified by computational docking and found to bind the GNRA tetraloop (C, Figure 1-4) and stem loop 3 RNA of the packaging signal Ψ of HIV-1 (D, Figure 1-4).33-35 Compounds 7 and 8 (Figure 1-3) were reported to bind the GNRA tetraloop and stem loop 3 RNA with binding constants of 0.7 µM and 1.4 µM, respectively. Hergenrother and co-workers designed 2-deoxystreptamine (DOS) dimers, compounds 9, 10, 11, and 12 (Figure 1-3), and investigated their binding to various hairpin loop structures (E-H, Figure 1-4). In one study, compound 9 was found to bind to hairpin RNAs E, F, G, and H (Figure 1-4) with dissociation constants of 34 µM, 11 µM, 11 µM, and 8 µM, respectively.36 Compounds 11 and 12 were found to be selective towards tetraloops and octaloops, respectively, and bound with weaker affinity to other RNA hairpins and double-stranded RNA.37
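As a rough illustration of what the dissociation constants quoted above for compound 9 imply, the following minimal sketch estimates the fraction of each hairpin RNA that would be bound at a given free ligand concentration. It assumes a simple 1:1 binding model with free ligand in excess, which the cited studies may not satisfy; the function names and the 10 µM ligand concentration are illustrative choices, not values from the original work.

```python
def fraction_bound(ligand_um: float, kd_um: float) -> float:
    """Fraction of RNA bound in a 1:1 complex: [L] / (Kd + [L]).

    Both arguments are in micromolar; ligand_um is the FREE ligand
    concentration (assumed to be in excess over RNA).
    """
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("ligand_um must be >= 0 and kd_um must be > 0")
    return ligand_um / (kd_um + ligand_um)


# Dissociation constants (micromolar) quoted in the text for compound 9
# binding to hairpins E, F, G, and H.
KD_COMPOUND_9_UM = {"E": 34.0, "F": 11.0, "G": 11.0, "H": 8.0}


def occupancy_table(ligand_um: float, kds_um: dict) -> dict:
    """Map each RNA name to its fraction bound at one ligand concentration."""
    return {name: fraction_bound(ligand_um, kd) for name, kd in kds_um.items()}


if __name__ == "__main__":
    # 10 micromolar is a hypothetical ligand concentration for illustration.
    for hairpin, frac in occupancy_table(10.0, KD_COMPOUND_9_UM).items():
        print(f"hairpin {hairpin}: fraction bound = {frac:.2f}")
```

Under this model the fraction bound is exactly one half when the free ligand concentration equals the dissociation constant, so a lower dissociation constant means higher occupancy at any fixed ligand concentration.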

Figure 1-4. RNA hairpin loop structures used for the discovery of various stem loop binders.

1.5.3 Internal loop binders

Internal loops can be symmetric or asymmetric loops within the duplex. These loops contain one (mismatch loop) or more unpaired or mispaired bases on each strand of the duplex. These sites generally make the major groove of the RNA more accessible, and they undergo conformational changes upon ligand binding more easily than duplex RNA does.19 The binding of small molecules to internal loops in the 16S A-site rRNA, RRE RNA, and thymidylate synthase mRNA has been extensively studied.13 After the initial discovery of aminoglycoside antibiotics, which target 16S A-site rRNA, scientists modified them to make better drugs with less toxicity and more activity. Wong and coworkers synthesized a library of bifunctional aminoglycosides by dimerizing neamine (13, Figure 1-5) to overcome the problem of aminoglycoside-modifying enzymes.38 A series of alkyl-linked neamine dimers of varying tether lengths, linked through amide or carbamate linkages, was synthesized. Compound 14 (KD = 0.04 µM) (Figure 1-5) binds to the 16S A-site with higher affinity than neamine (KD = 10 µM). After this initial success with neamine dimers, simple sugars were dimerized, including 2-deoxystreptamine (DOS), 6’-aminoglucosamine, and glucosamine, which are the core building blocks of the neamine and paromamine ring systems.39 The binding of these molecules to the 16S A-site was studied by ESI-MS. The dimers were found to bind to the 16S A-site better than the respective monomers, but not better than the neamine dimers. Mobashery and coworkers used a computational approach to identify neamine derivatives with enhanced binding affinity for 16S A-site RNA and reduced susceptibility to aminoglycoside-modifying enzymes.40 Boons and co-workers synthesized a series of neamine mimics by replacing the sugar units with various disaccharides. In order to identify mimics of aminoglycosides that are more synthetically tractable for chemical optimization, Ding and co-workers synthesized a library of heterocyclic 2-deoxystreptamine conjugates substituted at the 4-position.41
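The affinity gain from dimerization can also be put on an energy scale. The sketch below converts the two dissociation constants quoted above for compound 14 and neamine into standard binding free energies using ΔG° = RT ln(KD). A temperature of 298.15 K and a 1 M standard state are assumptions made here for illustration, and the function names are hypothetical rather than taken from the original work.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temp_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol: dG = R * T * ln(Kd).

    kd_molar is the dissociation constant in mol/L (1 M standard state).
    More negative values correspond to tighter binding.
    """
    if kd_molar <= 0 or temp_k <= 0:
        raise ValueError("kd_molar and temp_k must be positive")
    return R_KCAL_PER_MOL_K * temp_k * math.log(kd_molar)


def fold_improvement(kd_reference: float, kd_new: float) -> float:
    """How many times tighter the new ligand binds than the reference."""
    return kd_reference / kd_new


if __name__ == "__main__":
    kd_neamine = 10e-6       # 10 micromolar, as quoted in the text
    kd_compound_14 = 0.04e-6  # 0.04 micromolar, as quoted in the text

    fold = fold_improvement(kd_neamine, kd_compound_14)
    ddg = binding_free_energy(kd_compound_14) - binding_free_energy(kd_neamine)
    print(f"fold improvement in affinity: {fold:.0f}")
    print(f"change in binding free energy: {ddg:.2f} kcal/mol")
```

With these inputs the dimer binds about 250-fold more tightly than neamine, which corresponds to roughly 3.3 kcal/mol of additional binding free energy at the assumed temperature.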

Figure 1-5. RNA internal loop binders.

The diphenylfuran derivative (4a, Figure 1-2) was found to bind to the HIV-1 RRE RNA internal loop by a threading intercalation mode.42 Chemical footprinting experiments revealed that it bound to the RRE RNA more strongly than to duplex RNA. It was proposed that this compound binds to the G-C base pair just below the critical internal loop to exert its inhibitory effects. In contrast, another diphenylfuran derivative (4b, Figure 1-2) was shown to selectively recognize the poly A-U duplex. This shows that binding specificity can be tuned by changing the substituents of the diphenylfuran scaffold.

Tor and co-workers enhanced the affinity and specificity of the interaction of aminoglycosides with RRE by replacing the amino group with a guanidinium group.43 Tor and coworkers also developed aminoglycosides conjugated with an acridine intercalator.44 The acridine conjugate and guanidinoglycoside of neomycin (15, 16, Figure 1-5) inhibited the Rev-RRE complex with IC50 values of 0.04 µM and 0.8 µM, respectively, which is better than neomycin (1, Figure 1-2, IC50 7.0 µM). Cho and co-workers screened various DNA intercalators and minor groove binding compounds (Hoechst 33258, DAPI, distamycin A) for their ability to bind to thymidylate synthase mRNA. Hoechst 33258 was the most effective compound, with a binding constant of 60 nM. In general, the groove binding compounds bound better than the intercalators.45

1.5.4 Bulge binders

Bulges are formed when the two strands of a duplex contain unequal numbers of bases. The unpaired nucleotide in a single-base bulge can either stack in the duplex or loop out into solution, depending on the base composition. Bulges are always destabilizing toward duplex formation and become more destabilizing with increasing bulge size.46-47 Weeks and Crothers have suggested that certain base bulges lead to greater major groove accessibility, thus creating sites for binding by proteins and small molecules.48 The most commonly targeted RNA bulges are the trans-activating region (TAR) RNA, the iron response element (IRE) RNA, and T-box RNA. The initial success of the binding of ethidium to TAR RNA led Bailly and co-workers to examine the ability of several intercalators (ethidium, proflavine), DNA minor groove binders (DAPI, netropsin, berenil, and Hoechst 33258), and a threading intercalator (amsacrine-4-carboxamide derivative SN16713) (17, Figure 1-6) to bind TAR RNA. All of the compounds except netropsin (18, Figure 1-6) bound to TAR with an intercalative mode of binding.49 An aminoquinoline derivative was discovered by high-throughput screening to inhibit the Tat-TAR interaction by binding to the same bulge where the Tat protein binds.23 This compound binds to RNA through stacking interactions and hydrogen bonds and has a relatively low molecular weight.4, 50 A library of 3,4,5-trisubstituted oxazolidinones was evaluated for binding to T-box RNA because their low charge decreases nonspecific interactions with RNA.51 The natural product yohimbine (19, Figure 1-6) was found to bind to multiple secondary structures of the fIRE, which prevents the binding of IRP to the IRE and increases the level of ferritin protein by increasing translation.52

Figure 1-6. Examples for RNA bulge binders

1.6 Affinity vs specificity

Affinity describes how strongly a ligand binds to its target, while specificity is a measure of how preferentially a ligand binds to a target molecule over other molecules. Ligands designed to target RNA need to bind with both affinity and specificity. The affinity of ligands for RNA targets can be increased by conjugating ligands that have two different binding modes44, 57-58 or by designing dimerized ligands.36-37, 59 Tor and co-workers enhanced the affinity and specificity of the interaction of aminoglycosides with the RRE by replacing the amino groups with guanidinium groups.43 The same group also designed a series of acridine-conjugated aminoglycosides and aminoglycoside dimers that bind to the HIV-1 RRE RNA with high affinity.44, 60 They also designed a series of neomycin-acridine conjugates with varied linker lengths that bind to the HIV-1 RRE RNA, which is a good example of increasing the specificity of ligands. The neo-acridine with the shortest linker prefers the RRE RNA, the one with the longest linker prefers duplex DNA, and the one with the intermediate linker prefers duplex RNA. These results show that specificity can be controlled by the linker length and that affinity can be enhanced by the conjugation strategy. Compared to the neomycin-acridine conjugates, the tobramycin-acridine and kanamycin-acridine conjugates have slightly lower affinity but higher specificity for the RRE, which suggests an inverse relationship between affinity and specificity. Thus, designing ligands with high affinity or high specificity is possible, but achieving both in one design is a difficult task. Although improved affinity and specificity were obtained using aminoglycoside dimers, aminoglycoside-acridine conjugates, and guanidinoglycosides, a drawback of these approaches is that the molecules become larger and are thus less useful as drug leads.

1.7 Conclusion

Knowledge of RNA-small molecule recognition is important for the design and synthesis of specific RNA binders. Thus far, it has been demonstrated that ligands generally bind to RNA by taking advantage of specific RNA structures, such as internal loops and base bulges, in which the deep and narrow major groove has become accessible for binding interactions. The recognition of RNA targets by their natural ligands with high specificity involves a multitude of functional groups. Furthermore, ligands may exploit regions of flexibility in the RNA, in which the binding site can adapt for specific and tight binding by a small molecule. Because proteins appear to use similar rules for binding RNA, the natural protein binding sites may be desirable targets for drug design. Both natural products and designed ligands can be used to exploit structure-specific recognition in the same way that proteins or other biological effectors (e.g., antibiotics) bind to RNA. Thus, knowledge of the structural basis of these RNA-ligand interactions will ultimately provide an impetus for rational drug discovery.

1.8 References

1. Bloomfield, V. A.; Crothers, D. M.; Tinoco, I. Jr. Nucleic Acids: Structures, Properties, and Functions; University Science Books: Sausalito, 2000.
2. Nierhaus, K. H.; Wilson, D. N. Wiley-VCH: Weinheim, Germany 2004.
3. Harvey, I.; Garneau, P.; Pelletier, J. Inhibition of translation by RNA-small molecule interactions. RNA 2002, 8, 452-463.
4. Hermann, T. Strategies for the design of drugs targeting RNA and RNA-protein complexes. Angew. Chem. Int. Ed. 2000, 39, 1890-1905.
5. Sucheck, S. J.; Wong, C. RNA as a target for small molecules. Curr. Opin. Chem. Biol. 2000, 4, 678-686.
6. Hall, K. B. RNA-protein interactions. Curr. Opin. Struct. Biol. 2002, 12, 283-288.
7. Wimberly, B. T.; Brodersen, D. E.; Clemmons, W. M.; Morgan-Warren, R. J.; Carter, A. P.; Vonrhein, C.; Hartsch, T.; Ramakrishnan, V. Structure of the 30S ribosomal subunit. Nature 2000, 407, 327-339.
8. Cate, J. H.; Gooding, A. R.; Podell, E.; Zhou, K.; Golden, B. L.; Kundrot, C. E.; Cech, T. R.; Doudna, J. A. Crystal structure of a group I ribozyme domain: principles of RNA packing. Science 1996, 273, 1678-1685.
9. Pearson, N. D.; Prescott, C. D. RNA as a drug target. Chem. and Biol. 1997, 4, 409-414.
10. Hermann, T. Drugs targeting the ribosome. Curr. Opin. Struct. Biol. 2005, 15, 355-366.

11. Foloppe, N.; Matassova, N.; Aboul-ela, F. Towards the discovery of drug-like RNA ligands? Drug Discovery Today 2006, 11, 1019-1027.
12. Sucheck, S. J.; Greenberg, W. A.; Tolbert, T. J.; Wong, W.-H. Design of small molecules that recognize RNA: development of aminoglycosides as potential antitumor agents that target oncogenic RNA sequences. Angew. Chem. Int. Ed. 2000, 39, 1080-1083.
13. Thomas, J. R.; Hergenrother, P. J. Targeting RNA with small molecules. Chem. Rev. 2008, 108, 1171-1224.
14. Graves, P. R.; Kwiek, J. J.; Fadden, P.; Ray, R.; Hardeman, K.; Coley, A. M.; Foley, M.; Haystead, T. A. J. Discovery of novel antimalarial targets in the human purine binding proteome. Mol. Pharm. 2002, 62, 1364-1372.
15. Batey, R. T.; Rambo, R. P.; Doudna, J. A. Tertiary motifs in RNA structure and folding. Angew. Chem. Int. Ed. 1999, 38, 2326-2343.
16. Holbrook, S. R. Structural principles from large RNAs. Annu. Rev. Biophys. 2008, 37, 445-464.
17. Moore, P. B. Structural motifs in RNA. Annu. Rev. Biochem. 1999, 68, 287-300.
18. Leontis, N. B.; Westhof, E. Analysis of RNA motifs. Curr. Opin. Struct. Biol. 2003, 13, 300-308.
19. Weeks, K. M.; Crothers, D. M. Major groove accessibility of RNA. Science 1993, 261, 1574-1577.
20. Chow, C. S.; Bogdan, F. M. A. Structural basis for RNA-ligand interactions. Chem. Rev. 1997, 97, 1489-1513.
21. Oldstone, M. B. A. Viruses, Plagues, and History; Oxford Univ. Press: Oxford, 1988.


22. Wilson, W. D.; Li, K. Targeting RNA with small molecules. Curr. Med. Chem. 2000, 7, 73-98.
23. Mei, H.-Y.; Cui, M.; Heldsinger, A.; Lemrow, S. M.; Loo, J. A.; Sannes-Lowery, K. A.; Sharmeen, L.; Czarnik, A. W. Inhibitors of protein-RNA complexation that target the RNA: specific recognition of human immunodeficiency virus type 1 TAR RNA by small organic molecules. Biochemistry 1998, 37, 14204-14212.
24. Faber, C.; Sticht, H.; Schweimer, K.; Rosch, P. Structural rearrangements of HIV-1 Tat-responsive RNA upon binding of Neomycin B. J. Biol. Chem. 2000, 275, 20660-20666.
25. Jin, E.; Katritch, V.; Olson, W. O.; Kharatisvili, M.; Abagyan, R.; Pilch, D. S. Aminoglycoside binding in the major groove of duplex RNA: The thermodynamic and electrostatic forces that govern recognition. J. Mol. Biol. 2000, 298, 95-110.
26. Zhao, M.; Janda, L.; Nguyen, J.; Strekowski, L.; Wilson, W. D. The interaction of substituted 2-phenylquinoline intercalators with poly(A)-poly(U): classical and threading intercalation modes with RNA. Biopolymers 1994, 34, 61-73.
27. Zhao, M.; Ratmeyer, L.; Peloquin, R. G.; Yao, S.; Kumar, A.; Spychala, J.; Boykin, D. W.; Wilson, W. D. Small changes in cationic substituents of diphenylfuran derivatives have major effects on the binding affinity and the binding mode with RNA helical duplex. Bioorg. Med. Chem. 1995, 3, 785-794.
28. Gooch, B. D.; Beal, P. A. Recognition of duplex RNA by helix-threading peptides. J. Am. Chem. Soc. 2004, 126, 10603-10610.
29. Varani, G.; Cheong, C.; Tinoco, I. Jr. Structure of an unusually stable RNA hairpin. Biochemistry 1991, 30, 3280-3289.


30. Varani, G. Exceptionally stable nucleic acid hairpins. Annu. Rev. Biophys. Biomol. Struct. 1995, 24, 379.
31. Mei, H. Y.; Cui, M.; Heldsinger, A.; Lemrow, S. M.; Loo, J. A.; Sannes-Lowery, K. A.; Sharmeen, L.; Czarnik, A. W. Inhibitors of protein-RNA complexation that target the RNA: specific recognition of human immunodeficiency virus type 1 TAR RNA by small organic molecules. Biochemistry 1998, 37, 14204-14212.
32. Gayle, A. Y.; Baranger, A. M. Inhibition of the U1A-RNA complex by an aminoacridine derivative. Bioorg. Med. Chem. 2002, 12, 2839-2842.
33. Yan, Z.; Baranger, A. M. Binding of an aminoacridine derivative to a GAAA RNA tetraloop. Bioorg. Med. Chem. Lett. 2004, 14, 5889-5893.
34. Yan, Z.; Sikri, S.; Beveridge, D. L.; Baranger, A. M. Identification of an aminoacridine derivative that binds to RNA tetraloops. J. Med. Chem. 2007, 50, 4096-4104.
35. Warui, D. M.; Baranger, A. M. Identification of specific small molecule ligands for stem loop 3 ribonucleic acid of the packaging signal Ψ of human immunodeficiency virus-1. J. Med. Chem. 2009, 52, 5462-5473.
36. Liu, X.; Thomas, J. R.; Hergenrother, P. J. Deoxystreptamine dimers bind to RNA hairpin loops. J. Am. Chem. Soc. 2004, 126, 9196-9197.
37. Thomas, J. R.; Liu, X.; Hergenrother, P. J. Size-specific ligands for RNA hairpin loops. J. Am. Chem. Soc. 2005, 127, 12434-12435.
38. Sucheck, S. J.; Wong, A. L.; Koeller, K. M.; Boeher, D. D.; Draker, K-A.; Sears, P.; Wright, G. D.; Wong, C.-H. Design of bifunctional antibiotics that target bacterial rRNA and inhibit resistance-causing enzymes. J. Am. Chem. Soc. 2000, 122, 5230-5231.


39. Wu, B.; Yang, J.; Robinson, D.; Hofstadler, S.; Griffey, R.; Swayze, E. E.; He, Y. Synthesis of linked carbohydrates and evaluation of their binding for 16S RNA by mass spectrometry. Bioorg. Med. Chem. Lett. 2003, 13, 3915-3918.
40. Haddad, J.; Kotra, L. P.; Llano-Sotelo, B.; Kim, C.; Azucena, E. F., Jr.; Liu, M.; Vakulenko, S. B.; Chow, C. S.; Mobashery, S. Design of novel antibiotics that bind to the ribosomal acyltransfer site. J. Am. Chem. Soc. 2002, 124, 3229-3237.
41. Ding, Y.; Hofstadler, S. A.; Swayze, E. E.; Griffey, R. H. An efficient synthesis of mimetics of neamine for RNA recognition. Org. Lett. 2001, 3, 1621-1623.
42. Ratmeyer, L.; Zapp, M. L.; Green, M. R.; Vinayak, R.; Kumar, A.; Boykin, D. W.; Wilson, W. D. Inhibition of HIV-1 Rev-RRE interaction by diphenylfuran derivatives. Biochemistry 1996, 35, 13689-13696.
43. Luedtke, N. W.; Baker, T. J.; Goodman, M.; Tor, Y. Guanidinoglycosides: a novel family of RNA ligands. J. Am. Chem. Soc. 2000, 122, 12035-12036.
44. Kirk, S. R.; Luedtke, N. W.; Tor, Y. Neomycin-acridine conjugate: a potent inhibitor of Rev-RRE binding. J. Am. Chem. Soc. 2000, 122, 980-981.
45. Cho, J.; Rando, R. R. Specific binding of Hoechst 33258 to site 1 thymidylate synthase mRNA. Nucleic Acids Res. 2000, 28, 2158-2163.
46. Znosko, B. M.; Silvestri, S. B.; Volkman, H.; Boswell, B.; Serra, M. J. Thermodynamic parameters for an expanded nearest-neighbor model for the formation of RNA duplexes with single nucleotide bulges. Biochemistry 2002, 41, 10406-10417.
47. Mathews, D. H.; Sabina, J.; Zuker, M.; Turner, D. H. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol. 1999, 288, 911-940.

48. Weeks, K. M.; Crothers, D. M. RNA recognition by Tat-derived peptides: interaction in the major groove. Cell 1991, 66, 577-588.
49. Bailly, C.; Colson, P.; Houssier, C.; Hamy, F. The binding mode of drugs to the TAR RNA of HIV-1 studied by electric linear dichroism. Nucleic Acids Res. 1996, 24, 1460-1464.
50. Hermann, T. Chemical and functional diversity of small molecule ligands for RNA. Biopolymers 2003, 70, 4-18.
51. Means, J.; Katz, S.; Nayek, A.; Anupam, R.; Hines, J. V.; Bergmeier, S. C. Structure-activity studies of oxazolidinone analogs as RNA-binding agents. Bioorg. Med. Chem. Lett. 2006, 16, 3600-3604.
52. Tibodeau, J. D.; Fox, P. M.; Ropp, P. A.; Theil, E. C.; Thorp, H. H. Up-regulation of ferritin expression using a small molecule ligand to the native mRNA. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 253-257.
53. Zimmermann, G. R.; Jenison, R. D.; Wick, C. L.; Simorre, J.-P.; Pardi, A. Interlocking structural motifs mediate molecular discrimination by a theophylline-binding RNA. Nat. Struct. Biol. 1997, 4, 644-649.
54. Nix, J.; Sussman, D.; Wilson, C. The 1.3 angstrom crystal structure of a biotin-binding pseudoknot and the basis for RNA molecular recognition. J. Mol. Biol. 2000, 296, 1235-1244.
55. Yang, Y.; Kochayan, M.; Burgstaller, P.; Westhof, E.; Famulok, M. Structural basis of ligand discrimination by two related RNA aptamers resolved by NMR spectroscopy. Science 1996, 272, 1343-1347.
56. Dieckmann, T.; Suzuki, E.; Nakamura, G. K.; Feigon, J. Solution structure of an ATP-binding RNA aptamer reveals a novel fold. RNA 1996, 2, 628-640.


57. Takahashi, T.; Hamasaki, K.; Kumagai, I.; Ueno, A.; Mihara, H. Design of a nucleobase-conjugated peptide that recognizes HIV-1 RRE IIB RNA with high affinity and specificity. Chem. Commun. 2000, 349-350.
58. Lee, J.; Kwon, M.; Lee, K. H.; Jeong, S.; Hyun, S.; Shin, K. J.; Yu, J. An approach to enhance specificity against RNA targets using heteroconjugates of aminoglycosides and chloramphenicol (or linezolid). J. Am. Chem. Soc. 2004, 126, 1956-1957.
59. Tok, J. B.-H.; Dunn, L. J.; Jean, R. C. D. Binding of dimeric aminoglycoside to the HIV-1 Rev responsive element RRE RNA construct. Bioorg. Med. Chem. Lett. 2001, 11, 1127-1131.
60. Luedtke, N. W.; Liu, Q.; Tor, Y. RNA-ligand interaction: affinity and specificity of aminoglycoside dimers and acridine conjugates to the HIV-1 Rev responsive element. Biochemistry 2003, 42, 11391-11403.


CHAPTER 2

Cooperative Binding of a Quinoline Derivative to an RNA Hairpin Loop Containing a Dangling End

2.1 Introduction

As discussed in the introduction, RNA plays essential roles in gene replication, transcription, and translation, which direct protein synthesis in living organisms.1-3 Therefore, small molecules that bind to RNA and affect any of these biological processes would be of great utility.4-7 RNA hairpins form target sites for both proteins and other RNAs and are the most abundant form of secondary structure after the helix.8-15 However, few studies have investigated small molecule binding to RNA hairpins.7 Among the different types of hairpins, tetraloops (four-nucleotide stem loop structures) are abundant in ribosomal RNA. Studies have shown that 55% of the hairpins in 16S rRNA are tetraloops, whereas pentaloops account for only 13% of the total loops, and a similar pattern was observed in 23S rRNA. The GNRA, UNCG, (U/A)GNN, GGNG, and CUUG sequences (N stands for any nucleotide, R for purine, and Y for pyrimidine) are conserved among the possible tetraloop sequences.17-18 GNRA and UNCG are the most common tetraloops in ribosomal RNA and have unusual thermodynamic stabilities.19-20 GNRA tetraloops are known to play important roles in maintaining tertiary contacts in ribosomal RNA,9, 10 catalytic RNA,8, 11, 13, 21 and the viral internal ribosome entry site (IRES).22, 23 These RNAs are also known to interact with proteins such as SRP proteins, ribosomal proteins, elongation factors, and the sarcin and ricin toxins.24-27 The stability of these hairpin loops depends on the closing base pairs and the dangling end regions.28 The dangling end is a non-helical structure present at the junction of double-stranded and single-stranded RNA. Thermodynamic studies investigating the stabilization of duplex RNA structures by dangling ends have shown that 3’-dangling ends have a larger stabilizing effect on RNA than 5’-dangling ends.29-34 Dangling ends also play important roles in biological functions other than stabilizing RNA structures. For example, the dangling nucleotides (5’ACCA3’) present at the 3’-terminal end of tRNA stabilize the cloverleaf structure of tRNA and the interaction between mRNA and tRNA adjacent to the codon-anticodon pair.35 Recent studies showed that 2-3 nt dangling ends are important for RNAi function.36-39 In our previous studies, we used virtual screening of the NCI diversity database to identify a small organic molecule, QD1, that selectively recognizes the 3’-dangling end of an RNA hairpin.40 Here, we report studies of QD1 binding to tetraloops by isothermal titration calorimetry (ITC), the identification of QD2 by virtual screening of the entire NCI database (250,000 compounds) for QD1-like molecules, and the cooperative binding of QD2 to an RNA hairpin containing a dangling end.

2.2 Results and discussion

2.2.1 Selective recognition of the 3’-dangling end of a hairpin by QD1

Dr. Zhaohui Yan, a previous member of the Baranger group, screened the NCI diversity set of 1990 compounds to identify small molecules that bind to RNA tetraloops. This screening identified a set of quinoline and acridine derivatives that bind to RNA tetraloops. The binding of aminoacridine derivatives (AD1 and AD2) and a quinoline derivative (QD1) to GNRA RNA tetraloops was characterized.40-42 The structure of the GNRA tetraloop was previously solved by NMR spectroscopy, and the NMR and secondary structures are shown in Figure 2-1.43 Dr. Yan performed NMR diffusion experiments to determine the binding affinity of QD1 for RNA hairpins containing varied closing base pair and dangling end sequences. I confirmed the NMR results using isothermal titration calorimetry (ITC). The results of these experiments suggested that QD1 recognizes the RNA tetraloop on the basis of the terminal base pair and the dangling end (Figure 2-2).

Figure 2-1. NMR structure of the GCAA tetraloop (PDB code: 1ZIH)30 and the secondary structure of the tetraloop.

Figure 2-2. QD1 recognizes the shaded region shown in the GCAA tetraloop.

2.2.2 Isothermal titration calorimetry assays

The recognition of three different RNA hairpins (Figure 2-3) by QD1 was studied using an isothermal titration calorimetry assay, and the results were compared to those of the NMR diffusion experiments. The advantages of ITC are that the macromolecule does not need to be labeled, the interaction can be determined directly by measuring the heat evolved or absorbed upon binding in solution, and the thermodynamic parameters for different binding events can be determined. The data were fit using a sequential two-site binding model (Figure 2-4). The first binding constants were within error of those determined by NMR diffusion methods. The second binding constants were more than tenfold larger than the first and may result from aggregation of QD1, which we observed at higher concentrations in the NMR experiments (Table 2-1).
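To make the sequential two-site model concrete, the sketch below computes the average number of ligands bound per RNA from the binding polynomial. It illustrates the occupancy part of the model only; an ITC fit additionally assigns an enthalpy to each step and works from the measured heats. The stepwise association constants used here are hypothetical values chosen for illustration and are not the fitted values from Table 2-1.

```python
def sequential_two_site_bound(free_ligand, k1, k2):
    """Average number of ligands bound per RNA in a sequential two-site model.

    free_ligand: free ligand concentration (M)
    k1, k2: stepwise association constants (1/M) for the first and second ligand
    """
    one_bound = k1 * free_ligand
    two_bound = k1 * k2 * free_ligand ** 2
    partition = 1.0 + one_bound + two_bound  # binding polynomial
    return (one_bound + 2.0 * two_bound) / partition

# Hypothetical stepwise constants, chosen only for illustration
K1_EXAMPLE = 1.0e5  # 1/M
K2_EXAMPLE = 1.0e4  # 1/M

for conc in (1e-6, 1e-5, 1e-4, 1e-3):
    bound = sequential_two_site_bound(conc, K1_EXAMPLE, K2_EXAMPLE)
    print(f"[L] = {conc:.0e} M -> {bound:.3f} ligands bound per RNA")
```

The occupancy rises from zero toward two ligands per RNA as the free ligand concentration increases, with the two steps separated when the stepwise constants differ.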

Figure 2-3. Secondary structures of RNA hairpins used for experimental studies: AU(UAU), CG(UAU), and GU(UAU).

Figure 2-4. ITC profiles for the binding of QD1 to (a) AU(UAU), (b) CG(UAU), and (c) GU(UAU) RNA hairpins. Plots of data are from a representative ITC experiment upon titration of 50 µM of each RNA hairpin with QD1 (5 mM) in 10 mM sodium phosphate buffer (pH 6, 50 mM NaCl). The heat of ligand binding for each injection was determined by subtracting the heat of ligand solvation from that of the ligand-RNA injection to yield the heat due solely to ligand binding. Binding constants were determined from plots of the heat of ligand binding as a function of RNA-ligand molar ratio. The data were fit with a sequential two-site binding model.

Table 2-1. Comparison of binding constants from NMR and ITC experiments.

2.2.3 Virtual screening of the NCI database

To identify molecules related to QD1 (Figure 2-6) that bind RNA tetraloops, we performed a computational substructure and/or 3D search of the NCI database of 250,000 compounds. Out of the first 250 quinoline hits, forty molecules containing a quinoline substructure, with a molecular weight of more than 200 and different chemical scaffolds, were selected. Of these forty molecules, 20 were selected for computational docking on the basis of water solubility and molecular weight predicted using MOE.44 The program AutoDock 3 was used to dock these 20 molecules to the GNRA tetraloop shown in Figure 2-1. The steps followed in screening the NCI database are shown in Figure 2-5. AutoDock has been validated by several investigators as an effective tool for identifying small molecule ligands for RNA.45-47 This program uses a scoring function that includes van der Waals, electrostatic, desolvation, hydrogen bonding, and ligand torsional energies.48 We have previously used AutoDock in combination with Dock to identify ligands for the GNRA tetraloop and stem loop 3 of the Ψ RNA of HIV.41-42, 49 The docking site was defined using the AutoGrid program with a grid box of 40.0 Å × 30.0 Å × 34.0 Å (x, y, z) centered at 3.429 Å (x), -2.135 Å (y), and 5.836 Å (z). The ten quinoline derivatives with a predicted binding energy of better than 9 kcal/mol were selected for experimental studies. These 10 quinoline derivatives were obtained from the NCI, and their purity was evaluated using NMR and mass spectrometry. The NMR and mass spectra of two of the 10 molecules were not consistent with the reported structures, and thus these molecules were not investigated further. The structures of the small molecules studied experimentally are shown in Figure 2-6. The compounds highlighted in red could not be confirmed by NMR and mass spectrometry. The affinities of the remaining 8 molecules for GCAA RNA were determined using fluorescence spectroscopy.
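The extent of the docking region follows directly from the grid box center and dimensions given above. The sketch below is a small illustration, assuming that the quoted dimensions are full edge lengths of an axis-aligned box; the helper names are hypothetical and are not part of AutoGrid.

```python
# Grid box parameters quoted in the text (Å)
CENTER = (3.429, -2.135, 5.836)
SIZE = (40.0, 30.0, 34.0)  # assumed to be full edge lengths along x, y, z

def box_bounds(center, size):
    """Return (min, max) along each axis for an axis-aligned box."""
    return [(c - s / 2.0, c + s / 2.0) for c, s in zip(center, size)]

def is_inside(point, bounds):
    """True if a 3D point lies within the box on every axis."""
    return all(lo <= p <= hi for p, (lo, hi) in zip(point, bounds))

bounds = box_bounds(CENTER, SIZE)
for axis, (lo, hi) in zip("xyz", bounds):
    print(f"{axis}: {lo:.3f} Å to {hi:.3f} Å")

# Example check for a hypothetical atom position
print(is_inside((0.0, 0.0, 0.0), bounds))
```

A check of this kind can be used to confirm that the atoms of the intended binding region fall inside the box before docking is run.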


Figure 2-5. Flowchart representing the steps followed in screening of NCI database.

Figure 2-6. List of compounds from NCI studied experimentally. The compounds highlighted in red could not be confirmed by NMR and mass spectrometry. The one shown in blue was the best hit and was evaluated further (QD2).

2.2.4 Screening of molecules by fluorescence assays

The binding affinity of the selected molecules for the GCAA tetraloop was measured by fluorescence titration experiments. The sequence of the GNRA tetraloop used in these experiments is shown in Figure 2-7. The change in the fluorescence signal was measured upon titration of 5’-fluorescein-labeled GCAA tetraloop RNA with increasing concentrations of small molecule. For each experiment, a control was performed by titrating fluorescein with the small molecule to make sure that the small molecule did not interfere with the fluorescein fluorophore. None of the molecules altered the fluorescence signal of fluorescein in the control experiments. The molecule NSC5920 precipitated above 10 µM, and NSC3618 showed fluctuations in the fluorescence signal. As a result, approximate binding constants of these two molecules for tGCAA RNA were not determined. The fluorescence screening results are summarized in Table 2-2. Interestingly, one of the six molecules, NSC5485 (QD2), was found to bind with better affinity than QD1.

Table 2-2. Summary of fluorescence screening assay results.

NCI compound    Fluorescence KD (µM)
NSC3618         Fluctuations in fluorescence intensity
NSC3870         > 100
NSC3616         > 90
NSC2455         > 150
NSC5920         Starts precipitating at 10 µM
NSC1010         > 60
NSC5491         > 50
NSC5485         8

The AutoDock results for all of the molecules shown above gave similar predicted binding energies, but only QD2 was found to bind better than the other molecules. The QD2 binding curve is ‘S’ shaped and could not be fit well with a one- or two-site binding model. QD2 was evaluated further because of its high binding affinity and the shape of its binding curve. The compound is sparingly soluble in water-DMSO (75 µM in 3.5% v/v DMSO), and a maximum concentration of 50 µM in 3.5% v/v DMSO was used in our experimental studies. The effect of DMSO alone and with QD2 on the fluorescence signal is shown in Figure 2-7. DMSO increased the fluorescein signal, whereas QD2 in DMSO quenched the signal. These results indicate that DMSO did not contribute to the quenching of the fluorescein signal upon addition of QD2.

Figure 2-7. (a) Titration of 5’-fluorescein-labeled RNA (100 nM) with increasing amounts of DMSO alone (red) and QD2 (NSC5485) in DMSO (blue). (b) The sequence of the GNRA tetraloop (FL-GCAA; labeled FL-tGCAA in the figure) used for the fluorescence experiments.

A typical fluorescence quenching spectrum with increasing concentrations of QD2 is shown in Figure 2-8. The binding curves obtained are ‘S’ shaped. Such sigmoidal binding curves are usually obtained when there are cooperative interactions between small molecules and identical interacting sites. The binding curve (Figure 2-9) was fit with Eq. 1 to determine the association constant, and the quality of the fit supports the interaction of four small molecules with one RNA target.

\frac{F - F_0}{F_f - F_0} = \frac{K L^{4}}{1 + K L^{4}} \qquad (1)

F is the fluorescence intensity of the sample, F0 is the initial fluorescence intensity, Ff is the final fluorescence intensity, K is the association constant, and L is the ligand concentration. A dissociation constant of 8.2 ± 0.4 µM for the complex formed between QD2 and the GCAA tetraloop was obtained from an average of three titrations.
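The sigmoidal shape produced by Eq. 1 can be seen by evaluating it at a few ligand concentrations. The sketch below is a minimal illustration, not the fitting procedure used in this work. It assumes that the reported 8.2 µM dissociation constant is the ligand concentration at half-maximal quenching, so that K = 1/KD^4; the text does not state how the dissociation constant was derived from K, and the function names are hypothetical.

```python
def fraction_quenched(ligand_uM, k_assoc):
    """Eq. 1: (F - F0)/(Ff - F0) = K L^4 / (1 + K L^4), with L in µM and K in µM^-4."""
    x = k_assoc * ligand_uM ** 4
    return x / (1.0 + x)

def half_saturation(k_assoc):
    """Ligand concentration (µM) at which Eq. 1 equals 0.5, i.e. K^(-1/4)."""
    return k_assoc ** (-0.25)

KD_APPARENT_UM = 8.2                  # value reported in the text
K_ASSOC = 1.0 / KD_APPARENT_UM ** 4   # assumption: K = 1/KD^4

for conc in (2.0, 4.0, 8.2, 16.0, 32.0):
    print(f"[QD2] = {conc:5.1f} µM -> fraction quenched = {fraction_quenched(conc, K_ASSOC):.3f}")
```

Because the ligand concentration enters as the fourth power, the signal changes little at low concentrations and then rises steeply around the midpoint, which is the behavior that a one- or two-site hyperbolic model cannot reproduce.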

Figure 2-8. Fluorescence spectra for 5’-fluorescein-labeled GCAA RNA titrated with increasing concentrations of QD2. Measurements were performed with an RNA concentration of 100 nM and QD2 concentrations ranging from 0 to 21 µM in 10 mM sodium phosphate buffer (pH=6, 50 mM NaCl). The intensity of the fluorescence signal of fluorescein decreased upon addition of QD2.

Figure 2-9. Plots of the fraction of RNA fluorescence signal quenched versus QD2 concentration in the absence (left) and presence (right) of 0.01% Triton X-100, assuming a 1:4 binding stoichiometry. The fraction of RNA bound was calculated using the fluorescence emission at 520 nm with excitation at 490 nm. The data were fit to equation 1. The experiments were performed with 100 nM GCAA RNA and 0 to 34 µM QD2 in 10 mM sodium phosphate buffer (pH=6, 50 mM NaCl).

False positive hits from computational docking may arise for compounds that aggregate, because the aggregates rather than the isolated molecules can be responsible for the observed binding.50-51 Because QD2 is sparingly soluble in the water-DMSO mixture (75 µM in 3.5% v/v DMSO), we performed experiments in 0.01% Triton X-100 to confirm that the observed binding was not due to aggregates. The effect of Triton X-100 on the aggregation of small molecules has been studied by Shoichet and co-workers.51 Concentrations of 0.01% are typically sufficient to prevent aggregation. The dissociation constant obtained in the presence of Triton X-100 is 12 ± 0.5 µM, which is similar to the value measured in the absence of Triton X-100. These results suggest that aggregates of QD2 are not responsible for the observed binding. The binding curves in the presence and absence of 0.01% Triton X-100 are shown in Figure 2-9.

2.2.5 Stoichiometry of binding

The unusual stoichiometry of binding was further investigated using a fluorescence assay. Because the compound precipitates at higher concentrations, a reverse titration was used to determine the stoichiometry of the complex formed between QD2 and the GCAA tetraloop. For these experiments, the QD2 concentration must be significantly above the KD. Therefore, the experiments were performed with 25 µM QD2, which is approximately threefold higher than the dissociation constant of QD2 for the GCAA RNA tetraloop. The stoichiometry was determined by adding increasing concentrations of the GCAA tetraloop to a constant concentration of QD2 (25 µM in 4.6% v/v DMSO in 10 mM sodium phosphate buffer, pH=6). The stoichiometry of binding was found to be 1:4, and an example of a stoichiometry experiment is shown in Figure 2-10. This result agrees with the number of binding sites determined from the fluorescence binding experiment.


Figure 2-10. A plot of the fraction of RNA bound vs molar ratio ([RNA]/[ligand]) that was used to determine the stoichiometry of the QD2-GCAA RNA complex. The QD2 (25 µM in 3.5% v/v DMSO) was titrated with GCAA RNA in 10 mM sodium phosphate buffer (pH=6, 50 mM NaCl). The excitation and emission wavelengths were 310 nm and 369 nm, respectively.

2.2.6 Isothermal titration calorimetry experiment

To confirm the binding constants and the stoichiometry of binding, experiments were performed using isothermal titration calorimetry (ITC). The binding constants were obtained from a reverse titration in which the ligand was titrated with the GCAA RNA tetraloop. An example of the ITC experiments is shown in Figure 2-11. The binding curve obtained was fit with a sequential four-site binding model. The binding constants obtained from ITC and from the fluorescence binding assay are compared in Table 2-3. The data obtained using ITC are consistent with the fluorescence data and support the presence of four identical interacting sites. Cooperativity is a common phenomenon in which two or more otherwise independent processes are thermodynamically coupled, and it generally involves a change in molecular conformation.52 The absence of biphasic behavior in the binding curve indicates cooperativity between the four ligand binding sites, although this is not apparent from the calorimetric data alone.53-54
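The molar ratio reached after each injection in the reverse titration can be estimated from the volumes and concentrations given in the Figure 2-11 caption. The sketch below is a simplified calculation that assumes a fixed amount of QD2 in the cell and ignores the volume displaced by each injection; the function names are hypothetical.

```python
# Values taken from the Figure 2-11 caption
CELL_VOLUME_L = 1.42e-3       # sample cell volume
LIGAND_CONC_M = 20e-6         # QD2 in the cell
INJECTION_VOLUME_L = 10e-6    # volume of each RNA injection
RNA_STOCK_M = 200e-6          # RNA in the syringe

def molar_ratio(n_injections):
    """[RNA]/[ligand] after n injections, ignoring displaced volume (simplifying assumption)."""
    rna_mol = n_injections * INJECTION_VOLUME_L * RNA_STOCK_M
    ligand_mol = CELL_VOLUME_L * LIGAND_CONC_M
    return rna_mol / ligand_mol

def injections_to_reach(ratio):
    """Number of injections (possibly fractional) needed to reach a given molar ratio."""
    return ratio / molar_ratio(1)

print(f"ratio per injection: {molar_ratio(1):.4f}")
# A 1:4 RNA:ligand complex corresponds to a molar ratio [RNA]/[ligand] of 0.25
print(f"injections to reach 0.25: {injections_to_reach(0.25):.2f}")
```

Under these simplifying assumptions each injection adds about 0.07 equivalents of RNA, so the 1:4 equivalence point falls within the first few injections.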

Figure 2-11. Plots of data from a reverse titration ITC experiment. QD2 (20 µM in 3.5% v/v DMSO) in 10 mM sodium phosphate buffer (pH=6, 50 mM NaCl) was titrated with increasing amounts of 200 µM RNA at 25 °C. A standard experiment consisted of titrating 20 µM QD2 (1.42 mL in the sample cell) with 10 µL injections of 200 µM RNA solution. A control experiment was performed that involved addition of aliquots (10 µL) of QD2 (20 µM) into buffer alone. The heat of ligand binding for each injection was determined by subtracting the heat of ligand solvation from that of the ligand-RNA injection to yield the heat due solely to ligand binding. Binding constants were determined from plots of the heat of ligand binding as a function of RNA-ligand molar ratio. The data were fit with a sequential four-site binding model.

Table 2-3. Comparison of binding constants for the binding of QD2 with RNA hairpin from fluorescence and ITC experiments.

Binding constant (µM)    Fluorescence    Fluorescence (0.01% Triton)    ITC
K1                       8.2 ± 0.4       12.0 ± 0.5                     10.8 ± 1.2
K2                       8.2 ± 0.4       12.0 ± 0.5                     9.2 ± 2.4
K3                       8.2 ± 0.4       12.0 ± 0.5                     8.1 ± 1.8
K4                       8.2 ± 0.4       12.0 ± 0.5                     12.5 ± 4.2

2.2.7 CD spectroscopy

A conformational change in the GCAA tetraloop was expected upon binding to four QD2 ligands and was confirmed by CD spectroscopy. These experiments were performed by titrating the GCAA tetraloop with QD2 and monitoring the CD spectrum from 240 nm to 320 nm. The DMSO present in the ligand buffer interferes with the signal below 240 nm. As shown in Figure 2-12, there is a large change in the conformation of the tetraloop upon ligand binding. The change in conformation allows the GCAA tetraloop to bind four ligands by creating new binding sites.

Figure 2-12. Circular dichroism spectra of GCAA (15 µM in 400 µL of buffer) titrated with increasing concentrations of QD2 (7.5 µM to 50 µM in 3.5% v/v DMSO) in 10 mM sodium phosphate buffer (pH=6, 50 mM NaCl). GCAA RNA alone is shown in pink.

2.2.8 UV reverse titration experiment

In a UV titration experiment, a change in the λmax of the chromophore is observed when a small molecule binds to a nucleic acid. Because DMSO interferes with the absorption of RNA, a reverse titration was performed. QD2 (20 µM) in buffer (1.65% v/v DMSO) was titrated with increasing concentrations of RNA (6.4 µM to 54 µM). The UV reverse titration experiments showed a shift of the absorption maximum of QD2 from 329 nm to 335 nm (a bathochromic shift of 6 nm) and a large hypochromism of 48.1% (0.316 to 0.152). The large hypochromism results from perturbation of the complexed chromophore system upon binding to RNA. The titration experiment is shown in Figure 2-13.
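The spectral changes quoted above follow from simple arithmetic on the reported values. The sketch below is only an illustration with hypothetical function names. It shows that the reported 48.1% equals the ratio of final to initial absorbance (0.152/0.316), whereas the fractional decrease computed as (A0 - A)/A0 comes to about 51.9%.

```python
def bathochromic_shift_nm(lambda_free_nm, lambda_bound_nm):
    """Red shift of the absorption maximum upon binding, in nm."""
    return lambda_bound_nm - lambda_free_nm


def absorbance_ratio_percent(a_initial, a_final):
    """Final absorbance expressed as a percentage of the initial absorbance."""
    return 100.0 * a_final / a_initial


def fractional_decrease_percent(a_initial, a_final):
    """Decrease in absorbance as a percentage of the initial absorbance."""
    return 100.0 * (a_initial - a_final) / a_initial


if __name__ == "__main__":
    # Values reported for the QD2 reverse titration.
    print(bathochromic_shift_nm(329, 335))
    print(round(absorbance_ratio_percent(0.316, 0.152), 1))
    print(round(fractional_decrease_percent(0.316, 0.152), 1))
```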

Figure 2-13. The UV reverse titration experiment. QD2 (20 µM, purple) in 10 mM sodium phosphate buffer (pH=6, 50 mM NaCl), (1.65% v/v DMSO) was titrated with increasing concentrations of RNA (6.4 µM to 54 µM).

2.2.9 Binding specificity of QD2 molecule

The binding specificity of QD2 was investigated by studying its binding to single-stranded, duplex, GCAA tetraloop and SL2 RNAs. Stem loop 2 (SL2) RNA has a stable loop structure that binds to the spliceosomal protein component U1A, and the GCAA tetraloop lacks the UAU dangling-end sequence present in the tGCAA RNA. The sequences of the RNAs used in the experiment are shown in Figure 2-14. The binding studies were performed by fluorescence assay with 5’-fluorescein-labeled RNAs. Stem loop 2 RNA was selected to study the specificity of QD2 for different loops, and the tetraloop RNA was selected to check the binding affinity in the absence of the dangling end. The dissociation constants of QD2 with these RNAs did not differ much from that with the actual target (Table 2-4). These results suggest that QD2 does not bind RNA specifically and that its affinity for single-stranded RNA is higher than for duplex RNA.

Figure 2-14. The sequences of RNA used for the specificity assessment studies by fluorescence assays. All the RNA sequences were labeled with fluorescein at the 5’-end.

Table 2-4. The equilibrium dissociation constants for QD2 binding to different RNA targets.

RNA: Single strand, FL-tGCAA, GCAA, Duplex, SL2

Without Triton X-100 (µM): 4.15 ± 0.2, 8.2 ± 0.4, 7.5 ± 0.7, 6.4 ± 0.4, 9.2 ± 0.2

With Triton X-100 (0.01% v/v) (µM): 10.3 ± 0.49, 12.0 ± 0.5, 16.6 ± 0.29, 18.3 ± 1.2

2.3 Conclusion

The results indicate an unusual 1:4 stoichiometry of binding between the GCAA tetraloop and QD2 (NSC5485). Cooperative binding of dimerized molecules to the RNA may be the reason for this unusual stoichiometry. The compound is not specific for the GCAA tetraloop and binds with higher affinity to single-stranded RNA than to other RNA structures. The interaction between QD2 molecules is probably governed by stacking, and the interaction with the GCAA tetraloop by both stacking and electrostatic interactions. The cooperativity might be the reason for the greater affinity of QD2 for the GCAA tetraloop compared with the other screened molecules. All four binding sites are identical, and binding of the molecules is cooperative.

2.4 Experimental section

2.4.1 Computational studies using AutoDock

Computational work was performed on a Silicon Graphics Origin 200 with four R10000 CPUs at 180 MHz, 512 MB RAM, and the Irix 6.5.5 operating system. AutoDock 3.0 was obtained free of charge from the Molecular Graphics Laboratory of the Scripps Research Institute. MOE (Molecular Operating Environment) was used to add hydrogens and partial charges to the RNA and small molecules. The GCAA tetraloop NMR solution structure was obtained from the PDB (Brookhaven Protein Data Bank); the PDB code for the GCAA tetraloop is 1ZIH. The Amber 94 force field was used to assign charges to the RNA, and Gasteiger PEOE charges were used for the small molecules. The docking energy obtained with AutoDock 3 was compared to the QD1 docking energy. Compounds with a binding energy less than that predicted for QD1 were eliminated, and the rest were used in experimental studies.

2.4.2 Materials and methods

All the compounds tested were obtained from the Drug Synthesis and Chemistry Branch, Developmental Therapeutics Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute (Bethesda, MD). The identity of the compounds was examined by NMR and mass spectrometry. RNA sequences were purchased either PAGE purified or crude from Dharmacon Research Inc. (Lafayette, CO). The PAGE-purified RNA sequences were deprotected using the volatile deprotecting buffer provided and were lyophilized following the procedures provided by Dharmacon. The crude RNA sequences were first deprotected and then purified by 20% denaturing polyacrylamide gel electrophoresis. The RNA bands were visualized by UV and excised from the gel. RNA was extracted from the gel with TE buffer (1 M Tris-Cl, 0.5 M EDTA, pH=7.5). The RNA was dialyzed against water, desalted by two ethanol precipitations, and then lyophilized. The purity and identity of the RNA samples were confirmed using MALDI mass spectrometry. The RNA samples were annealed by heating to 95 °C for 2 min followed by cooling on ice for 5 min immediately before all experimental studies.

2.4.3 RNA purification by denaturing PAGE

A 20% polyacrylamide denaturing gel was prepared by mixing Sequa gel concentrate (60 mL), Sequa gel dilute (7.5 mL), and Sequa gel buffer (7.5 mL) from National Diagnostics. A 5 mL portion of the gel mixture was used to make the plug; before the plug was poured into the space between the glass plates, the solution was mixed with 20 µL APS (ammonium persulfate) and 10 µL TEMED (tetramethylethylenediamine, Aldrich). After the plug had polymerized, 210 µL APS and 105 µL TEMED were added to the remaining 70 mL of gel mixture, which was mixed and poured continuously until the entire plate was filled. An appropriate comb was inserted, and the gel was placed horizontally for an hour to polymerize before use.

The 5X TBE running buffer was prepared by dissolving Tris base (54 g), boric acid (27.5 g), and EDTA (3.72 g) in 1 L autoclaved water. The 1X TBE buffer was prepared from the 5X TBE and used for running the gel. Formamide loading buffer was prepared by mixing formamide (9.8 mL, Aldrich), EDTA (0.5 M, 200 µL, pH 8.0), bromophenol blue (10 mg), and xylene cyanol (10 mg).

The comb was removed from the polymerized gel, and the gel was pre-run for 30 minutes at 45 watts. Bubbles and urea were removed from the wells using a syringe before loading the samples. The RNA sample (50 µL) was mixed with the formamide loading buffer (100 µL), heated at 95 °C for 2 min, rapidly cooled on ice, and then loaded into the wells. After the gel was run at 45 watts for 3 hours, the glass plates were carefully removed. The gel was covered with clean sterile plastic wrap and placed on top of a TLC plate. The RNA band was visualized using a UV lamp (254 nm), excised using a sterile razor, and placed into a 2 mL Eppendorf tube.

The excised RNA band was kept on dry ice for 15 min and then crushed with a sterile pipette. TE buffer (1 mL) was added to the tube, and the tube was left to shake slowly at 4 °C for 6 hrs. The supernatant containing the RNA was collected and saved. The extraction procedure was repeated three times to make sure that all RNA was recovered. The combined supernatant was dialyzed using a MWCO 1000 membrane against 10X diluted TE (2 mL of 1 M Tris (pH 7.4) and 400 µL of 0.5 M EDTA (pH 8.0) diluted with autoclaved water) three times within a period of 24 hrs. The dialyzed RNA sample was dried using a speed-vac and was ethanol precipitated twice. The concentration was determined by UV spectrophotometry.
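The TBE recipe can be converted to molar concentrations as a check. The sketch below is illustrative only. It assumes a final volume of 1 L for the 5X stock and assumes that the EDTA was the disodium salt dihydrate (372.24 g/mol), which the procedure does not state; the molar masses of Tris base and boric acid are standard values.

```python
# Molar masses in g/mol. The EDTA entry assumes the disodium salt
# dihydrate; the salt form is not specified in the procedure.
MOLAR_MASS = {
    "Tris base": 121.14,
    "Boric acid": 61.83,
    "EDTA": 372.24,
}

# Masses (g) dissolved per liter of 5X TBE, taken from the recipe.
GRAMS_PER_LITER_5X = {
    "Tris base": 54.0,
    "Boric acid": 27.5,
    "EDTA": 3.72,
}


def tbe_millimolar(dilution_factor=1.0):
    """Component concentrations (mM) after diluting the 5X stock."""
    return {
        name: 1000.0 * grams / MOLAR_MASS[name] / dilution_factor
        for name, grams in GRAMS_PER_LITER_5X.items()
    }


if __name__ == "__main__":
    for label, factor in (("5X", 1.0), ("1X", 5.0)):
        concentrations = tbe_millimolar(factor)
        summary = ", ".join(f"{n}: {c:.1f} mM" for n, c in concentrations.items())
        print(f"{label} TBE -> {summary}")
```

Under these assumptions the 1X buffer comes out near 89 mM Tris, 89 mM borate and 2 mM EDTA.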

Ethanol precipitation

The RNA sample was dissolved in 50 µL TE buffer (10 mM Tris pH 7.4, 1 mM EDTA), 2 µL of 5 M NaCl solution was added to achieve a final NaCl concentration of approximately 200 mM, and three volumes of chilled ethanol (150 µL) were added. The solution was vortexed, chilled on dry ice for 15 min, and centrifuged at 14,000 rpm at 4 °C. The supernatant was removed, leaving the RNA sample in the tube.
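The salt concentration and ethanol volume in this step follow from a simple dilution calculation. The sketch below is illustrative, and the function names are hypothetical. It takes the ethanol volume as three times the 50 µL sample volume, as stated in the procedure.

```python
def final_concentration_mM(stock_mM, stock_uL, sample_uL):
    """Concentration after adding stock_uL of stock to sample_uL of sample."""
    return stock_mM * stock_uL / (sample_uL + stock_uL)


def ethanol_volume_uL(sample_uL, volumes=3.0):
    """Ethanol volume, expressed as a multiple of the sample volume."""
    return volumes * sample_uL


if __name__ == "__main__":
    # 2 µL of 5 M NaCl added to 50 µL of RNA in TE buffer.
    nacl = final_concentration_mM(stock_mM=5000.0, stock_uL=2.0, sample_uL=50.0)
    print(f"final NaCl: {nacl:.1f} mM")
    print(f"chilled ethanol: {ethanol_volume_uL(50.0):.0f} µL")
```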

2.4.4 Fluorescence experiments

A FluoroMax-3 Spex spectrofluorometer from Jobin Yvon Inc. was used for fluorescence experiments. RNA labeled with fluorescein at the 5’-end and QD2 dissolved in 100% DMSO were used for the experiments. The RNA was titrated with QD2 and the change in fluorescence intensity of the RNA was observed. The RNA was excited at 510 nm, and emission scans were obtained from 510 to 530 nm with an excitation slit width of 2 nm and an emission slit width of 5 nm. The buffer intensity and background fluorescence were subtracted for each titration, and an average of 5 scans was obtained. The fraction of RNA bound was calculated from the emission at 520 nm by dividing the difference between the observed fluorescence F and the initial fluorescence F0 by the difference between the final fluorescence Ff and the initial fluorescence, as shown in equation 1. A control experiment in which DMSO alone was titrated into the RNA sample was performed, and these intensities were subtracted from those of the actual experiments. The fluorescence experiments were also performed in the presence of 0.01% v/v Triton X-100 to observe the effect of small-molecule aggregation on the fluorescence signal.

The stoichiometry of the complex formed between QD2 and the GCAA tetraloop was determined by reverse titration. QD2 was excited at 310 nm, and emission scans were obtained from 355 to 390 nm with an excitation slit width of 5 nm and an emission slit width of 5 nm. The maximum emission was observed at 369 nm. The plot of fraction bound vs the molar ratio ([RNA]/[ligand]) was used to find the stoichiometry.
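The fraction-bound calculation described in words above can be written as a small function. This is a minimal sketch of that definition, (F - F0)/(Ff - F0); the function name and the intensity values in the usage example are hypothetical and are not data from the thesis.

```python
def fraction_bound(f_observed, f_initial, f_final):
    """Fraction of RNA bound from fluorescence intensities at 520 nm.

    f_initial is the intensity before ligand is added (F0) and f_final
    is the intensity at saturation (Ff).
    """
    if f_final == f_initial:
        raise ValueError("initial and final fluorescence must differ")
    return (f_observed - f_initial) / (f_final - f_initial)


if __name__ == "__main__":
    # Hypothetical intensities for a titration in which binding quenches
    # the fluorescein signal from 100 to 40 (arbitrary units).
    for f in (100.0, 85.0, 70.0, 55.0, 40.0):
        print(f, fraction_bound(f, f_initial=100.0, f_final=40.0))
```

The same expression works whether binding quenches or enhances the signal, since numerator and denominator change sign together.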

2.4.5 Isothermal titration calorimetry experiments

ITC experiments were performed at 25 °C on a MicroCal VP-ITC (MicroCal, Inc., Northampton, MA). The experiments were performed in 3.5% v/v DMSO in sodium phosphate buffer by reverse titration. A standard experiment consisted of titrating 20 µM QD2 (1.42 mL in the sample cell) with 10 µL aliquots of 200 µM RNA solution from the syringe. The standard experiment was accompanied by a corresponding control experiment in which aliquots (10 µL) of QD2 (20 µM) were titrated into buffer (10 mM sodium phosphate pH=6) alone. The duration of each injection was 24 s, and the spacing between two injections was 240 s. The initial delay prior to the first injection was 60 s. The instrument measured the heat released for each injection in µcal/sec. The heat associated with each injection was determined from the area under the curve using Origin version 5.0 software (MicroCal, Inc., Northampton, MA, USA). The heat of ligand solvation (buffer titrated with ligand) was subtracted from the corresponding heat of the ligand-RNA injection to yield the heat due solely to ligand binding for each injection. Binding constants were determined from plots of the heat of ligand binding as a function of the ligand-RNA molar ratio. The graph was fit using a sequential four-site binding model.
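The molar ratio reached after each injection can be estimated from the stated volumes and concentrations. The sketch below is a simplified illustration that ignores the small volume displaced from the cell at each injection, so it is not the correction applied by the instrument software; the function name is hypothetical.

```python
def rna_to_ligand_ratio(n_injections, injection_uL=10.0, syringe_uM=200.0,
                        cell_mL=1.42, cell_uM=20.0):
    """Cumulative [RNA]/[QD2] molar ratio in the cell after n injections.

    Simplifying assumption: no correction for cell overflow or dilution.
    """
    rna_nmol = n_injections * injection_uL * syringe_uM / 1000.0  # µL x µM = pmol
    ligand_nmol = cell_mL * cell_uM                               # mL x µM = nmol
    return rna_nmol / ligand_nmol


if __name__ == "__main__":
    for n in range(0, 9, 2):
        print(f"after {n} injections: RNA/QD2 = {rna_to_ligand_ratio(n):.3f}")
```

With these values each 10 µL injection adds 2 nmol of RNA to 28.4 nmol of QD2, so a ratio of 0.25 (one RNA per four ligands) is reached between the third and fourth injection.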

2.4.6 CD spectroscopy

CD experiments were carried out using a JASCO 715 spectropolarimeter. The spectra were recorded over the 200 to 320 nm region. The RNA (15 µM in 400 µL of 10 mM sodium phosphate, pH=6, 5 mM NaCl buffer) was titrated with increasing concentrations of QD2 (7.5 µM to 50 µM) in 3.5% v/v DMSO. At least two spectral scans were obtained at a temperature of 25 °C in a 0.5 cm path length cell at a scan rate of 10 nm/min.

2.4.7 UV reverse titration experiments

UV titration experiments were performed using a Shimadzu UV-2450 spectrophotometer. Reverse titration spectra were recorded over the 320 to 350 nm region. QD2 has its absorption maximum at 329 nm. QD2 (20 µM) in 10 mM sodium phosphate pH=6, 50 mM NaCl buffer (1.65% v/v DMSO) was titrated with increasing concentrations of RNA (6.4 µM to 54 µM).


2.5 References

1. Bloomfield, V. A.; Crothers, D. M.; Tinoco, I. Nucleic Acids: Structures, Properties, and Functions; University Science Books: Sausalito, 2000.
2. Nierhaus, K. H.; Wilson, D. N. Wiley-VCH: Weinheim, Germany 2004.
3. Harvey, I.; Garneau, P.; Pelletier, J. Inhibition of translation by RNA-small molecule interactions. RNA 2002, 8, 452-463.
4. Sucheck, S. J.; Greenberg, W. A.; Tolbert, T. J.; Wong, W.-H. Design of small molecules that recognize RNA: development of aminoglycosides as potential antitumor agents that target oncogenic RNA sequences. Angew. Chem. Int. Ed. 2000, 39, 1080-1083.
5. Chow, C.; Bogdan, F. A structural basis for RNA-ligand interactions. Chem. Rev. 1997, 97, 1489-1513.
6. Graves, P. R.; Kwiek, J. J.; Fadden, P.; Ray, R.; Hardeman, K.; Coley, A. M.; Foley, M.; Haystead, T. A. J. Discovery of novel antimalarial targets in the human purine binding proteome. Mol. Pharm. 2002, 62, 1364-1372.
7. Thomas, J. R.; Hergenrother, P. J. Targeting RNA with small molecules. Chem. Rev. 2008, 108, 1171-1224.
8. Costa, M.; Deme, E.; Jacquier, A.; Michel, F. Multiple tertiary interactions involving domain II of group II self-splicing introns. J. Mol. Biol. 1997, 267, 520-536.
9. Hedenstierna, K. O. F.; Siefert, J. L.; Fox, G. E.; Murgola, E. J. Co-conservation of rRNA tetraloop sequences and helix length suggests involvement of the tetraloops in higher-order interactions. Biochimie 2000, 82, 221-227.


10. Bélanger, F.; Gagnon, M. G.; Steinberg, S. V.; Cunningham, P. R.; Brakier-Gingras, L. Study of the functional interaction of the 900 tetraloop of 16S ribosomal RNA with helix 24 within the bacterial ribosome. J. Mol. Biol. 2004, 338, 683-693.
11. Krummel, D. A. P.; Altman, S. Verification of phylogenetic predictions in vivo and the importance of the tetraloop motif in a catalytic RNA. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 11200-11205.
12. Varani, G. Exceptionally stable nucleic acid hairpins. Annu. Rev. Biophys. Biomol. Struct. 1995, 24, 379-404.
13. Ikawa, Y.; Naito, D.; Aono, N.; Shiraishi, H.; Inoue, T. A conserved motif in group IC3 introns is a new class of GNRA receptor. Nucleic Acids Res. 1999, 27, 1859-1865.
14. Klosterman, P. S.; Hendrix, D. K.; Tamura, M.; Holbrook, S. R.; Brenner, S. E. Three dimensional motifs from the SCOR, structural classification of RNA database: extruded strands, base triples, tetraloops and U-turns. Nucleic Acids Res. 2004, 32, 2342-2352.
15. Woese, C. R.; Gutell, R.; Gupta, R.; Noller, H. F. Detailed analysis of the higher-order structure of 16S-like ribosomal ribonucleic-acids. Microbiol. Rev. 1983, 47, 621-669.
16. Woese, C. R.; Winker, S.; Gutell, R. R. Architecture of ribosomal RNA: constraints on the sequence of “tetra-loops”. Proc. Natl. Acad. Sci. U.S.A. 1990, 87, 8467-8471.
17. Klosterman, P. S.; Hendrix, D. K.; Tamura, M.; Holbrook, S. R.; Brenner, S. E. Three dimensional motifs from the SCOR, structural classification of RNA database: extruded strands, base triples, tetraloops and U-turns. Nucleic Acids Res. 2004, 32, 2342-2352.
18. Tamura, M.; Hendrix, D. K.; Klosterman, P. S.; Schimmelman, N. R. B.; Brenner, S. E.; Holbrook, S. R. The structural classification of RNA. http://scor.berkeley.edu


19. Antao, V. P.; Lai, S. Y.; Tinoco, I. Jr. A thermodynamic study of unusually stable RNA and DNA hairpins. Nucleic Acids Res. 1991, 19, 5901-5905.
20. Antao, V. P.; Tinoco, I. Jr. Thermodynamic parameters for loop formation in RNA and DNA hairpin tetraloops. Nucleic Acids Res. 1992, 20, 819-824.
21. Costa, M.; Michel, F. Frequent use of the same tertiary motif by self-folding RNAs. EMBO J. 1995, 14, 1276-1285.
22. Fernández-Miragall, O.; Martínez-Salas, E. Structural organization of a viral IRES depends on the integrity of the GNRA motif. RNA 2003, 9, 1333-1344.
23. Psaridi, L.; Georgopoulou, U.; Varaklioti, A.; Mavromara, P. Mutational analysis of a conserved tetraloop in the 5' untranslated region of hepatitis C virus identifies a novel RNA element essential for the internal ribosome entry site function. FEBS Lett. 1999, 453, 49-53.
24. Jagath, J. R.; Matassova, N. B.; De Leeuw, E.; Warnecke, J. M.; Lentzen, G.; Rodnina, M. V.; Luirink, J.; Wintermeyer, W. Important role of the tetraloop region of 4.5S RNA in SRP binding to its receptor FtsY. RNA 2001, 7, 293-301.
25. Zwieb, C. Conformity of RNAs that interact with tetranucleotide loop binding proteins. Nucleic Acids Res. 1992, 20, 4397-4400.
26. Siu, F. Y.; Spanggord, R. J.; Doudna, J. A. SRP RNA provides the physiologically essential GTPase activation function in cotranslational protein targeting. RNA 2007, 13, 240-250.
27. Glück, A.; Endo, Y.; Wool, I. G. Ribosomal RNA identity elements for ricin A-chain recognition and catalysis-analysis with tetraloop mutants. J. Mol. Biol. 1992, 226, 411-424.
28. Martin, J. S.; Matthew, H. L.; Theresa, J. A.; Calvin, A. S. RNA hairpin loop stability depends on closing base pair. Nucleic Acids Res. 1993, 21, 3845-3849.


29. Romaniuk, P. J.; Hughes, D. W.; Gregoire, R. J.; Neilson, T.; Bell, R. A. Stabilizing effect of dangling bases on a short RNA double helix as determined by proton nuclear magnetic resonance spectroscopy. J. Am. Chem. Soc. 1978, 100, 3971-3972.
30. Sugimoto, N.; Kierzek, R.; Turner, D. H. Sequence dependence for the energetics of dangling ends and terminal base pairs in ribonucleic acid. Biochemistry 1987, 26, 4554-4558.
31. Isaksson, J.; Chattopadhyaya, J. A uniform mechanism correlating dangling-end stabilization and stacking geometry. Biochemistry 2005, 44, 5390-5401.
32. Freier, S. M.; Burger, B. J.; Alkema, D.; Neilson, T.; Turner, D. Effects of 3' dangling end stacking on the stability of GGCC and CCGG double helices. Biochemistry 1983, 22, 6198-6206.
33. Burkard, M. E.; Kierzek, R.; Turner, D. H. Thermodynamics of unpaired terminal nucleotides on short RNA helixes correlates with stacking at helix termini in larger RNAs. J. Mol. Biol. 1999, 290, 967-982.
34. Ohmichi, T.; Nakano, S.-i.; Miyoshi, D.; Sugimoto, N. Long RNA dangling end has large energetic contribution to duplex stability. J. Am. Chem. Soc. 2002, 124, 10367-10372.
35. Limmer, S.; Hofmann, H.-P.; Ott, G.; Sprinzl, M. The 3'-terminal end (NCCA) of tRNA determines the structure and the stability of the aminoacyl acceptor stem. Proc. Natl. Acad. Sci. U.S.A. 1993, 90, 6199-6202.
36. Elbashir, S. M.; Harborth, J.; Lendeckel, W.; Yalcin, A.; Weber, K.; Tuschl, T. Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells. Nature 2001, 411, 494-498.
37. Ayer, D.; Yarus, M. The context effect does not require a fourth base pair. Science 1986, 231, 393-394.


38. Ohmichi, T.; Karimata, H.; Sugimoto, N. Effect of secondary structure of short double-stranded RNA on RNAi efficiency. Nucleic Acids Res. 2002, 2, 63-64.
39. Elbashir, S. M.; Lendeckel, W.; Tuschl, T. RNA interference is mediated by 21- and 22-nucleotide RNAs. Genes & Development 2001, 15, 188-200.
40. Yan, Z.; Ramisetty, S. R.; Bolton, P. H.; Baranger, A. M. Selective recognition of RNA helices containing dangling ends by a quinoline derivative. ChemBioChem 2007, 8, 1658-1661.
41. Yan, Z.; Baranger, A. M. Binding of aminoacridine derivative to a tetraloop RNA. Bioorg. Med. Chem. Lett. 2004, 14, 5889-5893.
42. Yan, Z.; Sikri, S.; Beveridge, D. L.; Baranger, A. M. Identification of an aminoacridine derivative that binds to RNA tetraloops. J. Med. Chem. 2007, 50, 4096-4104.
43. Heus, H. A.; Pardi, A. Structural features that give rise to the unusual stability of RNA hairpins containing GNRA loops. Science 1991, 253, 191-194.
44. Lipinski, C. A.; Lombardo, F.; Dominy, B. W.; Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Del. Rev. 2001, 46, 3-26.
45. SantaLucia, J.; Kierzek, R.; Turner, D. H. Context dependence of hydrogen bond free energy revealed by substitutions in an RNA hairpin. Science 1992, 256, 217-219.
46. Wörner, K.; Strube, T.; Engels, J. W. Synthesis and stability of GNRA-loop analogs. Helv. Chim. Acta 1999, 82, 2094-2104.
47. Leuilliot, N.; Baumruk, V.; Abdelkafi, M.; Turpin, P.-Y.; Namane, A.; Gouyette, C.; Huynh-Dinh, T.; Ghomi, M. Unusual nucleotide conformations in GNRA and UNCG type tetraloop hairpins: evidence from Raman markers assignments. Nucleic Acids Res. 1999, 27, 1398-1404.


48. Morris, G. M.; Goodsell, D. S.; Halliday, R. S.; Huey, R.; Hart, W. E.; Belew, R. K.; Olson, A. J. Automated docking using a Lamarckian genetic algorithm and empirical binding free energy function. J. Comput. Chem. 1998, 19, 1639-1662.
49. Warui, D. M.; Baranger, A. M. Identification of specific small molecule ligands for stem loop 3 ribonucleic acid of the packaging signal Ψ of human immunodeficiency virus-1. J. Med. Chem. 2009, 52, 5462-5473.
50. Seidler, J.; McGovern, S. L.; Doman, T. N.; Shoichet, B. K. Identification and prediction of promiscuous aggregating inhibitors among known drugs. J. Med. Chem. 2003, 46, 4477-4486.
51. McGovern, S. L.; Helfand, B. T.; Feng, B.; Shoichet, B. K. A specific mechanism of nonspecific inhibition. J. Med. Chem. 2003, 46, 4265-4272.
52. Jusuf, S.; Loll, P. J.; Axelsen, P. H. The role of configurational entropy in biochemical cooperativity. J. Am. Chem. Soc. 2002, 124, 3490-3491.
53. Wiseman, T.; Williston, S.; Brandts, J. F.; Lin, L.-N. Rapid measurement of binding constants and heats of binding using a new titration calorimeter. Anal. Biochem. 1989, 179, 131-137.
54. Muralidhara, B. K.; Negi, S. S.; Halpert, J. R. Dissecting the thermodynamics and cooperativity of ligand binding in cytochrome P450eryF. J. Am. Chem. Soc. 2007, 129, 2015-2024.


CHAPTER 3 Unraveling the Interaction of Poly(CUG)RNA with the MBNL1 Protein

3.1 Significance of RNA-protein interactions

RNA-protein interactions are central to biological processes in all living organisms. RNA often serves as a mediator of genetic information between DNA and protein in eukaryotes and prokaryotes. In addition, RNA serves as the genetic material in some viruses. Based on function or localization, RNAs are classified into transfer RNA (tRNA), messenger RNA (mRNA), ribosomal RNA (rRNA), viral RNA (vRNA), small nuclear RNA (snRNA) and small cytoplasmic RNA (scRNA). These RNAs form complexes with a wide variety of RNA-binding proteins that stabilize, protect, package, or transport RNA, mediate RNA interactions with other macromolecules, and act catalytically on RNA (cutting, unwinding, modifying, replicating, etc.).1 The fundamental roles of RNAs and RNPs (ribonucleoprotein complexes) are shown in Figure 3-1.2 The modulation of RNA-protein or RNA-RNA interactions involved in transcription, mRNA processing, protein translation, or retroviral gene expression could provide new drugs to treat cancer and to combat bacterial and viral infections. Targeting RNAs and RNPs with small molecules could also induce effects that are not achievable by targeting proteins. Many of the currently available drug targets are membrane-bound receptors,3-4 which are difficult to study using standard techniques and equally challenging to integrate into high-throughput screens, so the effect of a small molecule on the receptor cannot be assayed or quantified easily. In such cases, modulating the mRNA transcript that codes for the receptor could be an alternative route to an effective treatment. With advances in structural studies of RNA-protein complexes and in the understanding of their biological functions, RNAs and RNA-protein complexes are becoming increasingly attractive targets for drug design.

Figure 3-1. Representation of the diverse fundamental roles of RNAs and RNPs in the maintenance and transfer of biological information from DNA to protein. RNAs and RNPs are involved in DNA replication (chromosome maintenance and DNA synthesis); transcriptional control (mRNA processing); protein translation; and retroviral reverse transcription, gene expression and packaging. Because these important processes are controlled by RNAs and RNPs, they are potential targets for the development of new drugs.

3.2 RNA-protein recognition

Most functions of RNA require interactions with RNA-binding proteins. Knowledge of RNA-protein recognition is informative for designing small molecules that target RNA, because proteins may use similar strategies for binding to RNA and because protein-binding sites are potential targets for drug action. A prominent example of recognition of the same region in RNA by a protein and a small molecule is the binding of both the viral Rev protein and neomycin B to a non-Watson-Crick G-G base pair within the HIV-RRE RNA.5-6 Most RNAs occur as partially folded single-stranded molecules. The A-form helix of RNA has a narrow, deep major groove, which buries the edges of the bases, and the readily accessible shallow minor groove contains less information for recognition.7 Distortions of the regular A-form helix by non-Watson-Crick base pairs produce looped-out residues and platforms of consecutive nucleotides with a widened deep groove.1, 8-9 Proteins not only recognize but may also enhance such distortions of RNA structure. Insertion of an α-helix or a flexible protein loop into a widened deep groove has been observed in several protein-RNA complexes.8, 10 Exposed β-strands of proteins can interact with unpaired RNA regions by stacking the splayed-out bases with aromatic amino acid side chains.1, 11 Flexible protein loops participate predominantly in interactions with the RNA backbone, often leading to reduced loop flexibility.1, 10 There are remarkable differences between RNA-protein and RNA-peptide complexes.12-13 In RNA-protein complexes the large proteins predominantly bind RNAs as rigidly folded domains that provide exposed surfaces, cavities, and clefts for the RNA substrate, whereas in RNA-peptide complexes conformational changes upon binding are observed very frequently. The negatively charged phosphate backbone of RNA interacts with protein side chains through hydrogen bonding, which is seen in 90% of RNA-protein complexes.14 In complexes of known structure, 20% of intermolecular interactions involve the 2’-OH group, which acts equally often as a hydrogen bond donor and acceptor. The basic amino acids arginine and lysine participate in 60% of intermolecular H-bonds, and the flat guanidinium side chain of arginine forms stacking interactions with RNA.14 The stacking interactions involve both RNA and protein residues in alternating fashion, often called interdigitation,8 and the hydrophobic parts of RNA bases align with the nonpolar side chains of proteins. Positions 6 and 7 of purines and position 4 of pyrimidines form hydrogen bonds with proteins. The RNA folds created by non-Watson-Crick base pairing also provide excellent recognition sites for proteins.15 Water and ions also play important roles in maintaining the three-dimensional structure of RNA-protein complexes. Water molecules are often found at the binding interfaces, where they extend shape complementarity by filling cavities.

3.3 RNA binding proteins in human diseases

RNA-binding proteins are widely involved in the posttranscriptional processing of RNAs and play key roles in exon-intron splicing, nuclear export, polyadenylation, subcellular localization, translational control, stabilization/degradation and sequence editing.16 In particular, RNA-binding proteins whose expression is restricted to specific cell types perform important functions by controlling the subcellular distribution of proteins, the levels of proteins, and the production of certain protein isoforms, and therefore confer distinct phenotypes on cells. The perturbation of normal functions of RNA-binding proteins has been implicated in many clinical disorders. RNA-binding proteins implicated in diseases include the CELF proteins (e.g., CUG-BP), which are believed to play roles in normal heart and skeletal muscle development and in the pathology of myotonic dystrophy; the Nova autoimmune antigens, which are neuron-specific proteins involved in the pathogenesis of the neurodegenerative syndrome paraneoplastic opsoclonus-myoclonus ataxia (POMA); and the αCP proteins, which have been suggested to cause α-thalassemia.17 The mechanism by which these proteins control the expression of proteins in their respective cell types is still unknown. Investigation of the relationship between the structure and function of these proteins will begin to reveal how they help to shape the protein expression programs unique to skeletal muscle, heart, brain and other tissues.

3.4 RNA dominant diseases

The fraction of the human genome transcribed far exceeds the fraction that encodes protein.18 Mutations in non-protein-coding regions can give rise to a deleterious gain of function by non-coding RNA, similar to that of mutated proteins. This RNA gain of function often involves repetitive sequences in non-coding RNA. Repetitive sequences in non-coding regions cause various diseases, including myotonic dystrophy, fragile X tremor ataxia syndrome (FXTAS), spinocerebellar ataxia type 8 (SCA8), SCA10, SCA12, and Huntington's disease-like 2 (HDL2) (Figure 3-2).19 In 1991, the first two triplet repeat expansion disorders were discovered, revealing mutations in coding and non-coding regions. The unstable CGG repeat expansion in the noncoding 5’UTR (untranslated region) of FMR1 causes fragile X syndrome,20-21 whereas spinobulbar muscular atrophy is associated with unstable CAG repeat expansions in the coding region of the AR (androgen receptor) gene.22 Later, in 1992, a third trinucleotide repeat disorder was discovered, which is caused by a CTG repeat expansion in the noncoding 3’UTR of the DMPK (Dystrophia Myotonica Protein Kinase) gene.23-27 The noncoding triplet repeat expansion diseases SCA8 and Friedreich ataxia are caused by tandem CTG and GAA repeat units, respectively.28-29 SCA8 was the first disease predicted to be caused by a noncoding RNA repeat sequence, but studies on myotonic dystrophy unveiled the critical role of RNA in contributing to disease phenotype.

Figure 3-2. Schematic diagram showing the position and relative sizes of disease-associated microsatellite repeat expansions located in the non-coding regions of the respective genes. The black line represents the promoter (left) and intron (right); blue boxes represent 5’ and 3’ untranslated regions; solid red boxes represent protein-coding regions.

3.5 Myotonic dystrophies

Steinert and Batten (1909) identified myotonic dystrophy as a multisystemic disorder; it is now recognized as one of the most common forms of muscular dystrophy in adults.30 Geneticists and clinical neurologists identified non-Mendelian features of myotonic dystrophy inheritance, including variable penetrance, anticipation (a tendency for the disease to worsen in subsequent generations), and a maternal transmission bias for congenital forms.31 In 1992, the genetic cause of the disease was identified as the expansion of an unstable CTG repeat in the 3’-untranslated region of the DMPK gene,23-27 and the disease was named myotonic dystrophy type 1 (DM1). In 1995, several reports described families with dominantly inherited multisystemic myotonic disorders that were genetically distinct from DM1,32-34 and in 1998 a CCTG-repeat expansion in intron 1 of the ZNF9 (zinc finger protein 9) gene was identified as the cause of this disease, which is referred to as a second form of myotonic dystrophy (DM2). DM2 has also been called proximal myotonic myopathy (PROMM) or proximal myotonic dystrophy.

Figure 3-3. Schematic illustration of the location of the DM1 and DM2 gene loci and of the A-form duplex (CUG)n and (CCUG)n RNAs arising from the (CTG)n and (CCTG)n DNA repeats, respectively.


3.6 Molecular genetics of DM1 and DM2

DM1 is more common than DM2, accounts for approximately 98% of DM cases, and affects 1 in 8000 people worldwide.31 The symptoms of the disease are similar for DM1 and DM2. However, a key difference is that only DM1 presents a congenital form of the disorder. The DMPK gene is located on chromosome 19q13.3 and ZNF9 is on chromosome 3q21 (Figure 3-3). In DM1 the CTG expansions vary from 50 to 4,000 repeats in affected patients, with clinically unaffected individuals having up to 50 CTGs, whereas in DM2 the pathological CCTG expansion ranges from about 75 to 11,000 repeats. Somatic instability has been reported in different patients, and the repeat elongation in DM1 and DM2 is 50 to 80 and 700 repeats per year, respectively. There is a rough correlation between DM1 repeat size and age of onset for CTGs < 400 but a poor correlation between repeat length and disease severity for long repeats.35-36

Table 3-1. Etiology of myotonic dystrophies

Type                     DM1           DM2
Gene                     DMPK          ZNF9
Mechanism                CTG repeat    CCTG repeat
Normal repeat size       Up to 37      Up to 27
Pathologic repeat size   >50 CTG       >75 CCTG
Expanded repeat range    50-4000       75->11000

3.7 Clinical features of the myotonic dystrophies

Myotonic dystrophies (DMs) are autosomal dominant, multisystemic disorders with symptoms including myotonia (muscle hyperexcitability), progressive muscle weakness and wasting, cataract development, testicular atrophy and cardiac conduction defects.37 In the initial stages of DM1 and DM2 the pattern of muscle weakness is noticeably different (distal vs proximal), but muscle biopsies show a similar histology with increased muscle fiber size and central nucleation. The cataract usually develops in the second decade or later, is characterized by multicolored lens opacities, and is the most easily treatable of the symptoms caused by this disease. The other common symptoms are testicular atrophy, insulin insensitivity, frontal balding, hypogammaglobulinemia (reduced IgG and IgM serum levels), lethal arrhythmias and occasional signs of cardiomyopathy.36

Figure 3-4. Schematic representation of the clinical features of myotonic dystrophy that are common to DM1 and DM2 and those that are specific to each type.

The CCTG repeat expansion in DM2 is longer than the CTG repeat expansion in DM1, but the symptoms of DM2 are less severe than those of DM1. Most of the clinical features of DM2 appear in adulthood (median age 48 years),38 whereas DM1 clearly presents adult-onset, childhood-onset and congenital forms, with correspondingly increasing disease severity and repeat size. The neonatal symptoms of congenital DM1 do not include some of the features characteristic of adult-onset DM1 and DM2, such as cataract development, myotonia and myopathy.31 Instead, congenital DM1 is associated with hypotonia, mental retardation, facial diplegia and a maternal bias in DM transmission (Figure 3-4). Although there are a few reports of childhood onset in DM2, they are not usually associated with a developmental disease of the central nervous system as in congenital and childhood-onset DM1. Patients who survive congenital and childhood-onset disease eventually manifest the hallmark features of the adult-onset form of the disorder.

3.8 RNA pathogenesis in myotonic dystrophy

The current hypothesis for the molecular basis of DM1 and DM2 is that the unstable expanded repeats cause misregulation of pre-mRNA splicing. Three distinct models have been proposed to explain the molecular mechanisms of DM pathogenesis: (1) haploinsufficiency of the DMPK gene, (2) altered expression of neighboring genes, and (3) dominant-negative mRNA expression.36-37 The first two models have been shown to be incorrect by mouse knockout experiments. The third model proposes that dominant-negative pathogenic effects of RNA containing the CUG and CCUG expansions cause DM. The transcripts with CUG and CCUG expansion repeats accumulate as ribonuclear inclusions (RNA foci) in the nuclei of DM cells and are detectable by in situ hybridization.39-44 These poly(CUG)RNA and poly(CCUG)RNA alter the RNA binding activity of several proteins, including the CUG-binding protein (CUG-BP), CCUG-BP and three different forms of muscleblind-like proteins (MBNL1, MBLL, and MBXL) (Figure 3-5). CUG-BP is localized in both the nucleus and the cytoplasm and binds specifically to UG repeat sequences but not to ribonuclear foci. The levels of CUG-BP and CCUG-BP appear to be increased in several DM cells, and the specific mechanism of upregulation is still not known. The muscleblind-like proteins colocalize with poly(CUG)RNA and poly(CCUG)RNA in DM cells and are proposed to induce pathogenesis by at least three mechanisms: (1) misregulation of pre-mRNA splicing, (2) interference with muscle differentiation, and (3) transcriptional interference.45

Figure 3-5. The RNA gain-of-function model of myotonic dystrophy pathogenesis. The left side of the figure represents the free form of MBNL1 which performs its normal pre-mRNA splicing in a healthy individual. The right side of the figure represents the condition in the disease state. The expanded poly(CUG)RNA or poly(CCUG)RNA form an imperfect double stranded structure. This affects regulation of splicing by sequestering the important splicing regulator MBNL1 protein.

3.9 Misregulation of alternative splicing

Alternative splicing is a process by which multiple mRNA isoforms are generated from individual genes. This gives rise to protein isoforms that differ significantly in their activity.46 Alternative splicing involves the binding of regulatory factors to intronic or exonic elements and is regulated according to cell type or developmental stage. It has a large impact on multiple aspects of cell and tissue physiology.47 Misregulation of alternative splicing has been implicated in many human diseases.47-48 Misregulated alternative splicing events have been identified in the heart, skeletal muscle, and central nervous system of DM1 patients and are summarized in Table 3-2. This misregulation affects only a subset of genes, indicating that most genes are unaffected by DM1.49 Interestingly, all pre-mRNAs misregulated in DM1 normally undergo a developmentally regulated splicing switch. In DM1, the embryonic or fetal splicing patterns of these genes are retained in adult tissues. Misexpression of the early developmental isoforms of IR and CLCN-1 has been shown to correlate directly with the disease symptoms insulin resistance and myotonia, respectively (Figure 3-6).50-53

Figure 3-6. Misregulation of alternative splicing of IR and ClC1 pre-mRNAs in DM skeletal muscle. The fetal isoforms of IR and ClC1 predominate in adult tissues (shaded boxes). The fetal splicing pattern of IR pre-mRNA results in expression of a fetal isoform that lacks exon 11 and has lower receptor activation, whereas in ClC1 the inclusion of exon 7a, which contains a premature termination codon, results in degradation by nonsense-mediated decay (NMD).53

The specific mechanism by which poly(CUG)RNA or poly(CCUG)RNA induces splicing misregulation is unclear. However, there is substantial evidence for the misregulation of splicing events by two RNA binding protein families: CUG-BP and ETR-3-like factors (CELF) and MBNL proteins. The MBNL proteins have been shown to bind to RNA repeats and to directly regulate alternative splicing of multiple pre-mRNAs, including several that undergo misregulated alternative splicing in DM.51-55 Interestingly, CELF and MBNL proteins have been shown to act antagonistically in the misregulation of the splicing of two pre-mRNAs, TNNT2 (cardiac troponin T) and IR (insulin receptor), in DM cells. The reduced MBNL activity due to sequestration by RNA repeats and the increased CELF activity are determinative factors for misregulation in DM.

Table 3-2. The list of pre-mRNAs misregulated during DM pathogenesis.45

Pre-mRNA                                      Mis-regulated exon/intron
Cardiac troponin T (TNNT2 or cTNT)            Exon 5
Insulin receptor (IR)                         Exon 11
Chloride channel (CLCN-1)                     Intron 2 and exon 7a
Ryanodine receptor                            Exon 70
Microtubule-associated protein tau (MAPT)     Exon 2 and 10
Myotubularin-related protein 1 (MTMR1)        Exon 2.1 and 2.3
Fast skeletal troponin (TNNT3)                Fetal exon
N-methyl-D-aspartate receptor 1 (NMDAR1)      Exon 5
Amyloid precursor protein (APP)               Exon 7

3.10 Research aims

The aim of my research project was to study the interaction between the MBNL1 protein and poly(CUG)RNA, and the inhibition by small molecules of the complexes formed between MBNL1 and poly(CUG)RNA or poly(CCUG)RNA. To unravel the MBNL1 interaction with poly(CUG)RNA, important residues in the MBNL1 protein were mutated to alanine by site-directed mutagenesis, and truncated versions of the protein were expressed using a molecular cloning approach. The binding of the mutated and truncated proteins to the pathogenic (CUG)12 repeat and to a natural target (S10 RNA) was investigated by EMSA (electrophoretic mobility shift assay). The inhibition of the poly(CUG)RNA-MBNL1 and poly(CCUG)RNA-MBNL1 complexes by small molecules was studied by EMSA, and the binding of small molecules to RNA repeats was investigated by isothermal titration calorimetry.

3.11 Introduction

Poly(CUG)RNA and poly(CCUG)RNA sequester MBNL family proteins, which causes misregulation of the alternative splicing of 10 pre-mRNAs, of which at least two (insulin receptor and muscle-specific chloride channel (CLCN-1)) are directly related to disease symptoms.50-52 The three human MBNL paralogues, MBNL1/EXP, MBNL2/MBLL/MLP1 and MBNL3/MBXL, are homologues of Drosophila muscleblind (mbl), which is required for Drosophila photoreceptor and muscle differentiation.56-57 Among the three MBNL proteins, MBNL1 and MBNL2 are more abundant and have been shown to colocalize with CUG and CCUG repeats in the nucleus, forming nuclear foci in both DM1 and DM2.41, 49, 58-60 MBNL1 and MBNL2 are expressed in heart and skeletal muscle, two tissues prominently affected in DM.61 MBNL3 expression is restricted to the placenta in adult mice, and MBNL3 is widely expressed in the embryo.41, 58 All three proteins have almost identical protein sequences and appear to regulate alternative splicing similarly in tissue culture.54, 62-63 Each of these three proteins contains four CCCH zinc finger domains, positioned as tandem pairs at the N-terminal end (ZnF1 and ZnF2) and in the middle (ZnF3 and ZnF4) of the protein (Figure 3-7 (a)). In the ZnF1 and ZnF3 domains the cysteines and histidine are found in a CX7CX6CX3H arrangement, and in ZnF2 and ZnF4 in a CX7CX4CX3H arrangement. The tandem segments of the vertebrate proteins are 99% identical, and the most distantly related muscleblind proteins, which have only one ZnF pair, share 67% identity with human MBNL1.64 The CCCH-type zinc finger domain is also found in other RNA binding proteins,65-68 including tristetraprolin (TTP) and butyrate response factor (CX8CX5CX3H sequence), which are involved in destabilization of mRNA, and U2AF (CX8CX5CX3H and CX7CX5CX3H sequences), which participates in alternative splicing. The interaction of the tandem CCCH domain with RNA has been well studied for the TIS11D protein, a member of the TTP family. The recognition of the single-stranded 5’-UUAUUUAUU-3’ sequence by the CX8CX5CX3H domain of TIS11D was characterized by NMR. Each of the two domains recognizes a UAUU repeat using stacking and hydrogen bonding interactions.69
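The CCCH spacing patterns described for the MBNL zinc fingers can be checked directly against a protein sequence. The following is a minimal sketch, assuming the MBNL1N construct sequence printed in Figure 3-7 (c) and treating X as any residue; the helper name `find_ccch` and the regular expressions are illustrative and are not part of the original analysis.

```python
import re

# MBNL1N construct sequence as printed in Figure 3-7 (c), line breaks removed.
MBNL1N = (
    "GPLGSMAVSVTPIRDTKWLTLEVCREFQRGTCSRPDTECKFAHPSKSCQVENGRVIACFDSLKG"
    "RCSRENCKYLHPPPHLKTQLEINGRNNLIQQKNMAMLAQQMQLANAMMPGAPLQPVPMFSVAPS"
    "LATNASAAAFNPYLGPVSPSLVPAEILPTAPMLVTGNPGVPVPAAAAAAAQKLMRTDRLEVCRE"
    "YQRGNCNRGENDCRFAHPADSTMIDTNDNTVTVCMDYIKGRCSREKCKYFHPPAHLQAKIKAAQ"
    "YQLEHHHHHHERPHRD"
)

# "X7" means exactly seven residues of any type between the ligand residues.
PATTERNS = {
    "CX7CX6CX3H": re.compile(r"C.{7}C.{6}C.{3}H"),  # expected for ZnF1 and ZnF3
    "CX7CX4CX3H": re.compile(r"C.{7}C.{4}C.{3}H"),  # expected for ZnF2 and ZnF4
}


def find_ccch(sequence):
    """Return, for each spacing pattern, the list of matching segments."""
    return {
        name: [match.group() for match in pattern.finditer(sequence)]
        for name, pattern in PATTERNS.items()
    }


hits = find_ccch(MBNL1N)
for name, segments in hits.items():
    print(name, segments)
```

Each pattern matches two segments of the construct, which is consistent with the two tandem zinc finger pairs described in the text.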

(c)
GPLGSMAVSVTPIRDTKWLTLEVCREFQRGTCSRPDTECKFAHPSKSCQVENGRVIACFDSLKG
RCSRENCKYLHPPPHLKTQLEINGRNNLIQQKNMAMLAQQMQLANAMMPGAPLQPVPMFSVAPS
LATNASAAAFNPYLGPVSPSLVPAEILPTAPMLVTGNPGVPVPAAAAAAAQKLMRTDRLEVCRE
YQRGNCNRGENDCRFAHPADSTMIDTNDNTVTVCMDYIKGRCSREKCKYFHPPAHLQAKIKAAQ
YQLEHHHHHHERPHRD

Figure 3-7. (a) Schematic representation of the human full-length MBNL1 protein with the four zinc finger domains identified. The numbers indicate the lengths of the linker segments. (b) Alignment of the amino acid sequences of truncated versions of MBNL1 (GenBank ID: NM_021038), MBNL2 (GenBank ID: NM_144778) and MBNL3 (GenBank ID: NM_018388) containing all four zinc finger domains. The alignment was performed using the program MultAlin. Identical amino acids are shown in red. (c) Amino acid sequence of MBNL1N (truncated form) used in our experimental studies. The sequences highlighted in red and green represent zinc finger domains 1 to 4 and the hexahistidine tag, respectively.

Human full-length MBNL1 is approximately 380 amino acids long, with tandem zinc finger motifs located at the N-terminus and in the middle of the protein. Kino et al. and Warf et al. previously reported the binding of MBNL1 (1-260) and MBNL1 (1-382) to poly(CUG) repeats and found that the truncated protein binds similarly to the full-length protein. These results indicated that the C-terminal end of the protein is not important for recognition of CUG repeats.70-71 Therefore, all experiments reported here were performed with the truncated protein containing all four zinc fingers and a His-tag at the C-terminus of the protein, as shown in Figure 3-7 (c).

Figure 3-8. RNA structures used for experimental studies. (CUG)4 and (CUG)12 are examples of poly(CUG)RNAs (DM1 targets) and (CCUG)6 is an example of poly(CCUG)RNA (DM2 target). cTNT and S10 are fragments of cardiac troponin T pre-mRNA, which is a natural target of the MBNL1 protein. The sequence shown in the blue box is identical to the sequence of S10 RNA.

We have investigated the interaction of the MBNL1N protein with poly(CUG) repeats and with cardiac troponin T (cTNT) pre-mRNA, which is one of the natural targets of the MBNL1 protein. MBNL1N binds to the 3’ end of the fourth intron of cTNT pre-mRNA to control splicing. The S10 RNA is a truncated version of cTNT RNA (Figure 3-8). The specificity of MBNL1N was investigated by studying its interaction with the (CUG)4 and (CUG)12 poly(CUG)RNA repeats in the presence and absence of yeast tRNA. We used alanine scanning to identify amino acids of MBNL1N that contribute significantly to interactions with pathogenic RNA and S10 RNA. We also investigated the ability of individual zinc finger domains of MBNL1N to bind to poly(CUG)RNA and S10 RNA. This study is of interest because the structure of the MBNL1N-poly(CUG)RNA complex has not been solved and no studies have been performed to investigate the roles of the amino acids involved in interactions with poly(CUG)RNA.

3.12 Results and discussion

The (CUG)4 and (CUG)12 RNAs were selected for studying the interaction of MBNL1N with poly(CUG)RNA. Previously, Berglund and co-workers found that MBNL1N binds to (CUG)12 and (CUG)90 RNAs with similar affinity. Therefore, we selected shorter CUG repeats for our studies. The (CUG)4 RNA is stabilized by an ultrastable UUCG tetraloop, and the (CUG)12 RNA maintains its structure through a stable CG stem region. The (CCUG)6 RNA was selected as an example of poly(CCUG)RNA and is stabilized by a long stem (GCGG). To study the binding of the mutated and truncated MBNL1N proteins to natural targets, we chose cTNT RNA and S10 RNA (a truncated version of cTNT RNA). As the C-terminus of the MBNL1 protein is not involved in the recognition of RNA targets, we used the truncated version (MBNL1N) of the full-length protein, containing 260 amino acids and lacking the C-terminal end.

3.12.1 MBNL1N interaction with poly(CUG)RNA repeats and natural target RNAs

The binding assays were performed by electrophoretic mobility shift assay using a 6% native polyacrylamide gel at 4 °C in 0.5× Tris-Borate buffer (pH 8). The interaction of MBNL1N with the (CUG)4 and (CUG)12 repeats was studied in the presence and absence of yeast tRNA. The protein recognizes (CUG)4 and (CUG)12 with dissociation constants (KD) of 26 ± 4 nM and 165 ± 9 nM, respectively. The KD of MBNL1N with (CUG)12 RNA agrees with the previously reported value. The binding to (CUG)4 RNA is 5-fold better than to (CUG)12, which may be due to the unstable secondary structure of (CUG)4 RNA. Secondary structure prediction for (CUG)4 RNA with the program mFold suggests that it does not form the structure shown in Figure 3-8. MBNL1N prefers binding to single-stranded RNA over a proper duplex structure. The affinity of MBNL1N for (CCUG)6 RNA is 100-fold better than for (CUG)12 RNA, which is similar to its affinity for its natural target cTNT RNA. The binding affinities decreased in the presence of tRNA, and the decrease scales with the concentration of tRNA (Figures 3-9 and 3-10; Table 3-3). These results suggest that MBNL1N is not specific for poly(CUG)RNA and may bind to other RNA targets. MBNL1N binds with the same affinity to cTNT and S10 RNAs. This indicates that the protein does not bind to the region located below the blue box in the 32-mer cTNT RNA (Figure 3-8).
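To make the reported dissociation constants concrete, the sketch below evaluates a simple 1:1 binding isotherm over a two-fold protein dilution series of the kind used in the gel shift assays. The 1:1 model, the assumption that the labeled RNA concentration is far below KD, the top protein concentration, and the names `serial_dilution` and `fraction_bound` are all illustrative assumptions; the fitting procedure actually used to obtain the KD values is not described in this section.

```python
def serial_dilution(top_nM, steps, factor=2.0):
    """Protein concentrations (nM) for a dilution series starting at top_nM."""
    return [top_nM / factor**i for i in range(steps)]


def fraction_bound(protein_nM, kd_nM):
    """Fraction of RNA in complex for 1:1 binding with RNA in trace amounts."""
    return protein_nM / (kd_nM + protein_nM)


# KD values without tRNA, taken from the text above (nM).
KD_NM = {"(CUG)4": 26.0, "(CUG)12": 165.0}

# Hypothetical top concentration; the actual range is given on the gel figures.
series = serial_dilution(top_nM=2000.0, steps=10)

for rna, kd in KD_NM.items():
    print(rna)
    for protein in series:
        print(f"  [MBNL1N] = {protein:8.1f} nM  fraction bound = "
              f"{fraction_bound(protein, kd):.2f}")
```

Under this model half of the RNA is shifted when the protein concentration equals KD, so a lower KD moves the midpoint of the titration to lower protein concentrations.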

Figure 3-9. Representative binding curves for the binding of MBNL1N protein to (CUG)4 RNA (a) and (CUG)12 RNA (b). The red, blue, and green lines correspond to binding in the absence of tRNA, at the lower tRNA concentration (0.2 µM and 0.9 µM for (CUG)4 and (CUG)12 RNA, respectively), and at the higher tRNA concentration (0.4 µM and 1.8 µM for (CUG)4 and (CUG)12 RNA, respectively).


Figure 3-10. Representative electrophoretic mobility shift assays of MBNL1N protein binding to poly(CUG)RNAs in the presence and absence of tRNA, and to cTNT and S10 RNAs. Two-fold serial dilutions of the protein were incubated with 32P-labeled RNAs at room temperature for 20 min. Electrophoresis was performed using a 6% native polyacrylamide gel at 4 °C in 0.5× Tris-Borate buffer (pH 8). The top and lower gel pictures represent the binding of MBNL1N protein in the absence of tRNA. The first lane from the left represents the free RNA band and the slower running band in the other lanes corresponds to the complex. The concentrations shown on the blue bar indicate the lowest and highest concentrations of protein used for the experiment.


Table 3-3. Summary of the data for MBNL1N protein binding to poly(CUG)RNAs and the natural targets cTNT and S10 RNAs.

RNA       KD (nM), without tRNA   KD (nM), with tRNA        KD (nM), with tRNA
(CUG)4    26 ± 4                  69 ± 10 (0.2 µM tRNA)     92 (0.4 µM tRNA)
(CUG)12   165 ± 9                 370 ± 20 (0.9 µM tRNA)    767 (1.8 µM tRNA)
cTNT      1.4 ± 0.8
S10       1.3 ± 0.2

3.12.2 Identification of amino acids in MBNL1N that are important for binding to RNA

An understanding of the binding interaction between the MBNL1N protein and the pathogenic RNA sequences will be beneficial for the rational design of small molecules that inhibit the formation of MBNL1N-RNA complexes. Currently, the binding site and the protein sequences necessary for binding of MBNL1N to poly(CUG)RNA have not been well characterized. Recently, Teplova and co-workers published a crystal structure of the complex formed between a truncated MBNL1 protein containing only the zinc finger 3 and 4 domains and CGUUGC RNA,64 but no such studies have been completed with poly(CUG)RNA. Sequence comparison of muscleblind proteins from different species revealed the presence of two protein motifs other than the CCCH zinc finger domain. The first motif is the LEV box, located before the first zinc finger, which has the consensus sequence WLXLEV, where X indicates any amino acid. The second motif is located before and after zinc finger 2 and consists of a core NGR sequence immediately followed by either a valine or an asparagine residue, with two additional highly conserved polar residues before and after the NGR core.72


Figure 3-11. Representative electrophoretic mobility shift assays of MBNL1N mutant proteins binding to the (CUG)12 repeat and S10 RNAs. The numbers 1, 2, 3, 4, 5, and 6 correspond to the Phe22Ala, Phe36Ala, Phe54Ala, Ser56Ala, Tyr68Ala, and Tyr188Ala mutants. The protein samples were serially diluted 2-fold and incubated with 32P-labeled RNAs at room temperature for 20 min, and electrophoresis was performed using a 6% native polyacrylamide gel at 4 °C in 0.5× Tris-Borate buffer (pH 8). The first lane from the left represents the free RNA band and the top band in the other lanes corresponds to the complex. The concentrations shown on the blue bar indicate the lowest and highest concentrations of protein used for the experiment.

The other structural element commonly seen in all muscleblind proteins is the conservation of four aromatic residues in the first zinc finger pair, as well as a tryptophan within the LEV box (Figure 3-12). The tandem CCCH zinc finger domain of the TIS11D protein (a member of the TTP family of proteins) binds ARE (AU-rich element) sequences by intercalating a tyrosine and a phenylalanine side chain between AU and UU dinucleotides, respectively. Hudson and co-workers reported that the hydrophobic and stacking interactions between aromatic rings and heterocyclic bases provide the basis for recognition between TIS11D and ARE RNA.69, 73 Mutation of cysteines and histidines (C124H, C124R, H128K, C147R, C162H, and H166L) in the tandem zinc finger domain of TTP resulted in a loss of protein function.68, 74 The conserved aromatic residues in the MBNL1N protein occupy the same positions as those in the TIS11D protein-ARE complexes, so these residues may be important in the MBNL1N interaction with RNA. We chose six amino acids, Phe22, Phe36, Phe54, Ser56, Tyr68, and Tyr188, which are conserved in different muscleblind and zinc finger proteins, and mutated them to alanine by site-directed mutagenesis (Figure 3-12).

MAVSVTPIRDTKWLTLEVCREFQRGTCSRPDTECKFAHPSKSCQVENGRVIACFDSLKGRCSRE
NCKYLHPPPHLKTQLEINGRNNLIQQKNMAMLAQQMQLANAMMPGAPLQPVPMFSVAPSLATNA
SAAAFNPYLGPVSPSLVPAEILPTAPMLVTGNPGVPVPAAAAAAAQKLMRTDRLEVCREYQRGN
CNRGENDCRFAHPADSTMIDTNDNTVTVCMDYIKGRCSREKCKYFHPPAHLQAKIKAAQYQLEH
HHHHHERPHRD

(Annotated regions, in order along the sequence: LEV box, ZNF1, NGR box, ZNF2, NGR box, ZNF3, ZNF4.)

Figure 3-12. Amino acid sequence of MBNL1N (truncated form). The sequences highlighted in red represent zinc finger domains 1 to 4 and those highlighted in blue represent the LEV and NGR boxes. The amino acids selected for mutation are highlighted and underlined in green.


The binding affinities of the mutated proteins for (CUG)12 and S10 RNAs were studied by electrophoretic mobility shift assay. Images of the gels evaluating the binding of the proteins to RNA are shown in Figure 3-11 and the results are summarized in Table 3-4. In general, we did not observe significant differences between the dissociation constants (KD) of the mutated proteins and that of the wild type protein. The two exceptions are the Phe54Ala and Tyr68Ala proteins, which bind (CUG)12 RNA with 3-fold weaker affinity than the wild type protein.

Table 3-4. Summary of the binding data for the wild type and mutated proteins with (CUG)12 and S10 RNAs. The Phe54Ala and Tyr68Ala mutants (highlighted in blue) bind (CUG)12 RNA with 3-fold weaker affinity than the wild type protein.

Protein     (CUG)12 KD (nM)   S10 KD (nM)
Wild type   150 ± 20          1.3 ± 0.2
Phe22Ala    270 ± 55          1.1 ± 0.1
Phe36Ala    388 ± 18          1.4 ± 0.1
Phe54Ala    509 ± 50          2.0 ± 0.4
Ser56Ala    261 ± 43          1.3 ± 0.4
Tyr68Ala    537 ± 22          1.2
Tyr188Ala   161 ± 10          0.9
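The size of the mutational effects in Table 3-4 can be expressed as fold changes in KD and, under an additional assumption, as approximate binding free energy differences. The sketch below is illustrative only: the conversion ΔΔG = RT ln(KD,mutant / KD,wild type) assumes simple 1:1 equilibrium binding, the temperature of 298.15 K is an assumed room temperature, and the function names are hypothetical.

```python
import math

# (CUG)12 dissociation constants from Table 3-4 (nM); uncertainties omitted.
KD_CUG12_NM = {
    "Wild type": 150.0,
    "Phe22Ala": 270.0,
    "Phe36Ala": 388.0,
    "Phe54Ala": 509.0,
    "Ser56Ala": 261.0,
    "Tyr68Ala": 537.0,
    "Tyr188Ala": 161.0,
}

R_KCAL_PER_MOL_K = 1.987e-3   # gas constant
TEMPERATURE_K = 298.15        # assumed room temperature


def fold_change(kd_mutant, kd_wild_type):
    """Ratio of mutant to wild type KD; values above 1 mean weaker binding."""
    return kd_mutant / kd_wild_type


def ddg_kcal_per_mol(kd_mutant, kd_wild_type, temperature_k=TEMPERATURE_K):
    """Approximate change in binding free energy relative to wild type."""
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_mutant / kd_wild_type)


wild_type = KD_CUG12_NM["Wild type"]
for protein, kd in KD_CUG12_NM.items():
    print(f"{protein:10s} {fold_change(kd, wild_type):4.1f}-fold  "
          f"ddG = {ddg_kcal_per_mol(kd, wild_type):+.2f} kcal/mol")
```

Under these assumptions even the largest effects (Phe54Ala and Tyr68Ala) correspond to less than 1 kcal/mol, which is in line with the conclusion that no single selected residue is essential for binding.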

These results suggest that the selected residues are not individually essential for recognition of (CUG)12 and S10 RNAs. The protein may have more than one binding site for RNA targets, or many amino acids may be involved in interactions with RNA. Under these conditions the loss of binding affinity caused by a single point mutation would be compensated by other interactions. Another reason could be that the selected mutations do not destabilize all the zinc finger domains present in the protein, so that the remaining properly folded zinc fingers can still bind to RNA. A further possibility is that the protein is too long (around 272 amino acids) for one mutation to be sufficient to destabilize the whole complex. Therefore, double or triple point mutations, or mutations in the truncated protein (ZnF3 + ZnF4), are necessary to identify the essential amino acids involved in recognition of RNA targets. In the present study, the protein was mutated in only one zinc finger domain at a time and the other domains were not affected. In double or triple point mutation studies, two or three zinc finger domains can be affected at the same time, which may destabilize the complex. Such studies may reveal the hot spots of the protein and be useful in the rational design of small molecules to inhibit the formation of MBNL1N-RNA complexes.

3.12.3 Investigation of essential zinc finger domains for recognizing RNA

The structurally distinctive feature of muscleblind-like proteins is the presence of tandem zinc finger (TZF) domains, each composed of three cysteines and one histidine residue. These CCCH domains coordinate zinc atoms and bind to single-stranded RNA in a sequence-specific manner.75 TIS11D is one of three closely related human proteins that belong to the CCCH zinc finger protein family,76 and the structure of its tandem CCCH-type zinc finger in complex with the AU-rich element 5’-UUAUUUAUU-3’ has been determined.69 The two CCCH zinc fingers fold independently, and the linker is flexible in the free state, but upon binding the two domains arrange almost parallel to one another and form several stacking interactions with the single-stranded RNA molecule. Unlike TIS11D, MBNL1N recognizes double-stranded poly(CUG)RNA, and no studies have been completed investigating its binding to double-stranded RNA. Therefore, it is of interest to evaluate the contributions of the zinc finger domains to RNA recognition. The aim is to uncover the minimal set of zinc finger domains required for binding to poly(CUG)RNA. In this study, we performed a deletion analysis of MBNL1N to identify which fragment or fragments contribute to binding to (CUG)12 and S10 RNA. To address this question, we made two truncated proteins, one containing zinc finger 1 and the other containing zinc fingers 3 and 4 (Figure 3-13), and evaluated the binding of these deletion mutants (ZnF1 and ZnF3 + ZnF4) to the RNA targets. The truncated peptides did not bind to (CUG)12 or S10 RNA (Figure 3-14). These results suggest that the protein needs all four zinc finger motifs to recognize RNA. It is also possible that the isolated zinc finger motifs do not fold properly and therefore cannot recognize (CUG)12 and S10 RNAs.

Figure 3-13. Truncated proteins prepared for experimental studies.

Figure 3-14. Representative electrophoretic mobility shift assays of the deletion mutant proteins (ZnF1 and ZnF3 + ZnF4) binding to (CUG)12 and S10 RNAs. The protein samples were serially diluted 2-fold and incubated with 32P-labeled RNAs at room temperature for 20 min, and electrophoresis was performed using a 6% native polyacrylamide gel at 4 °C in 0.5× Tris-Borate buffer (pH 8). The first lane from the left is free RNA, and no complex bands are observed. The concentrations shown at the bottom of each gel picture represent the highest concentrations of protein used for the experiment.

3.13 Conclusion

The recognition by MBNL1N of its natural and pathogenic RNA targets may not be specific, because the binding affinity of the protein decreases with increasing yeast tRNA concentration. This suggests that the protein could bind to RNA targets other than the natural targets already identified. The lack of specificity could also be a reason for the colocalization of the MBNL1N protein with poly(CUG)RNA repeats. The binding of MBNL1N to (CUG)4 RNA is 5-fold better than to (CUG)12; this may be due to the unstable secondary structure of (CUG)4 RNA, since MBNL1N binds single-stranded RNA better than a duplex. The affinity of MBNL1N for S10 RNA is 100-fold better than for (CUG)12 RNA, which indicates that it prefers its natural target over pathogenic RNAs. Substitution of the selected amino acids Phe22, Phe36, Phe54, Ser56, Tyr68, and Tyr188 with alanine did not result in a large destabilization of the complexes formed with either poly(CUG)RNA or S10 RNA. One possible explanation is that the MBNL1N protein has more than one site for binding to RNA targets, or that the chosen amino acids are not involved in any interaction with RNA. Double or triple point mutations, or mutations in the truncated protein (ZnF3 + ZnF4), may reveal the essential amino acids involved in recognition of RNA targets. All the zinc finger motifs appear to be required for binding, because the truncated peptides (ZnF1 and ZnF3 + ZnF4) do not bind to (CUG)12 or S10 RNA. Further studies may reveal the hot spots of the protein and be useful in the rational design of small molecules to inhibit the formation of MBNL1N-RNA complexes.


3.14 Materials and methods

DNA oligomers were obtained from Integrated DNA Technologies and purified by ethanol precipitation. Purified RNA sequences were obtained from Dharmacon Research. Yeast tRNA was purchased from Sigma-Aldrich. Acrylamide and bisacrylamide solutions used in native and denaturing gels were purchased from National Diagnostics, Inc. γ-32P ATP was purchased from Amersham Biosciences Corp. An expression vector for MBNL1N (1-272) and vectors containing (CTG)54 and (CTG)90 sequences were obtained from Maurice S. Swanson (University of Florida College of Medicine, Gainesville, FL).77-78 Water was purified and de-ionized using a Bio cell A10 Milli-Q system by Millipore. Samples were centrifuged with an Eppendorf 5415 C and lyophilized using a SpeedVac concentrator and a Lyph-Lock 4.5 lyophilizer by Labconco. The gel mobility shift assays were kept at constant temperature using a VWR 160 constant temperature bath, and gels were dried using a Fisher Biotech gel drying system and exposed to a phosphorimager screen from Molecular Dynamics. The screens were scanned using a Storm 840 by Molecular Dynamics and visualized with ImageQuant 5.1 software.

3.14.1 Recombinant protein expression and purification

The proteins were overexpressed in E. coli BL21-CodonPlus (DE3)-RP competent cells (Stratagene) containing pGEX-6P-1-MBNL1N, MBNL1N mutants, or MBNL1N truncated forms (ZnF1 and ZnF3 + ZnF4). Cultures were grown until the OD600 reached approximately 0.5, and expression was then induced with 1 mM IPTG (isopropyl β-D-1-thiogalactopyranoside) for 2 hours at 37 °C. Bacterial cells were collected by centrifugation and resuspended in lysis buffer containing 25 mM Tris-Cl (pH 8.0), 0.5 M NaCl, 10 mM imidazole, 2 mg/ml lysozyme, 5% glycerol, 2 mM 2-mercaptoethanol, 0.1% Triton X-100, 1 µM pepstatin, 0.1 mM PMSF (phenylmethanesulphonylfluoride), and 1 µM leupeptin. The solution was sonicated six times with pulses of 15-20 s each and a resting time of 10 s between pulses. The cell suspension was centrifuged at 10,000 rpm for 10 minutes, and the supernatant was collected and filtered through a 0.45 µm Millex filter (Millipore). The cell lysate was incubated with Ni-NTA agarose beads (QIAGEN) for 1 h at 4 °C, and the beads were then washed (3 times, 15 ml each) with a washing buffer containing 25 mM Tris-Cl (pH 8), 20 mM imidazole, 0.5 M NaCl, and 0.1% Triton X-100, followed by elution with an elution buffer of 25 mM Tris-Cl (pH 8), 0.5 M NaCl, 250 mM imidazole and 0.1% Triton X-100. Subsequently, 10 mM 2-mercaptoethanol was added to the eluate containing the GST fusion protein, which was incubated with Glutathione Sepharose 4B (GE Healthcare) for 1 hour at 4 °C. The beads were washed 3 times (15 ml each) with a washing buffer containing 25 mM Tris-Cl (pH 8), 300 mM NaCl, 5 mM 2-mercaptoethanol and 0.1% Triton X-100. The beads were collected and incubated with PreScission Protease (GE Healthcare) overnight at 4 °C. The enzyme cleaves the GST-tag, and the proteins were collected in the column flow-through and concentrated with a Microcon Centrifugal Filter 3000 MWCO (Millipore). The purity of the proteins was analyzed by SDS-PAGE (silver and Coomassie stained) (Figures 3-15 to 3-17) and MALDI mass spectrometry (Figures 3-18 to 3-26). The concentrations of the protein samples were determined by amino acid analysis or Bradford assay. The samples were sent to the Keck Biotech facility at Yale University, New Haven, CT for amino acid analysis.

3.14.2 SDS gel electrophoresis

The MBNL1N mutant protein samples and the truncated proteins were analyzed on 8% and 17% acrylamide gels, respectively. The wild type protein was stained with silver stain and the other protein samples were stained with Coomassie stain following standard protocols. The upper bands observed in the SDS gel of the mutant proteins may be due to a tetramer complex of MBNL1N (Figure 3-16).

Figure 3-15. Silver stained SDS gel picture for analyzing the purity of wild type MBNL1N protein

Figure 3-16. Coomassie stained SDS gel picture for analyzing the purity of mutant MBNL1N proteins

Figure 3-17. Coomassie stained SDS gel picture for analyzing the purity of truncated MBNL1N protein

Figure 3-18. MALDI mass spectrum of MBNL1N protein. Calculated MW: 30133. Observed MW: 30169.

Figure 3-19. MALDI mass spectrum of Phe22Ala mutant protein. Calculated MW: 30040. Observed MW: 30021.

Figure 3-20. MALDI mass spectrum of Phe36Ala mutant protein. Calculated MW: 30040. Observed MW: 30005.

Figure 3-21. MALDI mass spectrum of Phe54Ala mutant protein. Calculated MW: 30040. Observed MW: 30063.

Figure 3-22. MALDI mass spectrum of Ser56Ala mutant protein. Calculated MW: 30010. Observed MW: 30083.

Figure 3-23. MALDI mass spectrum of Tyr68Ala mutant protein. Calculated MW: 30030. Observed MW: 30022.

Figure 3-24. MALDI mass spectrum of Tyr188Ala mutant protein. Calculated MW: 30030. Observed MW: 30011.

Figure 3-25. MALDI mass spectrum of ZnF1 truncated protein. Calculated MW: 8079. Observed MW: 8098.

Figure 3-26. MALDI mass spectrum of ZnF3+ZnF4 truncated protein. Calculated MW: 11513. Observed MW: 11518.
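As a quick consistency check on the MALDI data in Figures 3-18 to 3-26, the relative deviation between calculated and observed masses can be computed directly. The sketch below is illustrative only: the helper name percent_deviation is hypothetical, and because the text states no acceptance threshold for MALDI accuracy, none is applied.

```python
# Illustrative sketch: signed percent deviation of observed vs. calculated MALDI masses.
# Mass values are copied from the captions of Figures 3-18, 3-25 and 3-26.
# The helper name is hypothetical and not part of the original work.

def percent_deviation(calculated: float, observed: float) -> float:
    """Deviation of the observed mass, as a percentage of the calculated mass."""
    return 100.0 * (observed - calculated) / calculated

masses = {
    "MBNL1N": (30133, 30169),
    "ZnF1": (8079, 8098),
    "ZnF3+ZnF4": (11513, 11518),
}

for name, (calc, obs) in masses.items():
    print(f"{name}: {percent_deviation(calc, obs):+.3f} %")
```

The same function can be applied to the mutant proteins in Figures 3-19 to 3-24.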

3.14.3 Radiolabeling of the 5’ end of RNA
To a 1.5 ml screw-cap tube, 10 µl of RNA solution (5 pmol), 5 µl [γ-32P]ATP, 2 µl 10× kinase buffer (70 mM Tris-HCl pH 7.6, 10 mM MgCl2, 5 mM dithiothreitol; New England Biolabs, Inc.), 3 µl water (to bring the total volume to 20 µl), and 1 µl T4 polynucleotide kinase (10,000 units/ml; New England Biolabs, Inc.) were added and mixed thoroughly. The reaction was incubated at 37 °C for 1.5 hrs and then diluted with 29 µl water to bring the total volume to 50 µl. The reaction mixture was extracted first with 50 µl phenol/chloroform/isoamyl alcohol (25:24:1) and then with 50 µl chloroform/isoamyl alcohol (24:1). For each extraction, the sample was vortexed for 1 min and centrifuged at 14,000 rpm for 1 min, and the lower organic layer was removed, leaving the upper aqueous layer containing the RNA in the tube. The final solution was ethanol precipitated twice, and the RNA pellet was dried in a SpeedVac for 20 min and dissolved in 50 µl TE buffer to obtain a 100 nM labeled RNA solution. This solution was diluted a further 20-fold (2.5 µl of the 100 nM labeled RNA solution diluted to 50 µl with TE buffer) to obtain a 5 nM labeled RNA solution.
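The concentrations quoted in the labeling procedure follow from simple amount/volume arithmetic. The sketch below reproduces that arithmetic under the assumption (implied but not measured in the text) that all 5 pmol of RNA is recovered after extraction and precipitation; the function names are hypothetical.

```python
# Illustrative sketch of the concentration arithmetic in Section 3.14.3.
# Assumes quantitative recovery of the 5 pmol of RNA (an assumption, not a measured value).

def concentration_nM(amount_pmol: float, volume_ul: float) -> float:
    """pmol per µl is numerically µM, so multiply by 1000 to express it in nM."""
    return amount_pmol / volume_ul * 1000.0

def dilute(conc: float, aliquot_ul: float, final_ul: float) -> float:
    """Concentration after diluting an aliquot to a final volume (C1*V1 = C2*V2)."""
    return conc * aliquot_ul / final_ul

stock_nM = concentration_nM(5.0, 50.0)   # 5 pmol dissolved in 50 µl TE buffer
working_nM = dilute(stock_nM, 2.5, 50.0) # 2.5 µl diluted to 50 µl (20-fold)

print(f"stock: {stock_nM:.1f} nM, working: {working_nM:.1f} nM")
```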

3.14.4 Equilibrium binding assays
RNA was labeled with [γ-32P]ATP using T4 polynucleotide kinase (New England Biolabs). Labeled RNA was heated at 95 °C for 2 min and then placed on ice for 10 min in RNA storage buffer (66 mM NaCl, 6.7 mM MgCl2, and 27 mM Tris (pH 7.5)). The protein (MBNL1N) was serially diluted in binding buffer (175 mM NaCl, 5 mM MgCl2, 20 mM Tris (pH 7.5), 1.25 mM BME, 12.5% glycerol, 2 mg/mL BSA, and 0.1 mg/mL heparin), and 5 µl of protein solution was added to 5 µl of aliquoted RNA solution. The reaction mixture was incubated at room temperature for 25 min and loaded onto a 6% polyacrylamide gel (80:1) prechilled at 4 °C. The gels were run for 1 hr at 360 V in 0.5× Tris-borate buffer (pH 8), dried, and autoradiographed. Binding of wild-type MBNL1N and all of the mutants to the different RNA targets was determined in the absence of tRNA. MBNL1N binding to (CUG)4 and (CUG)12 was also determined in the presence of yeast tRNA (samples were prepared assuming a tRNA molecular weight of 25,000 Da). The gels were exposed to a phosphorimager screen overnight, individual bands were quantified using ImageQuant software (Molecular Dynamics), and the data were fit using KaleidaGraph 3.5 (Synergy Software). The apparent KD values were obtained by fitting the fraction of RNA bound versus protein concentration to the equation Y = 1/(1 + (m1/m0)), where Y = fraction bound, m1 = KD, and m0 = protein concentration.
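To illustrate how an apparent KD is extracted from the isotherm Y = 1/(1 + KD/[protein]), the following sketch performs a one-parameter least-squares fit. It is not the KaleidaGraph procedure used in this work: the golden-section search, the function names, and the synthetic, noise-free data (generated with an arbitrary KD of 26 nM) are all assumptions made for illustration.

```python
# Illustrative sketch: least-squares fit of Y = 1 / (1 + KD/[P]).
# Stand-in for the KaleidaGraph fit; the data are synthetic, not thesis measurements.
import math

def fraction_bound(protein_nM: float, kd_nM: float) -> float:
    """Single-site isotherm: Y = 1 / (1 + KD/[P]) (m1 = KD, m0 = [P])."""
    return 1.0 / (1.0 + kd_nM / protein_nM)

def sum_sq(log_kd: float, protein, bound) -> float:
    """Sum of squared residuals for a trial value of log10(KD)."""
    kd = 10.0 ** log_kd
    return sum((fraction_bound(p, kd) - y) ** 2 for p, y in zip(protein, bound))

def fit_kd(protein, bound, lo=-3.0, hi=6.0, tol=1e-10) -> float:
    """Minimise the squared error over log10(KD) by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sum_sq(c, protein, bound) < sum_sq(d, protein, bound):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)

protein = [1, 3, 10, 30, 100, 300, 1000]            # nM, hypothetical dilution series
bound = [fraction_bound(p, 26.0) for p in protein]  # synthetic fraction-bound values
print(f"fitted KD = {fit_kd(protein, bound):.2f} nM")
```

Searching in log10(KD) keeps the fit well behaved across several orders of magnitude of protein concentration, which matches the serial dilution format of the assay.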

3.14.5 Site-directed mutagenesis (alanine scanning)
The six selected amino acids were mutated to alanine by site-directed mutagenesis using the following forward and reverse primers for the pGEX-6P-1/MBNL1N vector.
Phe22Ala: Forward primer 5’-GTATGTAGAGAGGCCCAGAGGG-3’; Reverse primer 5’-GTCCCCCTCTGGGCCTCTCTAC-3’
Phe36Ala: Forward primer 5’-GGAATGTAAAGCTGCACATCCTTCG-3’; Reverse primer 5’-CGAAGGATGTGCAGCTTTACATTCC-3’
Phe54Ala: Forward primer 5’-ATCGCCTGCGCTGATTCATT-3’; Reverse primer 5’-AATGAATCAGCGCAGGCGAT-3’
Ser56Ala: Forward primer 5’-CCTGCTTTGATGCATTGAAAGGC-3’; Reverse primer 5’-GCCTTTCAATGCATCAAAGCAGG-3’
Tyr68Ala: Forward primer 5’-GAACTGCAAAGCTCTTCATCCA-3’; Reverse primer 5’-TGGATGAAGAGCTTTGCAGTTC-3’
Tyr188Ala: Forward primer 5’-TATGTCGAGAGGCCCAACGT-3’; Reverse primer 5’-ACGTTGGGCCTCTCGACATA-3’
The PCR product was analyzed on an agarose gel stained with ethidium bromide, and the bands were identified using an ultraviolet transilluminator. The PCR sample was mixed with the enzyme DpnI (Invitrogen), incubated for 1 hr to digest the original template, and transformed into XL1-Blue competent cells (Stratagene). The transformed cells were plated on agar plates containing ampicillin. The cells were grown overnight, and the DNA was isolated from the cell pellet using the QIAquick Spin Miniprep Kit (QIAGEN). The mutations were confirmed by DNA sequencing.
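Mutagenesis primer pairs of this kind are usually complementary to one another, which can be checked mechanically. The sketch below tests whether each reverse primer listed above is the exact reverse complement of its forward primer; the helper names are hypothetical, and a pair that fails the check may simply be offset relative to its partner rather than incorrect.

```python
# Illustrative sketch: test whether primer pairs are exact reverse complements.
# Sequences are copied from Section 3.14.5; helper names are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_exact_pair(forward: str, reverse: str) -> bool:
    """True when the reverse primer is exactly the reverse complement of the forward primer."""
    return reverse_complement(forward) == reverse.upper()

primers = {
    "Phe22Ala": ("GTATGTAGAGAGGCCCAGAGGG", "GTCCCCCTCTGGGCCTCTCTAC"),
    "Phe36Ala": ("GGAATGTAAAGCTGCACATCCTTCG", "CGAAGGATGTGCAGCTTTACATTCC"),
    "Phe54Ala": ("ATCGCCTGCGCTGATTCATT", "AATGAATCAGCGCAGGCGAT"),
    "Ser56Ala": ("CCTGCTTTGATGCATTGAAAGGC", "GCCTTTCAATGCATCAAAGCAGG"),
    "Tyr68Ala": ("GAACTGCAAAGCTCTTCATCCA", "TGGATGAAGAGCTTTGCAGTTC"),
    "Tyr188Ala": ("TATGTCGAGAGGCCCAACGT", "ACGTTGGGCCTCTCGACATA"),
}

for name, (fwd, rev) in primers.items():
    status = "exact reverse complement" if is_exact_pair(fwd, rev) else "offset or partial overlap"
    print(f"{name}: {status}")
```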

3.14.6 Molecular cloning
The pGEX-6P-1/MBNL1N vector was digested with BamHI and XhoI (Invitrogen), the 5’ ends were dephosphorylated with calf intestinal alkaline phosphatase (Invitrogen), and the vector was purified from an agarose gel using the QIAquick Gel Extraction Kit (QIAGEN). The ZnF1 (Met1 to Ile51) truncated protein insert was constructed by amplifying the zinc finger 1 region of pGEX-6P-1/MBNL1N using the forward primer 5’-CGGGATCCATGGCTGTTAGTGTCA-3’ and the reverse primer 5’-CCGCTCGAGGATTACTCGTCCATT-3’. The PCR product was digested with BamHI and XhoI (Invitrogen), purified from an agarose gel, and ligated into the pGEX-6P-1 backbone vector, which contains an N-terminal GST tag and a C-terminal His6 tag, by incubation with T4 DNA ligase (Invitrogen). The ZnF3+ZnF4 (Leu176 to Glu256) truncated protein insert was constructed by amplifying the zinc finger 3 and 4 region of pGEX-6P-1/MBNL1N using the forward primer 5’-CGGGATCCATGTTAATGCGAACAGAC-3’ and the reverse primer 5’-CCGCTCGAGCTGGTATTGGG-3’, and was subcloned into pGEX-6P-1 following the same procedure as for ZnF1. The inserted sequences were confirmed by DNA sequencing, and the expressed truncated proteins were analyzed by SDS gel electrophoresis and MALDI mass spectrometry (Figures 3-17, 3-25 and 3-26).
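Because the inserts are cloned through BamHI and XhoI sites, each primer should carry the corresponding recognition sequence (GGATCC for BamHI, CTCGAG for XhoI) near its 5’ end. The sketch below is a minimal way to confirm this for the four primers listed above; the function name and dictionary layout are hypothetical.

```python
# Illustrative sketch: locate BamHI and XhoI recognition sites in the cloning primers.
# Primer sequences are copied from Section 3.14.6; names are hypothetical.

SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}

primers = {
    "ZnF1 forward": "CGGGATCCATGGCTGTTAGTGTCA",
    "ZnF1 reverse": "CCGCTCGAGGATTACTCGTCCATT",
    "ZnF3+ZnF4 forward": "CGGGATCCATGTTAATGCGAACAGAC",
    "ZnF3+ZnF4 reverse": "CCGCTCGAGCTGGTATTGGG",
}

def find_sites(seq: str) -> dict:
    """Map each enzyme whose site occurs in seq to the 0-based index of its first occurrence."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}

for name, seq in primers.items():
    print(name, find_sites(seq))
```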

3.14.7 Transcription of p(CTG)54 and p(CTG)90 plasmids
The plasmids p(CTG)54 and p(CTG)90, which contain the (CTG)54 and (CTG)90 sequences in the pSP72 vector, were transformed into XL1-Blue competent cells and grown overnight. The cell suspension was centrifuged, and DNA was isolated from the cell pellet using the QIAquick Spin Miniprep Kit (QIAGEN). The plasmids were digested with BamHI (Invitrogen), and the linear plasmids were purified on a 1% agarose gel using the QIAquick Gel Extraction Kit (QIAGEN) (Figure 3-27). The linear plasmids were transcribed using the Ambion MEGAscript T7 kit (incubated with T7 polymerase for more than 5 hrs), and the transcript was purified by phenol:chloroform extraction, isopropanol precipitation, and electrophoresis on a 6% denaturing acrylamide gel. Typically, a 20 µL transcription reaction was mixed thoroughly with 115 µL nuclease-free water and 15 µL ammonium acetate. The solution was extracted with an equal volume of water-saturated phenol/chloroform and then with an equal volume of chloroform. The aqueous phase was recovered and transferred to a new tube, and the RNA was precipitated by adding 1 volume of isopropanol and mixing well. Finally, the precipitated RNA was purified on a 6% denaturing gel to remove short RNA transcripts. The identity of the transcript was further confirmed by MALDI mass spectrometry (Figure 3-28).

Figure 3-27. DNA agarose gels for the purified circular and linear (CTG)54 and (CTG)90 plasmids.

Figure 3-28. MALDI mass spectrum of (CUG)90 RNA. Calculated MW: 95051. Observed MW: 93566.

3.15 References
1. Cusack, S. RNA-protein complexes. Curr. Opin. Chem. Biol. 1999, 9, 66-73.
2. Dejong, E. S.; Luy, B.; Marino, J. P. RNA and RNA-protein complexes as targets for therapeutic intervention. Curr. Top. Med. Chem. 2002, 2, 289-302.
3. Howard, A. D.; McAllister, G.; Feighner, S. D.; Liu, Q. Y.; Nargund, R. P.; Van der Ploeg, L. H. T.; Patchett, A. A. Orphan G-protein-coupled receptors and natural ligand discovery. Trends in Pharm. Sci. 2001, 22, 132-140.
4. Sautel, M.; Milligan, G. Molecular manipulation of G-protein-coupled receptors: A new avenue into drug discovery. Curr. Med. Chem. 2000, 7, 889-896.
5. Zapp, M. L.; Stern, S.; Green, M. R. Small molecules that selectively block RNA binding of HIV-1 REV protein inhibit Rev function and viral production. Cell 1993, 74, 969-978.
6. Cho, J.; Rando, R. R. Specificity in the binding of aminoglycosides to HIV-RRE RNA. Biochemistry 1999, 38, 8548-8554.
7. Hermann, T. Strategies for the design of drugs targeting RNA and RNA-protein complexes. Angew. Chem. Int. Ed. 2000, 39, 1890-1905.
8. Patel, D. J. Adaptive recognition in RNA complexes with peptides and protein modules. Curr. Opin. Struct. Biol. 1999, 9, 74-87.
9. Hermann, T.; Westhof, E. Non-Watson-Crick base pairs in RNA-protein recognition. Chem. Biol. 1999, 6, R335-343.
10. Draper, D. E.; Reynaldo, L. P. RNA binding strategies of ribosomal proteins. Nucleic Acids Res. 1999, 27, 381-388.
11. Nagai, K. RNA-protein complexes. Curr. Opin. Struct. Biol. 1996, 6, 53-61.

12. Hermann, T.; Patel, D. J. Adaptive recognition by nucleic acid aptamers. Science 2000, 287, 820-825.
13. Frankel, A. D.; Smith, C. A. Induced folding in RNA-protein recognition: more than simple molecular handshake. Cell 1998, 92, 149-151.
14. Nadassy, K.; Wodak, S. J.; Janin, J. Structural features of protein-nucleic acid recognition sites. Biochemistry 1999, 38, 1999-2017.
15. Hermann, T.; Auffinger, P.; Westhof, E. Molecular dynamics investigations of hammerhead ribozyme. Eur. Biophys. J. 1998, 27, 153-165.
16. Gesteland, R. F.; Cech, T. R.; Atkins, J. F., eds. 1999, The RNA world, 2nd ed. Cold Spring Harbor, New York, Cold Spring Harbor Laboratory Press.
17. Musunuru, K. Cell specific RNA-binding proteins in human diseases. Trends Cardiovasc. Med. 2003, 13, 188-195.
18. Willingham, A. T.; Gingeras, T. R. TUF love for ‘junk’ DNA. Cell 2006, 125, 1215-1220.
19. Osborne, R. J.; Thornton, C. A. RNA-dominant diseases. Hum. Mol. Gen. 2006, 15, R162-R169.
20. Fu, Y. H.; Kuhl, D. P. A.; Pizzuti, A.; Pieretti, M.; Sutcliffe, J. S.; Richards, S.; Verkerk, A. J. M. H.; Holden, J. J. A.; Fenwick, R. G.; Warren, S. T.; Oostra, B. A.; Nelson, D. L.; Caskey, C. T. Variation of the CGG repeat at the fragile X site results in genetic instability: resolution of the Sherman paradox. Cell 1991, 67, 1047-1058.
21. Kremer, E. J.; Pritchard, M.; Lynch, M.; Yu, S.; Holman, K.; Baker, E.; Warren, S. T.; Schlessinger, D.; Sutherland, G. R.; Richards, R. I. Mapping of DNA instability at the fragile X to a trinucleotide repeat sequence p(CCG)n. Science 1991, 252, 1711-1714.

22. La Spada, A. R.; Wilson, E. M.; Lubahn, D. B.; Harding, A. E.; Fischbeck, K. H. Androgen receptor gene mutations in X-linked spinal and bulbar muscular atrophy. Nature 1991, 352, 77-79.
23. Brook, J. D.; McCurrach, M. E.; Harley, H. G.; Buckler, A. J.; Church, D.; Aburatani, H.; Hunter, K.; Stanton, V. P.; Thirion, J. P.; Hudson, T.; Sohn, R.; Zemelman, B.; Snell, R. G.; Rundle, S. A.; Crow, S.; Davies, J.; Shelbourne, P.; Buxton, J.; Jones, C.; Juvonen, V.; Johnson, K.; Harper, P. S.; Shaw, D. J.; Housman, D. E. Molecular basis of myotonic dystrophy: expansion of a trinucleotide (CTG) repeat at the 3' end of a transcript encoding a protein kinase family member. Cell 1992, 68, 799-808.
24. Fu, Y.-H.; Pizzuti, A.; Fenwick, R. G., Jr.; King, J.; Rajnarayan, S.; Dunne, P. W.; Dubel, J.; Nasser, G. A.; Ashizawa, T.; De Jong, P.; Wieringa, B.; Korneluk, R.; Perryman, M. B.; Epstein, H. P.; Caskey, C. T. An unstable triplet repeat in a gene related to myotonic muscular dystrophy. Science 1992, 255, 1256-1258.
25. Buxton, J.; Shelbourne, P.; Davies, J.; Jones, C.; Van Tongeren, T.; Aslanidis, C.; de Jong, P.; Jansen, G.; Anvret, M.; Riley, B.; Williamson, R.; Johnson, K. Detection of an unstable fragment of DNA specific to individuals with myotonic dystrophy. Nature 1992, 355, 547-548.
26. Harley, H. G.; Brook, J. D.; Rundle, S. A.; Crow, S.; Reardon, W.; Buckler, A. J.; Harper, P. S.; Housman, D. E.; Shaw, D. Expansion of an unstable DNA region and phenotypic variation in myotonic dystrophy. Nature 1992, 355, 545-546.
27. Mahadevan, M.; Tsilfidis, C.; Sabourin, L.; Shutler, G.; Amemiya, C.; Jansen, G.; Neville, C.; Narang, M.; Barcelo, J.; O’Hoy, K.; Leblond, S.; Earle-MacDonald, J.; De Jong, P. J.; Wieringa, B.; Korneluk, R. G. Myotonic dystrophy mutation: an unstable CTG repeat in the 3’ untranslated region of the gene. Science 1992, 255, 1253-1255.

28. Koob, M. D.; Moseley, M. L.; Schut, L. J.; Benzow, K. A.; Bird, T. D.; Day, J. W.; Ranum, L. P. An untranslated CTG expansion causes a novel form of spinocerebellar ataxia (SCA8). Nat. Genet. 1999, 21, 379-384.
29. Timchenko, L. T.; Caskey, C. T. Triplet repeat disorders: discussion of molecular mechanisms. Cell. Mol. Life Sci. 1999, 55, 1432-1447.
30. Steinert, H. Dtsch. Z. Nervenheilkd. 1909, 58.
31. Harper, P. S. Myotonic dystrophy. 2001, 37, W. B. Saunders, London.
32. Ricker, K.; Koch, M. C.; Lehmann-Horn, F.; Pongratz, D.; Otto, M.; Heine, R.; Moxley, R. T. III. Proximal myotonic myopathy: a new dominant disorder with myotonia, muscle weakness, and cataracts. Neurology 1994, 44, 1448-1452.
33. Rowland, L. P. Thornton-Griggs-Moxley disease: myotonic dystrophy type 2. Ann. Neurol. 1994, 36, 803-804.
34. Thornton, C. A.; Griggs, R. C.; Moxley, R. T. Myotonic dystrophy with no trinucleotide repeat expansion. Ann. Neurol. 1994, 35, 269-272.
35. Harper, P. S.; Monckton, D. G. Myotonic dystrophy, in Engel AG, Franzini-Armstrong C. (eds): Myology (ed 3). New York, NY, McGraw Hill Professional, 2004, 1039-1076.
36. Day, J. W.; Ranum, L. P. RNA pathogenesis of the myotonic dystrophies. Neuromuscul. Disord. 2005, 15, 5-16.
37. Machuca-Tzili, L.; Brook, D.; Hilton-Jones, D. Clinical and molecular aspects of the myotonic dystrophies: a review. Muscle Nerve 2005, 32, 1-18.
38. Day, J. W.; Ricker, K.; Jacobsen, J. F.; Rasmussen, L. J.; Dick, K. A.; Kress, W.; Schneider, C.; Koch, M. C.; Beilman, G. J.; Harrison, A. R.; Dalton, J. C.; Ranum, L. P. Myotonic dystrophy type 2: molecular, diagnostic, and clinical spectrum. Neurology 2003, 60, 657-664.

39. Davis, B. M.; McCurrach, M. E.; Taneja, K. L.; Singer, R. H.; Housman, D. E. Expansion of a CUG trinucleotide repeat in the 3’ untranslated region of myotonic dystrophy protein kinase transcripts results in nuclear retention of transcripts. Proc. Natl. Acad. Sci. 1997, 94, 7388-7393.
40. Fardaei, M.; Larkin, K.; Brook, J. D.; Hamshere, M. G. In vivo co-localization of MBNL protein with DMPK expanded-repeat transcripts. Nucleic Acids Res. 2001, 29, 2766-2771.
41. Fardaei, M.; Rogers, M. T.; Thorpe, H. M.; Larkin, K.; Hamshere, M. G.; Harper, P. S.; Brook, J. D. Three proteins, MBNL, MBLL, MBXL, co-localize in vivo with nuclear foci of expanded-repeat transcripts in DM1 and DM2 cells. Hum. Mol. Genet. 2002, 11, 805-814.
42. Taneja, K. L.; McCurrach, M.; Schalling, M.; Housman, D.; Singer, R. H. Foci of trinucleotide repeat transcripts in nuclei of myotonic dystrophy cells and tissues. J. Cell. Biol. 1995, 128, 995-1002.
43. Liquori, C. L.; Ricker, K.; Moseley, M. L.; Jacobsen, J. F.; Kress, W.; Naylor, S. L.; Day, J. W.; Ranum, L. P. Myotonic dystrophy type 2 caused by a CCTG expansion in intron 1 of ZNF9. Science 2001, 293, 864-867.
44. Ranum, L. P.; Day, J. W. Myotonic dystrophy: clinical and molecular parallels between myotonic dystrophy type 1 and type 2. Curr. Neuro. Neurosci. Rep. 2002, 2, 465-470.
45. Kuyumcu-Martinez, N. M.; Cooper, T. A. Misregulation of alternative splicing causes pathogenesis in myotonic dystrophy. Prog. in Mol. and Subcell. Biol. Philippe Jeanteur (Ed.), Alternative splicing and disease, Springer-Verlag Berlin Heidelberg, 2006.
46. Black, D. L. Mechanisms of alternative pre-messenger RNA splicing. Annu. Rev. Biochem. 2003, 27, 27-48.
47. Lopez, A. J. Alternative splicing of pre-mRNA: developmental consequences and mechanisms of regulation. Annu. Rev. Genet. 1998, 32, 279-305.

48. Faustino, N. A.; Cooper, T. A. Pre-mRNA splicing and human disease. Genes Dev. 2003, 17, 419-437.
49. Jiang, H.; Mankodi, A.; Swanson, M. S.; Moxley, R. T.; Thornton, C. A. Myotonic dystrophy is associated with nuclear foci of mutant RNA, sequestration of muscleblind proteins and deregulated alternative splicing in neurons. Hum. Mol. Genet. 2004, 13, 3079-3088.
50. Mankodi, A.; Takahashi, M. P.; Jiang, H.; Beck, C. L.; Bowers, W. J.; Moxley, R. T.; Cannon, S. C.; Thornton, C. A. Expanded CUG repeats trigger aberrant splicing of ClC-1 chloride channel pre-mRNA and hyperexcitability of skeletal muscle in myotonic dystrophy. Mol. Cell. 2002, 10, 35-44.
51. Savkur, R. S.; Philips, A. V.; Cooper, T. A. Aberrant regulation of insulin receptor alternative splicing is associated with insulin resistance in myotonic dystrophy. Nat. Genet. 2001, 29, 40-47.
52. Charlet-B, N.; Savkur, R. S.; Singh, G.; Philips, A. V.; Grice, E. A.; Cooper, T. A. Loss of the muscle-specific chloride channel in type 1 myotonic dystrophy due to misregulated alternative splicing. Mol. Cell. 2002b, 10, 45-53.
53. Ranum, L. P. W.; Cooper, T. A. RNA-mediated neuromuscular disorders. Annu. Rev. Neurosci. 2006, 29, 259-277.
54. Ho, T. H.; Charlet-B, N.; Poulos, M. G.; Singh, G.; Swanson, M. S.; Cooper, T. A. Muscleblind proteins regulate alternative splicing. EMBO J. 2004, 23, 3103-3112.
55. Philips, A. V.; Timchenko, L. T.; Cooper, T. A. Disruption of splicing regulated by a CUG-binding protein in myotonic dystrophy. Science 1998, 280, 737-741.

56. Artero, R.; Prokop, A.; Paricio, N.; Begemann, G.; Pueyo, I.; Mlodzik, M.; Perez-Alonso, M.; Baylies, M. K. The muscleblind gene participates in the organization of Z-bands and epidermal attachments of Drosophila muscles and is regulated by Dmef2. Dev. Biol. 1998, 195, 131-143.
57. Begemann, G.; Paricio, N.; Artero, R.; Kiss, I.; Perez-Alonso, M.; Mlodzik, M. Muscleblind, a gene required for photoreceptor differentiation in Drosophila, encodes novel nuclear Cys3His-type zinc-finger-containing proteins. Development 1997, 124, 4321-4331.
58. Miller, J. W.; Urbinati, C. R.; Teng-Umnuay, P.; Stenberg, M. G.; Byrne, B. J.; Thornton, C. A.; Swanson, M. S. Recruitment of human muscleblind proteins to (CUG)n expansions associated with myotonic dystrophy. EMBO J. 2000, 19, 4439-4448.
59. Ho, T. H.; Savkur, R. S.; Poulos, M. G.; Mancini, M. A.; Swanson, M. S.; Cooper, T. A. Colocalization of muscleblind with RNA foci is separable from misregulation of alternative splicing in myotonic dystrophy. J. Cell Sci. 2005b, 118, 2923-2933.
60. Lin, X.; Miller, J. W.; Mankodi, A.; Kanadia, R. N.; Yuan, Y.; Moxley, R. T.; Swanson, M. S.; Thornton, C. A. Failure of MBNL1-dependent post-natal splicing transitions in myotonic dystrophy. Hum. Mol. Genet. 2006, 15, 2087-2097.
61. Kanadia, R. N.; Urbinati, C. R.; Crusselle, V. J.; Luo, D.; Lee, Y. J.; Harrison, J. K.; Oh, S. P.; Swanson, M. S. Developmental expression of mouse muscleblind genes Mbnl1, Mbnl2 and Mbnl3. Gene Expr. Patterns 2003b, 3, 459-462.
62. Dansithong, W.; Paul, S.; Comai, L.; Reddy, S. MBNL1 is the primary determinant of focus formation and aberrant insulin receptor splicing in DM1. J. Biol. Chem. 2005, 280, 5773-5780.
63. Paul, S.; Dansithong, W.; Kim, D.; Rosi, J.; Webster, N. J.; Comai, L.; Reddy, S. Interactions of muscleblind, CUG-BP1 and hnRNP H proteins in DM1-associated aberrant IR splicing. EMBO J. 2006, 25, 4271-4283.

64. Teplova, M.; Patel, D. J. Structural insights into RNA recognition by the alternative-splicing regulator muscleblind-like MBNL1. Nat. Struct. Mol. Biol. 2008, 15, 1343-1351.
65. Cheng, Y.; Kato, N.; Wang, W.; Li, J.; Chen, X. Two RNA binding proteins, HEN4 and HUA1, act in the processing of AGAMOUS pre-mRNA in Arabidopsis thaliana. Develop. Cell 2003, 4, 53-66.
66. Bai, C.; Tolias, P. P. Drosophila clipper/CPSF 30K is a post-transcriptionally regulated nuclear protein that binds RNA containing GC clusters. Nucleic Acids Res. 1998, 26, 1597-1604.
67. Hendriks, E. F.; Robinson, D. R.; Hinkins, M.; Matthews, K. R. A novel CCCH protein which modulates differentiation of Trypanosoma brucei to its procyclic form. EMBO J. 2001, 20, 1-12.
68. Lai, W. S.; Carballo, E.; Strum, J. R.; Kennington, E. A.; Phillips, R. S.; Blackshear, P. J. Evidence that tristetraprolin binds to AU-rich elements and promotes the deadenylation and destabilization of tumor necrosis factor alpha mRNA. Mol. Cell. Biol. 1999, 19, 4311-4323.
69. Hudson, B. P.; Martinez-Yamout, M. A.; Dyson, H. J.; Wright, P. E. Recognition of the mRNA AU-rich element by the zinc finger domain of TIS11d. Nat. Struct. Mol. Biol. 2004, 11, 257-264.
70. Warf, M. B.; Berglund, J. A. MBNL binds similar RNA structures in the CUG repeats of myotonic dystrophy and its pre-mRNA substrate cardiac troponin T. RNA 2007, 13, 2238-2251.
71. Kino, Y.; Mori, D.; Oma, Y.; Takeshita, Y.; Sasagawa, N.; Ishiura, S. Muscleblind protein, MBNL1/EXP, binds specifically to CHHG repeats. Hum. Mol. Genet. 2004, 13, 495-507.
72. Pascual, M.; Vicente, M.; Monferrer, L.; Artero, R. The muscleblind family of proteins: an emerging class of regulators of developmentally programmed alternative splicing. Differentiation 2006, 74, 65-80.

73. Brown, R. S. Zinc finger proteins: getting a grip on RNA. Curr. Opin. Struct. Biol. 2005, 15, 94-98.
74. Lai, W. S.; Kennington, E. A.; Blackshear, P. J. Interactions of CCCH zinc finger proteins with mRNA: non-binding tristetraprolin mutants exert an inhibitory effect on degradation of AU-rich element-containing mRNAs. J. Biol. Chem. 2002, 277, 9606-9613.
75. Worthington, M. T.; Amann, B. T.; Nathans, D.; Berg, J. M. Metal binding properties and secondary structure of the zinc-binding domain of Nup475. Proc. Natl. Acad. Sci. USA. 1996, 93, 13754-13759.
76. Carrick, D. M.; Lai, W. S.; Blackshear, P. J. The tandem CCCH zinc finger protein tristetraprolin and its relevance to cytokine mRNA turnover and arthritis. Arthritis Res. Ther. 2004, 6, 248-264.
77. Yuan, Y.; Compton, S. A.; Sobczak, K.; Stenberg, M. G.; Thornton, C. A.; Griffith, J. D.; Swanson, M. S. Muscleblind-like 1 interacts with RNA hairpins in splicing target and pathogenic RNAs. Nucleic Acids Res. 2007, 35, 5474-5486.
78. Michalowski, S.; Miller, J. W.; Urbinati, C. R.; Paliouras, M.; Swanson, M. S.; Griffith, J. Visualization of double-stranded RNAs from the myotonic dystrophy protein kinase gene and interactions with CUG-binding protein. Nucleic Acids Res. 1999, 27, 3534-3542.

CHAPTER 4
Inhibition of Pathogenic RNAs-MBNL1N Complexes with Small Molecules

4.1 Introduction
As discussed in chapter 3, myotonic dystrophy is caused by a deleterious gain-of-function of either poly(CUG)RNA or poly(CCUG)RNA. One mechanism by which the RNA induces pathogenesis is sequestration of the protein MBNL1 (muscleblind-like 1). Depletion of this protein in the cell results in misregulation of the alternative splicing of different pre-mRNAs, causing features of the disease. One suggested therapeutic strategy is to find a drug that binds to the RNA repeats and frees the MBNL1 protein to perform its normal functions.1 Previous studies have focused on targeting RNA repeats with small molecules, peptides, and oligonucleotides. The morpholino antisense oligonucleotide CAG25, which is 25 nucleotides in length, was used to block the binding of the MBNL1 protein. CAG25 inhibited the interaction between poly(CUG)RNA and MBNL1 protein and reduced the overall burden of this toxic RNA. However, delivery of oligonucleotides is difficult, especially for multisystemic disorders like myotonic dystrophy.2 Pushechnikov and co-workers showed that compounds related to Hoechst 33258 destabilize the poly(CUG)RNA-MBNL1 complex, but the monomeric ligand is not specific and the pentameric ligands are too bulky for transport into the nucleus.3 Gareiss and co-workers screened a dynamic library of peptides containing 11325 members and found hits that disrupt the poly(CUG)RNA-MBNL1 complex, but the inhibition constants are in the low micromolar range and peptide drugs are less stable than small molecules.4 Thus, there is a need to design small molecules that bind specifically to CUG or CCUG repeats and inhibit the interaction with MBNL1 protein. We are performing this project in collaboration with Professor Zimmerman and his group at the University of Illinois, Urbana-Champaign. Dr. Jonathan F. Arambula and Chun-Ho Wong designed and synthesized the small molecules that specifically bind to poly(CUG)RNA or poly(CCUG)RNA, and I evaluated their affinity for RNA and their ability to destabilize the complexes formed between MBNL1N and poly(CUG)RNA or poly(CCUG)RNA.

4.2 Results and discussion
4.3 Inhibition of the poly(CUG)RNA-MBNL1N complex
4.3.1 Rational design of ligands
Myotonic dystrophy type 1 (DM1) is most likely caused by the sequestration of MBNL1 protein by poly(CUG)RNA. Molecules were designed and synthesized by Dr. Jonathan F. Arambula from the Zimmerman group to bind selectively to the U-U mismatches found in poly(CUG)RNA, which are the basis of the rational design. Two structural features of uracil were considered: (1) the hydrogen bond acceptor-donor-acceptor (A−D−A) motif that normally pairs with adenine, and (2) the aromatic surface of the monocyclic pyrimidine. Normally, monocyclic uracil forms base pairs with bicyclic adenine. A U-U (T-T) mismatch is destabilized in part because both residues are monocyclic, and this destabilized site within the duplex allows for binding and recognition within the mismatch. A small monocyclic unit, melamine (triaminotriazine), was chosen to recognize the dual A-D-A motifs of the U-U mismatch. The insertion may occur through either the major or the minor groove, and the symmetry of the hydrogen bonding motif depends on the binding mode (Figure 4-1).

Intercalators have been well studied in the context of nucleic acid binding; they bind with high affinity at every other base pair, in accordance with the neighbor exclusion rule. Lhomme and co-workers studied acridine-nucleobase conjugates connected through alkyl linkers and found that they prefer a π-stacked conformation in water, which persisted even upon heating to 90 °C.5 The DNA duplex binding ability of these acridine-nucleobase conjugates was inversely proportional to linker length. Conjugating an acridine to the U-U recognition unit would therefore be expected to yield compounds with higher affinity for poly(CUG)RNA, because poly(CUG)RNA forms a stable A-form duplex containing many U-U mismatches.

[Figure 4-1 image: chemical structure drawings showing melamine hydrogen bonded within the U-U/T-T mismatch, with binding modes labeled "Minor groove" and "Major groove".]

Figure 4-1. The symmetry of the hydrogen bonding motif (D-A-D) of the melamine wedge depends on the binding mode (top). Hydrogen bonding of the melamine heterocycle within the U-U/T-T mismatch (bottom).

Based on these considerations, the melamine-acridine conjugate 1 was designed to bind U-U/T-T mismatches as a stacked intercalator (Figure 4-2). The melamine was designed to provide selectivity through recognition of the two bases of the U-U mismatch, and the acridine unit was included to provide high affinity via intercalation at the mismatch site. The short alkyl linker joining the melamine and acridine was designed to favor the π-stacked conformation, which reduces affinity for the normal duplex. 9-Aminoacridine and its derivatives were previously reported to bind preferentially in the minor groove.6 Therefore, it is assumed that conjugate 1 binds through the minor groove, whereas recognition of a U-U (T-T) mismatch may occur through either the major or the minor groove.

[Figure 4-2 image: structure of conjugate 1, annotated "Short linker for optimal π-stacked conformation", "H-bonding array recognition unit for specificity (D-A-D)", and "Large π-stacking surface for additional binding".]

Figure 4-2. The mode of binding for the proposed melamine-acridine conjugate (left). The possible interactions of melamine-acridine conjugate with U-U/T-T mismatches through the proposed binding mode (right).

4.3.2 Screening small molecules for inhibition of the poly(CUG)RNA-MBNL1N complex
Compounds 1-4 were designed and synthesized by Dr. Arambula in the Zimmerman group to bind selectively to the U-U mismatches present in poly(CUG)RNA. The initial screening assay evaluated the inhibition of the (CUG)4-MBNL1N complex by compounds 1-4 using an electrophoretic mobility shift assay (EMSA). The sequences of all the RNAs used for the inhibition studies are shown in Figure 4-3. The (CUG)4 and (CUG)12 RNAs form stable A-form duplexes containing U-U mismatches, as shown in Figure 4-3.

All of the compounds were tested in solutions containing 10% DMSO, so a control experiment in 10% DMSO (lane 3) was performed to confirm that the complex is stable under these conditions. Compounds 1 and 2 differ only in the length of the linker between the melamine and acridine units, whereas compounds 3 and 4 lack the melamine unit (Figure 4-3). Compound 2, which has a three-carbon linker, minimally inhibited the complex, and compounds 3 and 4, which lack the triaminotriazine unit, did not inhibit the complex. Among the four compounds, compound 1 is the best inhibitor of the (CUG)4-MBNL1N complex. The results for compounds 3 and 4 support the requirement for the melamine unit, and the result for compound 2 shows the importance of linker length. Together, these results indicate that both the linker length and the melamine unit play a role in inhibiting the (CUG)4-MBNL1N complex. Actinomycin D, an anticancer agent previously reported to bind to T-T mismatches in DNA,7 only minimally inhibited the (CUG)4-MBNL1N complex. Based on these results, compound 1 was selected for further investigation.

[Figure 4-3 images: panel (a), RNA sequences; panel (b), structures labeled 1, 2, 3, and 4 and the screening gel.]

Figure 4-3. (a) Sequences of RNA used for the experimental studies. (b) Assay for screening molecules for destabilizing the (CUG)4-MBNL1N complex. Lane 1 is a control with only (CUG)4, lane 2 is the (CUG)4 + MBNL1N complex, lane 3 is (CUG)4 + MBNL1N + 10% DMSO, and lanes 4-8 contain (CUG)4 + MBNL1N + 83.3 µM of ligands 1, 2, 3, 4, and actinomycin D, respectively. The MBNL1N protein concentration used for the experiment is 200 nM (7.7-fold above the KD of the (CUG)4-MBNL1N complex).

4.3.3 Binding of melamine-acridine conjugate with RNA mismatches and tRNA
The specificity of compound 1 for U-U mismatches was studied using isothermal titration calorimetry (ITC). The affinity of compound 1 for single U-U, C-C, A-A, and G-G mismatches was compared. Ligand 1 exhibited approximately 6-fold higher affinity for a U-U mismatch than for a C-C mismatch, and at least 143-fold higher affinity than for A-A and G-G mismatches. The sequences of the RNA duplexes used for the experimental studies are shown in Figure 4-4.

Figure 4-4. Sequences of RNA mismatches used for the isothermal titration calorimetry experiment.

The affinity of compound 1 for yeast tRNA was also studied to confirm the specificity of the molecule. All tRNAs have a cloverleaf structure with 73-90 nucleotides and three large hairpin loops. The affinity of compound 1 for tRNA is approximately 10 µM by ITC, which is 23-fold weaker than that of 1 for (CUG)4 RNA. These results confirm that compound 1 is very specific in the recognition of U-U mismatches over other mismatches and prefers binding to (CUG)4 RNA over tRNA (Table 4-1).

Table 4-1. Equilibrium dissociation constants for binding of ligand 1 with RNA mismatches and tRNA by isothermal titration calorimetry.

RNA             KD (µM)
U-U mismatch    2.1 ± 0.2
C-C mismatch    14 ± 2
A-A mismatch    >300
G-G mismatch    >300
tRNA            11.4 ± 2.7

4.3.4 Inhibition of poly(CUG)RNA-MBNL1N complexes

The initial screening was performed using (CUG)4 RNA, but for the dose-dependent inhibition studies we also used the longer (CUG)12 repeat sequence, because we were interested in identifying the effect of small molecules on longer repeats. The inhibition of the (CUG)4-MBNL1N and (CUG)12-MBNL1N complexes by compound 1 was studied in the presence and absence of competitor tRNA by EMSA. The sequences of the (CUG)4 and (CUG)12 RNAs used for the experiments are shown in Figure 4-3. The concentration of protein used for the inhibition experiments was based on the dissociation constants of the poly(CUG)RNA-MBNL1N complexes, as described in Chapter 3 (Section 3.12.1), and was that required to form a 100% bound complex. The inhibition of the poly(CUG)RNA-MBNL1N complex was measured by titrating compound 1 into a fixed concentration of complex; representative inhibition assays and titration curves are shown in Figure 4-5. The IC50 and inhibition constant (Ki) were determined using the following equations:

$$B = \Delta B \, \exp\!\left(-\frac{0.69}{\mathrm{IC}_{50}}\, C\right) + B_f \tag{1}$$

$$K_i = \frac{\mathrm{IC}_{50}}{1 + L/K_D} \tag{2}$$

B is the volume of the bound RNA band in the gel, ∆B is the difference between the volumes of the bound RNA bands at the beginning and end of the titration, Bf is the limiting volume of the bound RNA band at the end of the titration, C is the concentration of the small molecule, and IC50 is the concentration of small molecule at which B = 1/2∆B + Bf. The apparent inhibition constants (Ki) were calculated from the determined IC50 values using the Cheng and Prusoff equation (equation 2),11 where L is the concentration of protein and KD is the dissociation constant of the MBNL1N-RNA complex.

Compound 1 inhibited the complexes formed between MBNL1N protein and (CUG)4 and (CUG)12 in the absence of tRNA with IC50 values of 52 ± 20 µM and 46 ± 7 µM, respectively, and Ki values of 6 ± 1 µM and 7 ± 1 µM, respectively. Inhibition by compound 1 was also studied in the presence of tRNA, because tRNA is ubiquitous in the cell and we were interested in knowing the specificity of the small molecule at higher concentrations of tRNA.
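As a rough illustration of equation 1, the following Python sketch evaluates the single-exponential model and recovers IC50 from a titration by a log-linear least-squares fit. The function names, the synthetic signal values, and the assumption that ∆B and Bf are already known from the titration endpoints are illustrative choices only; this is not the fitting procedure used for the data in this chapter.

```python
import math

DECAY_CONSTANT = 0.69  # value used in equation 1; approximately ln 2


def bound_signal(conc, delta_b, ic50, b_f):
    """Equation 1: bound-band volume B at small-molecule concentration conc."""
    return delta_b * math.exp(-(DECAY_CONSTANT / ic50) * conc) + b_f


def estimate_ic50(concs, signals, delta_b, b_f):
    """Least-squares estimate of IC50 from ln((B - Bf)/dB) = -(0.69/IC50) * C.

    Assumes delta_b and b_f are known and every signal lies strictly above b_f.
    """
    num = 0.0
    den = 0.0
    for conc, signal in zip(concs, signals):
        y = math.log((signal - b_f) / delta_b)
        num += conc * y
        den += conc * conc
    slope = num / den  # slope of a straight line through the origin
    return -DECAY_CONSTANT / slope


# Hypothetical two-fold dilution series (µM) and synthetic, noise-free signals.
concs = [250.0, 125.0, 62.5, 31.3, 15.6, 7.8, 3.9]
signals = [bound_signal(c, delta_b=1.0, ic50=50.0, b_f=0.05) for c in concs]
print(round(estimate_ic50(concs, signals, delta_b=1.0, b_f=0.05), 3))
```

Because 0.69 is close to ln 2, the model gives B close to 1/2∆B + Bf at C = IC50, which matches the definition of IC50 given with equation 1. With real gel data a nonlinear fit that also floats ∆B and Bf would normally be preferable to this linearized form.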

Figure 4-5. Inhibition of the poly(CUG)RNA-MBNL1N complex in (A) the absence and (B) the presence of tRNA. (C) The inhibition curves used to calculate the IC50 for the (CUG)n-MBNL1N complexes. For the experiments in the presence of tRNA, lane 1 is a control with only (CUG)n + tRNA and lane 2 is the (CUG)n + MBNL1N complex in the presence of tRNA; no tRNA was present in any lane for the experiments in the absence of tRNA. The blue bar represents the lowest and highest concentrations of ligand 1 used for each experiment, and the concentrations in the wells are 250 µM, 125 µM, 82.5 µM, 62.5 µM, 31.3 µM, 15.6 µM, 7.8 µM, and 3.9 µM, respectively.


Compound 1 inhibited the complexes formed between MBNL1N protein and (CUG)4 and (CUG)12 in the presence of competitor yeast tRNA with IC50 values of 42 ± 20 µM and 43 ± 1 µM, respectively, and Ki values of 11 ± 5 µM and 12.3 ± 0.3 µM, respectively. These values are not much different from those determined in the absence of competitor tRNA. The Ki value is higher than the binding constant of compound 1 with (CUG)4 RNA, which could be due to differences in the buffer conditions used for the experiments. Compound 1 specifically recognizes poly(CUG)RNA, because the inhibition constant changed only by a factor of 2 even at a 9000-fold higher concentration of tRNA compared to (CUG)12 RNA. These results suggest that compound 1 is very specific and able to inhibit the poly(CUG)RNA-MBNL1N complexes in the low micromolar range even in the presence of a high concentration of tRNA.

Table 4-2. Equilibrium dissociation constants for complexes formed between (CUG)4 and (CUG)12 and MBNL1N. IC50 and Ki values for the inhibition of these complexes with compound 1 in the absence and presence of competitor tRNA are listed.

RNA        tRNA (µM)   KD (nM)     [MBNL1N] (µM)   IC50 (µM)   Ki (µM)
(CUG)4     –           26 ± 4      0.2             52 ± 20     6 ± 2
(CUG)4     0.2         69 ± 10     0.2             42 ± 20     11 ± 5
(CUG)12    –           165 ± 9     0.93            46 ± 7      7 ± 1
(CUG)12    0.9         370 ± 20    0.93            43 ± 1      12.3 ± 0.3
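The Ki values in Table 4-2 can be checked against equation 2. The sketch below applies the Cheng and Prusoff relation to the IC50, KD, and protein concentrations listed in the table; the function name and the row labels are illustrative. Because the tabulated inputs are rounded, the computed values should agree with the tabulated Ki values only approximately.

```python
def cheng_prusoff_ki(ic50_um, protein_um, kd_um):
    """Equation 2: Ki = IC50 / (1 + L/KD), with all concentrations in µM."""
    return ic50_um / (1.0 + protein_um / kd_um)


# (label, IC50 in µM, [MBNL1N] in µM, KD in nM), taken from Table 4-2.
ROWS = [
    ("(CUG)4, no tRNA", 52.0, 0.2, 26.0),
    ("(CUG)4, + tRNA", 42.0, 0.2, 69.0),
    ("(CUG)12, no tRNA", 46.0, 0.93, 165.0),
    ("(CUG)12, + tRNA", 43.0, 0.93, 370.0),
]

for label, ic50, protein, kd_nm in ROWS:
    ki = cheng_prusoff_ki(ic50, protein, kd_nm / 1000.0)  # convert KD from nM to µM
    print(f"{label}: Ki ~ {ki:.1f} µM")
```

The correction factor 1 + L/KD grows with the protein concentration relative to KD, which is why IC50 values of about 40-50 µM correspond to Ki values in the low micromolar range here.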

4.3.5 Specificity of the melamine-acridine conjugate (compound 1)

The specificity of ligand 1 was further evaluated by studying its ability to destabilize two unrelated RNA-protein complexes, the U1A-SL2 and Sex lethal-tra RNA complexes. The U1A protein is a key component of the U1 small nuclear ribonucleoprotein (U1 snRNP) in the spliceosome. It binds to the single-stranded loop of U1 snRNA stem loop 2 (SL2) with very high affinity (KD ~10^-11 M). The stem loop 2 RNA was selected to study the specificity of ligand 1 for hairpin loops, because it forms a stable hairpin with 10 bases in the loop region. The Sex lethal protein has high affinity for the poly-uridine sequences found near the regulated 3’ splice site of the transformer pre-mRNA (tra) (Figure 4-6). Both of these proteins recognize RNA through RNA recognition motifs (RRMs), which are unrelated to the zinc finger domains present in the MBNL1N protein. No inhibition of either complex was observed in the presence of ligand 1. This result is quite surprising and supports the specificity results from the ITC experiments.

The specificity of ligand 1 was also studied for the inhibition of MBNL1N binding to its endogenous target (cTNT). The sequence of cTNT RNA is shown in Figure 4-3. cTNT contains one U-U mismatch, which could form a binding site for the small molecule. The dissociation constant of the cTNT RNA-MBNL1N complex is 1.4 ± 0.8 nM. The concentration of protein used for the inhibition assay was 10 nM. The Ki for the inhibition of the cTNT RNA-MBNL1N complex by ligand 1 is 11.9 µM, which is just 2-fold higher than that for inhibition of the poly(CUG)RNA-MBNL1N complexes (Figure 4-7). These results suggest that the inhibition of the complex may be due to the interaction of ligand 1 with the one U-U mismatch present in the cTNT RNA.

Figure 4-6. (a) Inhibition of the Sex lethal-tra RNA complex in the presence of tRNA (8 µM) by ligand 1. The concentrations of 1 in the wells are 125 µM, 83.3 µM, 62.5 µM, 31.3 µM, 15.6 µM, 7.8 µM and 3.9 µM. Lane 1 is a control with only tra RNA + tRNA and lane 2 contains the Sex lethal-tra RNA complex in the presence of tRNA. (b) Inhibition of the U1A-SL2 RNA complex by ligand 1. The concentrations of 1 in the wells are 250 µM, 125 µM, 83.3 µM, 62.5 µM, 31.3 µM, 15.6 µM, 7.8 µM and 3.9 µM. Lane 1 is a control with only SL2 RNA and lane 2 contains the U1A-SL2 RNA complex. (c) Sequences of tra RNA and SL2 RNA used for these experiments.

Figure 4-7. (a) Inhibition of the MBNL1N-cTNT RNA complex by ligand 1. Concentrations of ligand 1 in the wells: 125 µM, 83.3 µM, 62.5 µM, 31.3 µM, 15.6 µM, 7.8 µM and 3.9 µM. Lane 1 is a control with only cTNT RNA and lane 2 is the cTNT RNA + MBNL1N complex. (b) The inhibition curve used to evaluate the IC50 for inhibition of the MBNL1N-cTNT complex by ligand 1.

4.4 Inhibition of the poly(CCUG)RNA-MBNL1N complex

4.4.1 MBNL1N binding with (CCUG)6 RNA

We selected (CCUG)6 for our experimental studies, and the sequence of this poly(CCUG)RNA is shown in Figure 4-3. The (CCUG)6 RNA was stabilized by the GGCG stem. The binding of MBNL1N to (CCUG)6 was studied in the presence and absence of yeast tRNA. The MBNL1N protein bound better to (CCUG)6 RNA than to the poly(CUG)RNAs. The (CCUG)6 structure is less stable than that of (CUG)12 RNA because it has a greater number of mismatches. The instability of the structure causes the formation of more single-stranded RNA, and MBNL1N prefers binding to single-stranded over double-stranded RNA. The (CCUG)6 RNA also has another possible structure, in which the G-C base pairs are flanked by alternating U-U and C-C mismatches, and MBNL1N may have a preference for this structure. These factors may contribute to the better binding affinity of the protein for (CCUG)6 over (CUG)12 RNA.


The dissociation constant of the MBNL1N protein-(CCUG)6 RNA complex is 15 ± 2 nM. The KD increased to 30 ± 4 nM in the presence of 100 nM tRNA, showing that the binding affinity of the protein is decreased in the presence of tRNA. The same trend was observed in binding studies with poly(CUG)RNAs (Chapter 4). Representative binding assays are shown in Figure 4-8.

Figure 4-8. Representative electrophoretic mobility shift assays of the MBNL1N protein binding to (CCUG)6 RNA in the (a) absence and (b) presence of tRNA (100 nM). The protein samples were 2-fold serially diluted, incubated with 32P-labeled RNAs at room temperature for 20 min, and electrophoresed on a 6% native polyacrylamide gel at 4 °C. The first lane from the left is free RNA, and in the remaining lanes the slowest running band corresponds to the complex. The concentrations shown on the blue bar indicate the lowest and highest concentrations of protein used for the experiment.

4.4.2 Ligand design

Myotonic dystrophy type 2 (DM2) is caused by the sequestration of the MBNL1 protein by poly(CCUG)RNA. Molecules were designed and synthesized by Chun-Ho Wong from the Zimmerman group to bind selectively to the U-U or C-U mismatches found in poly(CCUG)RNA. Ligand 1 possesses three essential features for inhibition of the complex: 1) a Janus-wedge (triaminotriazine or melamine with hydrogen bond donors), 2) the tetramethylene linker, and 3) acridine for intercalating into the RNA. Modification of each of these components might allow us to establish a SAR through binding and inhibition studies. Modification of these essential features resulted in the new molecules shown in Figure 4-9. In compounds 5 and 6 the triaminotriazine wedge was replaced by triaminopyrimidine, and in compound 7 the linker length was increased from tetramethylene to pentamethylene. In compound 8 the aliphatic linker was replaced by an amide linker, and in compound 9 the tricyclic intercalator was replaced by a bicyclic one. Compounds 10-11 are synthetic intermediates in the synthesis of compounds 1-9 (Figures 4-3 and 4-9). Compounds 5-9 are modified versions of ligand 1, which is the best hit for inhibiting the poly(CUG)RNA-MBNL1N complex. Compounds 10-13 are the Janus-wedge units not linked to an acridine moiety, and compound 14 is the acridine with an aliphatic amine. The proposed modes of binding of the triaminopyrimidine and triaminotriazine wedges with U-U mismatches in the major and minor grooves are shown in Figure 4-10.

4.4.3 Screening of inhibition of the poly(CCUG)RNA-MBNL1N complex by small molecules

The compounds screened for inhibition of the poly(CCUG)RNA-MBNL1N complex are shown in Figure 4-9. The initial screening assay evaluated the inhibition of the (CCUG)6-MBNL1N complex by EMSA using a protein concentration of 350 nM and a small molecule concentration of 100 µM. The concentration of protein used is 23.3-fold above the KD of the complex of (CCUG)6 with MBNL1N protein, so that the RNA is 100% in the protein-bound form. The compounds without an intercalator and compound 8 did not inhibit the (CCUG)6-MBNL1N complex. The best inhibitors are compounds 5 and 6, which have a triaminopyrimidine wedge unit, and compound 14, which has no wedge unit (Figure 4-9). The ability of all of these compounds to inhibit the (CUG)12-MBNL1N complex was also evaluated.


Figure 4-9. Compounds evaluated for inhibiting the (CCUG)6 RNA-MBNL1N complex. The ligands in red boxes were the best hits for inhibition of the (CCUG)6-MBNL1N complex.

Figure 4-10. Proposed binding modes of triaminotriazine- and triaminopyrimidine-type ligands with the U-U mismatch with six hydrogen bonds from (a) the minor groove and (b) the major groove. Corresponding modes for ligands binding to the C-U mismatch with five hydrogen bonds from (c) the minor groove and (d) the major groove.

Compound 5 inhibits the (CUG)12-MBNL1N complex, but compound 6 does not. These results suggest that binding selectivity depends on the identity of the intercalator, because in compound 6 the intercalator 9-amino-6-chloro-2-methoxyacridine was replaced by DACA (N-(2-(dimethylamino)ethyl)acridine-4-carboxamide). Compounds 5 and 6 were further studied for dose-dependent inhibition of the (CCUG)6-MBNL1N complex in the presence and absence of competitor tRNA. Compound 1 was also studied for dose-dependent inhibition of the (CCUG)6-MBNL1N complex. The inhibition of the (CUG)12-MBNL1N and S10-MBNL1N complexes was also studied to investigate the specificity of these molecules.

Figure 4-11. Screening for the inhibition of (a) the (CCUG)6-MBNL1N complex and (b) the (CUG)12-MBNL1N complex by small molecules. Lane 1 is a control with only (CCUG)6 or (CUG)12, lane 2 is the (CCUG)6 or (CUG)12 + MBNL1N complex, lane 3 is the (CCUG)6 or (CUG)12 + MBNL1N complex + 10% DMSO, and lanes 4-14 show inhibition of the complex with 100 µM of ligands 1 and 5-14, respectively. The MBNL1N protein concentrations used for the (CCUG)6 and (CUG)12 screening assays are 350 nM and 1.5 µM, respectively.


4.4.4 Inhibition of the (CCUG)6-MBNL1N complex

The inhibition of the (CCUG)6-MBNL1N complex by compounds 1, 5, and 6 was studied in the presence and absence of competitor tRNA by EMSA. The concentration of protein used for the inhibition experiments was based on the dissociation constant of (CCUG)6 with MBNL1N protein; the binding data are shown in section 4.4.1. The inhibition of the (CCUG)6-MBNL1N complex was measured by titrating compounds into a fixed concentration of complex. Compound 1 did not show any dose-dependent inhibition of the complex (Figure 4-12). In contrast, compounds 5 and 6 inhibited the complex formed between MBNL1N protein and (CCUG)6 with IC50 values of 59 ± 5 µM and 52 ± 8.4 µM, respectively, and Ki values of 2.4 ± 0.2 µM and 2.2 ± 0.3 µM, respectively. These values did not significantly change in the presence of competitor tRNA (Table 4-3). The Ki values for compounds 5 and 6 are almost identical, which indicates that they inhibit the complex with the same efficiency, and both compounds retain their activity at high concentrations of tRNA.

4.4.5 Specificity of compounds 5 and 6

The specificity of ligands 5 and 6 was further evaluated by studying their ability to destabilize the (CUG)12-MBNL1N and S10-MBNL1N complexes. (CUG)12 is the target in DM1, and S10 is a fragment of cTNT RNA, which is the natural target of the MBNL1N protein. The sequences of the RNAs used for these experiments are shown in Figure 4-3. The dose-dependent inhibition assays for both compounds are shown in Figure 4-13. The compounds inhibit both complexes at higher concentrations, but compound 6 is more specific than compound 5. Compound 5 inhibited the (CUG)12-MBNL1N and S10-MBNL1N complexes with Ki values of 4.8 ± 1.4 µM and 6.3 ± 1.6 µM, respectively, while compound 6 inhibited the (CUG)12-MBNL1N and S10-MBNL1N complexes with Ki values of 15.4 ± 5.1 µM and >15.6 µM, respectively (Table 4-3).

Ligand 6 has more than 7-fold specificity for inhibiting the (CCUG)6-MBNL1N complex over the S10-MBNL1N complex. This is an exciting result, because the compound specifically inhibits the MBNL1N complex formed with the pathogenic RNA without affecting the complex formed with the natural target.
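The fold specificity quoted above follows directly from the Ki values in Table 4-3. A minimal sketch of that arithmetic is given below, using the values determined in the absence of tRNA; the dictionary layout and function name are illustrative, and the S10 value for compound 6, reported as ">15.6", is treated as a lower bound.

```python
# Ki values (µM) in the absence of tRNA, taken from Table 4-3.
KI_UM = {
    5: {"(CCUG)6": 2.4, "(CUG)12": 4.8, "S10": 6.3},
    6: {"(CCUG)6": 2.2, "(CUG)12": 15.4, "S10": 15.6},
}
# Entries reported only as "greater than" in the table.
LOWER_BOUNDS = {(6, "S10")}


def fold_selectivity(compound, off_target, target="(CCUG)6"):
    """Return (Ki(off_target) / Ki(target), is_lower_bound) for one compound."""
    ratio = KI_UM[compound][off_target] / KI_UM[compound][target]
    return ratio, (compound, off_target) in LOWER_BOUNDS


for compound in (5, 6):
    for rna in ("(CUG)12", "S10"):
        ratio, is_bound = fold_selectivity(compound, rna)
        prefix = ">" if is_bound else ""
        print(f"compound {compound}, (CCUG)6 over {rna}: {prefix}{ratio:.1f}-fold")
```

On these numbers compound 5 discriminates by roughly 2- to 3-fold, whereas compound 6 discriminates by about 7-fold or more, consistent with the statement that compound 6 is the more specific of the two.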

Figure 4-12. Inhibition of the MBNL1N-(CCUG)6 complex by (a) compound 1, (b) compound 5, and (c) compound 6. Panels (d) and (e) show inhibition of the complex by compounds 5 and 6, respectively, in the presence of tRNA. The blue bar represents the lowest and highest concentrations of ligand used for each experiment, and the concentrations in the wells are 250 µM, 125 µM, 82.5 µM, 62.5 µM, 31.3 µM, 15.6 µM, 7.8 µM, and 3.9 µM, respectively.


Table 4-3. Equilibrium dissociation constants for binding of MBNL1N with (CCUG)6, (CUG)12, and S10 RNAs, and IC50 and Ki values for inhibition of these complexes by compounds 5 and 6 in the absence and presence of competitor tRNA. Compound 6 has better specificity, and its values are highlighted in red.

RNA       tRNA (nM)   KD (nM)    [MBNL1N] (nM)   IC50 (µM), 5    IC50 (µM), 6   Ki (µM), 5   Ki (µM), 6
(CCUG)6   –           15 ± 2     350             59 ± 5          52 ± 8.4       2.4 ± 0.2    2.2 ± 0.3
(CCUG)6   100         30 ± 4     350             39.9 ± 11.5     35 ± 3.4       3.2 ± 0.9    2.8 ± 0.3
(CUG)12   –           150 ± 20   1000            36.5 ± 10.6     118 ± 39.6     4.8 ± 1.4    15.4 ± 5.1
S10       –           1 ± 0.2    15              100.6 ± 26.1    >250           6.3 ± 1.6    >15.6

Figure 4-13. Inhibition of the S10-MBNL1N complex by (a) compound 5 and (b) compound 6. Inhibition of the (CUG)12-MBNL1N complex by (c) compound 5 and (d) compound 6. The blue bar represents the lowest and highest concentrations of ligand used for each experiment, and the concentrations in the wells are 250 µM, 125 µM, 82.5 µM, 62.5 µM, 31.3 µM, 15.6 µM, 7.8 µM, and 3.9 µM, respectively.

4.5 Conclusion

Compound 1 is the first small molecule to selectively inhibit the poly(CUG)RNA-MBNL1N complex. Compound 1 has high affinity for RNA duplexes with a U-U mismatch, but binds with much lower affinity to RNA duplexes with C-C, A-A, or G-G mismatches. Low micromolar concentrations of compound 1 destabilize complexes formed between toxic poly(CUG)RNA sequences and MBNL1N in the presence of competitor tRNA. The compound has very high selectivity, because it did not inhibit the (CCUG)6-MBNL1N complex even though this RNA has C-U mismatches, which are very similar to U-U mismatches. Compound 1 did not destabilize unrelated RNA-protein complexes (U1A-SL2 and Sex lethal-tra RNA).8

Compounds 5 and 6 successfully destabilized the complex formed between (CCUG)6 and the MBNL1N protein. Compound 6 is more specific than compound 5. The Ki values are not affected by the presence of high concentrations of tRNA. These compounds have ≈3-fold better Ki values than compound 1 for inhibiting the complexes formed between the MBNL1N protein and pathogenic RNAs. Compound 6 has more than 7-fold selectivity in destabilizing the MBNL1N protein complex with (CCUG)6 compared to that with the natural target S10 RNA. Thus compounds 1 and 6 are potential lead compounds for targeting DM1 and DM2, respectively. The selectivity and affinity of these compounds may be increased by modifying or replacing the acridine, triaminotriazine, or triaminopyrimidine units. Affinity for longer repeats may be achieved by oligomerization of the compounds.

4.6 Experimental section

4.6.1 Materials and methods

Purified RNA sequences were obtained from Dharmacon Research. Yeast tRNA was purchased from Sigma-Aldrich. Acrylamide and bisacrylamide solutions used in native and denaturing gels were purchased from National Diagnostics, Inc. γ-32P ATP was purchased from Amersham Biosciences Corp. An expression vector for MBNL1N (1-272) was obtained from Maurice S. Swanson (University of Florida College of Medicine, Gainesville, FL).

4.6.2 Inhibition assay

Gel mobility shift assays (EMSA) were used to determine the IC50 and Ki of small molecules for RNA-protein complexes. RNAs were labeled with [γ-32P] ATP using T4 polynucleotide kinase (New England Biolabs). Labeled RNA sequences were heated at 95 °C for 2 min and then placed on ice for 10 min in RNA storage buffer (66 mM NaCl, 6.7 mM MgCl2, and 27 mM Tris (pH 7.5)). The protein (MBNL1N) was serially diluted in binding buffer (175 mM NaCl, 5 mM MgCl2, 20 mM Tris (pH 7.5), 1.25 mM BME, 12.5% glycerol, 2 mg/mL BSA, and 0.1 mg/mL heparin),9 and 5 µL of protein solution was added to each 5 µL (0.2 nM) aliquot of RNA. The reaction mixture was incubated at room temperature for 25 min, after which the small molecule was added to the RNA-protein complex and incubated for 10-15 min at room temperature. The samples were loaded onto a 6% polyacrylamide gel (80:1) prechilled at 4 °C. The gels were run for 1 h at 360 V in 0.5X Tris-borate buffer and dried. The gels were exposed to a phosphorimager screen overnight, individual bands were quantified using ImageQuant software (Molecular Dynamics), and the data were fit using KaleidaGraph 3.5 (Synergy) software. The bound ligand (cpm) versus small molecule concentration was evaluated using the following equation to determine IC50 values:10

$$B = \Delta B \, \exp\!\left(-\frac{0.69}{\mathrm{IC}_{50}}\, C\right) + B_f$$

B is the volume of the bound RNA band in the gel, ∆B is the difference between the volumes of the bound RNA bands at the beginning and end of the titration, C is the concentration of the small molecule, and IC50 is the concentration of small molecule at which B = 1/2∆B + Bf. The apparent inhibition constants (Ki) were calculated from the determined IC50 values using the Cheng and Prusoff equation:11

$$K_i = \frac{\mathrm{IC}_{50}}{1 + L/K_D}$$

where L is the concentration of protein and KD is the dissociation constant of the MBNL1N-RNA complex.

The specificity of ligand 1 was studied using the above procedure except for the following changes. The U1A and Sex lethal proteins were expressed as hexa-His constructs and were purified using a Ni-NTA column from Qiagen. Gel shift assays for the U1A-SL2 RNA complex were performed as reported previously.12 The 32P-labeled SL2 RNA (0.2 nM) was incubated with 200 nM U1A protein for 30 min at room temperature in a buffer containing 10 mM Tris-HCl, pH 7.4, 0.5% Triton X-100, 1 mM EDTA, and 250 mM NaCl. After the addition of glycerol to a final concentration of 5%, the bound and free RNA were separated using an 8% polyacrylamide gel (80:1 acrylamide/bisacrylamide) at 25 °C in 100 mM Tris-borate pH 8.3, 1 mM EDTA, 0.1% Triton X-100 for 30 min at 360 V. Sex lethal-tra RNA complex inhibition assays were performed in 15 mM HEPES, pH 7.6, 50 mM potassium chloride, 1 mM EDTA, 1 mM β-mercaptoethanol, 20% glycerol, and 0.005% Triton-X, with 0.2 µg/µL tRNA, 300 nM Sex lethal protein, and 0.2 nM 32P-labeled tra RNA.

4.6.3 Isothermal titration calorimetry experiments

ITC experiments were performed at 25 °C on a MicroCal VP-ITC (MicroCal, Inc., Northampton, MA). A standard experiment consisted of titrating 10 or 20 µM RNA duplex (1.42 mL in the sample cell) with 10 µL injections of 500 µM ligand solution from the syringe. The standard experiment was accompanied by a corresponding control experiment in which aliquots (10 µL) of ligand 1 (500 µM) were titrated into buffer (20 mM MOPS, pH 7, 300 mM NaCl) alone. The duration of each injection was 24 s, and the spacing between two injections was 300 s. The initial delay prior to the first injection was 60 s. The instrument measured the heat released for each injection in µcal/s. The heat associated with each injection was determined from the area under the curve using Origin version 5.0 software (MicroCal, Inc., Northampton, MA, USA). The heat of ligand binding for each injection was determined by subtracting the heat of ligand solvation (buffer titrated with ligand) from the corresponding heat associated with the ligand-RNA injection, to yield the heat due solely to ligand binding. Binding constants were determined from plots of the heat of ligand binding as a function of ligand-RNA molar ratio. The graph was fit using a sequential two-site binding model.
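The bookkeeping behind the ITC plots can be sketched as follows: the cumulative ligand-RNA molar ratio after each injection, and the subtraction of the control (ligand into buffer) heats. The function names, the number of injections, and the example heats are hypothetical, and the cell volume is treated as constant (volume displaced by each injection is ignored), which is a simplification compared with the instrument software.

```python
def molar_ratios(n_injections, inj_volume_ul=10.0, ligand_um=500.0,
                 cell_volume_ml=1.42, rna_um=10.0):
    """Cumulative ligand:RNA molar ratio after each injection.

    Simplifying assumption: constant cell volume, no correction for displaced volume.
    """
    ligand_nmol_per_inj = inj_volume_ul * ligand_um / 1000.0  # µL x µM = pmol; /1000 gives nmol
    rna_nmol = cell_volume_ml * rna_um                        # mL x µM = nmol
    return [i * ligand_nmol_per_inj / rna_nmol for i in range(1, n_injections + 1)]


def binding_heats(ligand_into_rna, ligand_into_buffer):
    """Subtract the control heats (ligand into buffer) injection by injection."""
    return [q - q0 for q, q0 in zip(ligand_into_rna, ligand_into_buffer)]


# Hypothetical example: three injections with made-up integrated heats (µcal).
ratios = molar_ratios(3)
heats = binding_heats([-10.0, -8.0, -5.0], [-1.0, -1.0, -1.0])
for ratio, heat in zip(ratios, heats):
    print(f"molar ratio {ratio:.2f}: heat of binding {heat:.1f} µcal")
```

With the stated conditions (10 µL of 500 µM ligand into 1.42 mL of 10 µM RNA), each injection delivers about 5 nmol of ligand to about 14.2 nmol of RNA, so the molar ratio rises by roughly 0.35 per injection under this approximation.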

Figure 4-14. Plots of data representing the binding of ligand 1 with different mismatch RNA duplexes and tRNA by ITC experiment. The experimental conditions and the sequence of RNA used are shown on the left side of each binding data curve.


4.7 References

1. Mooers, B. H. M.; Logue, J. S.; Berglund, J. A. The structural basis of myotonic dystrophy from the crystal structure of CUG repeats. Proc. Natl. Acad. Sci. USA 2005, 102, 16626-16631.
2. Wheeler, T. M.; Sobczak, K.; Lueck, J. D.; Osborne, R. J.; Lin, X.; Dirksen, R. T.; Thornton, C. A. Reversal of RNA dominance by displacement of protein sequestered on triplet repeat RNA. Science 2009, 325, 336-339.
3. Pushechnikov, A.; Lee, M. M.; Childs-Disney, J. L.; Sobczak, K.; French, J. M.; Thornton, C. A.; Disney, M. D. Rational design of ligands targeting triplet repeating transcripts that cause RNA dominant disease: application to muscular dystrophy type 1 and spinocerebellar ataxia type 3. J. Am. Chem. Soc. 2009, 131, 9767-9779.
4. Gareiss, P. C.; Sobczak, K.; McNaughton, B. R.; Palde, P. B.; Thornton, C. A.; Miller, B. L. Dynamic combinatorial selection of molecules capable of inhibiting the (CUG) repeat RNA-MBNL1 interaction in vitro: discovery of lead compounds targeting myotonic dystrophy (DM1). J. Am. Chem. Soc. 2008, 130, 16254-16261.
5. Constant, J. F.; Laûgaa, P.; Roques, B. P.; Lhomme, J. The acridine ring selectively intercalated into a DNA helix at various types of abasic sites: double strand formation and photophysical properties. Biochemistry 1988, 27, 3997-4003.
6. Lerman, L. S. Structural considerations in the interactions of deoxyribonucleic acid and acridines. J. Mol. Biol. 1961, 3, 18-30.
7. Lian, C. Y.; Robinson, H.; Wang, A. H. J. Structure of actinomycin D bound with (GAAGCTTC)2 and (GATGCTTC)2 and its binding to the (CAG)n:(CTG)n triplet sequence as determined by NMR analysis. J. Am. Chem. Soc. 1996, 118, 8791-8801.
8. Arambula, J. F.; Ramisetty, S. R.; Baranger, A. M.; Zimmerman, S. C. A simple ligand that selectively targets CUG trinucleotide repeats and inhibits MBNL binding. Proc. Natl. Acad. Sci. USA 2009, 106, 16068-16073.
9. Warf, M. B.; Berglund, J. A. MBNL binds similar RNA structures in the CUG repeats of myotonic dystrophy and its pre-mRNA substrate cardiac troponin T. RNA 2007, 13, 2238-2251.
10. Luedtke, N. W.; Liu, Q.; Tor, Y. RNA-ligand interactions: affinity and specificity of aminoglycoside dimers and acridine conjugates to the HIV-1 Rev response element. Biochemistry 2003, 42, 11391-11403.
11. Cheng, Y.; Prusoff, W. H. Relationship between the inhibition constant (Ki) and the concentration of an inhibitor which causes 50 per cent inhibition (IC50) of an enzymatic reaction. Biochem. Pharmacol. 1973, 22, 3099-3108.
12. Shiels, J. C.; Tuite, J. B.; Nolan, S. J.; Baranger, A. M. Investigation of a conserved stacking interaction in target site recognition by the U1A protein. Nucleic Acids Res. 2002, 30, 550-558.
