JBC Papers in Press. Published on July 23, 2003 as Manuscript M305322200

Identification of a Novel Family of Proteins in Snake Venoms: Purification and Structural Characterization of Nawaprin from Naja nigricollis Snake Venom*

Torres, A. M.§, Wong, H. Y.¶, Desai, M.‡, Moochchala, S.†, Kuchel, P. W.§ and Kini, R. M.‡**

§ School of Molecular and Microbial Biosciences, University of Sydney, NSW 2006, Australia
¶ Department of Pharmacy, Faculty of Science, National University of Singapore, Singapore 119260
† Defence Medical Research Institute, Defence Science and Technology Agency, Singapore 117597
‡ Department of Biological Sciences, Faculty of Science, National University of Singapore, Singapore 117543

Running title: A new family of snake venom protein

Keywords: elafin, whey acidic proteins, NMR structure

** Address for correspondence: R. Manjunatha Kini Department of Biological Sciences, Faculty of Science National University of Singapore 14, Science Drive 4 Singapore, 117543 Singapore Phone: 65-6874-5235 Fax: 65-6779-2486 Email: [email protected]

Copyright 2003 by The American Society for Biochemistry and Molecular Biology, Inc.

SUMMARY

The three-dimensional structure of nawaprin has been determined by nuclear magnetic resonance spectroscopy. This 51-amino acid residue peptide was isolated from the venom of the spitting cobra, Naja nigricollis, and is the first member of a new family of snake venom proteins referred to as waprins. Nawaprin is relatively flat and disc-like in shape, characterized by a spiral backbone configuration that forms outer and inner circular segments. The two circular segments are held together by four disulfide bonds, three of which are clustered at the base of the molecule. The inner segment contains a short antiparallel β-sheet, while the outer segment is devoid of secondary structure except for a small turn or 3₁₀ helix. The structure of nawaprin is very similar to that of elafin, a human leukocyte elastase-specific inhibitor. Although substantial parts of the nawaprin molecule are well-defined, the tips of the outer and inner circular segments, which are hypothesized to be critical for binding interactions, are apparently disordered, as is also found in elafin. The amino acid residues in these important regions of nawaprin differ from those in elafin, suggesting that nawaprin is not an elastase-specific inhibitor and therefore has a different function in the snake venom.

INTRODUCTION

Snake venoms are rich sources of pharmacologically active polypeptides and proteins. Some of these proteins exhibit enzymatic activities. These enzymes include phospholipase A2, proteinase, nucleotidase, phosphodiesterase, and L-amino acid oxidase. In addition to their catalytic properties, which may contribute to the digestive action of the venom, these enzymes also induce various pharmacological effects including neurotoxic, myotoxic, cardiotoxic, hemorrhagic, hemolytic, procoagulant and anticoagulant effects (1, 2). Several other snake venom proteins and polypeptides do not exhibit these or other enzymatic activities and thus are described as 'nonenzymatic proteins'. These proteins include neurotoxins, cardiotoxins, myotoxins, ion channel inhibitors and anticoagulant proteins (3, 4). Thus snake venom proteins, whether enzymatic or nonenzymatic, have evolved as a complex mixture of proteins that target several tissues, organs and physiological systems and interfere with their normal functions. Therefore, snake venoms, when injected into prey or a victim, result in a simultaneous assault on various tissues, leading to multiple organ or system failure and often death.

A large number of protein toxins have been purified and characterized from snake venoms. These studies have shown that each venom contains over a hundred protein toxins. These toxins, however, belong to a very small number of protein superfamilies. For example, a single snake venom can contain as many as 15 isoforms of phospholipase A2 (5-7). As one would expect, they share remarkable similarities in their primary, secondary and tertiary structures. However, at times they differ from each other in their biological targeting and hence in their pharmacological effects. Similarly, other enzymes as well as nonenzymatic proteins in snake venoms also exist in many isoforms (8) and can be classified into protein families.

So far more than 1000 nonenzymatic proteins have been characterized, and these protein toxins are grouped into well-recognized families as follows: (1) three-finger toxins (including neurotoxins and cardiotoxins); (2) serine proteinase inhibitors (including proteinase inhibitors and dendrotoxins); (3) lectins; (4) sarafotoxins; (5) nerve growth factors; (6) atrial natriuretic peptides; (7) bradykinin-potentiating peptides; (8) disintegrins; and (9) helveprins/CRISP (8-11). The members of each family of protein toxins have a similar molecular scaffold but exhibit multiple functions. Thus it appears that during the evolution of venoms some of the molecular scaffolds have been 'selected' and various 'functional sites' were generated by accelerated evolution on a common molecular scaffold. We
are interested in the structure-function relationships of various families of toxins from snake and other venoms. Many of the early efforts in venom research were directed towards the isolation and characterization of either proteins that are found in abundance or the most toxic components. With the advent of more sophisticated purification techniques there have been studies of new and interesting protein components that are found in smaller quantities. In this paper, we describe a novel toxin that is a member of a new family of snake venom toxins. We have isolated and purified nawaprin, the first member of this family, from Naja nigricollis venom. The complete amino acid sequence and the solution structure of this toxin have been determined. Nawaprin is structurally similar to secretory leukocyte proteinase inhibitor (SLPI)¹ and elafin, whose tertiary structures have been studied by NMR (12) and X-ray crystallography (13). Both nawaprin and elafin contain four disulfide bonds and several proline residues. Elafin, which was first obtained from exfoliated skin (scales) of patients with psoriasis, is a specific inhibitor of human leukocyte elastase and porcine pancreatic elastase (14, 15). This new protein fold has also been used as a scaffold in the evolution of snake venom toxins and may be useful in the engineering of proteins with novel pharmacological actions.

EXPERIMENTAL PROCEDURES

Materials

Lyophilized crude Naja nigricollis venom was obtained from Miami Serpentarium Laboratories (Miami, FL, USA). Trypsin endopeptidase was purchased from Wako Pure Chemicals (Osaka, Japan). 4-Vinylpyridine was obtained from Sigma Chemical Co. (St. Louis, MO, USA). Superdex 30 and Sephasil C18 columns were obtained from Pharmacia Biotech (Uppsala, Sweden).

Isolation and purification of nawaprin from Naja nigricollis snake venom

Nawaprin was purified by a three-step protocol: gel filtration of venom on a Superdex 30 column, followed by ion-exchange chromatography on a UNO S6 column and HPLC on a Jupiter C18 column. Crude venom (200 mg) was loaded onto a Superdex 30 column (HiLoad™ 16/60) equilibrated with 50 mM Tris-HCl buffer, pH 7.5. The proteins were eluted with the same buffer at a flow rate of 1 ml/min on a fast protein liquid chromatography (FPLC) system (Pharmacia Biotech, Uppsala, Sweden). The protein elution was monitored at 280 nm. The fraction with the peak of interest (~2-5 mg) was applied separately onto a UNO S6 cation exchange column (Bio-Rad, Hercules, CA, USA), pre-equilibrated with 50 mM Tris-HCl buffer, pH 7.5 (Buffer A). The bound proteins were eluted by a linear gradient of 1 M NaCl in Buffer A. Protein elution was carried out at a flow rate of 2 ml/min and monitored at 280 nm. The unbound fraction from the UNO S6 column was loaded onto a Jupiter C18 column (10 µm, 10 mm/250 mm) equilibrated with 0.1% trifluoroacetic acid (TFA) on a Vision Workstation (Perkin-Elmer Biosystems, CA, USA). The bound proteins were eluted using a linear gradient of 80% acetonitrile (ACN) in 0.1% (v/v) TFA at a flow rate of 2 ml/min. The elution of proteins was monitored at 215 nm.

Reduction and pyridylethylation

Purified protein was reduced and pyridylethylated using procedures described earlier (16). Protein (0.5 mg) was dissolved in 500 µL of denaturant buffer (6 M guanidinium hydrochloride, 0.25 M Tris-HCl, 1 mM EDTA, pH 8.5). After the addition of 10 µL of β-mercaptoethanol, the mixture was incubated under vacuum for 2 h at 37 °C. 4-Vinylpyridine (50 µL) was added to the mixture, which was then kept at room temperature for 2 h. Pyridylethylated protein was purified on a µ-RPC C2/C18 (2.1 mm/10 mm) column using ACN in 0.1% (v/v) TFA at a flow rate of 200 µL/min.

Chemical and enzymatic cleavage

Peptides of pyridylethylated protein were obtained by chemical cleavage using formic acid (Asp-specific) as described by Inglis (17). Briefly, the desalted protein sample (500 µg) was dissolved in 2% formic acid in a glass vial and then frozen. Subsequently, under vacuum, the vial was thawed at room temperature and then sealed off. The vial was then heated at 108 °C for 2 h and allowed to cool to room temperature. Peptides of the pyridylethylated protein were also obtained by enzymatic cleavage with trypsin. Pyridylethylated protein (300 µg) was dissolved in 300 µL of 100 mM ammonium bicarbonate buffer and digested overnight with trypsin at 37 °C. The peptides generated by both formic acid and tryptic digestion were separated by RP-HPLC on a Sephasil C18 (5 µm, 2.1 mm/10 mm) column equilibrated with 0.1% (v/v) TFA. A linear gradient of 80% (v/v) ACN in 0.1% (v/v) TFA was used to elute bound peptides.

Mass spectrometry

The protein fractions eluted from the columns were screened for peptides of novel molecular weight using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) on a Voyager DE-STR Biospectrometry Workstation (Applied Biosystems, Foster City, CA, USA). Typically, 1-5 pmol/µL of the sample was co-crystallized with an equal volume of the matrix [10 mg/mL of α-cyano-4-hydroxycinnamic acid freshly prepared in 1:1 ACN:water containing 0.3% (v/v) TFA] on a 100-well stainless steel sample plate. The accelerating voltage was set at 25000 V, the grid voltage at 93.0%, and the guide wire voltage at 0.3%. Molecular ions were generated using a nitrogen laser (wavelength, 337 nm) at an intensity of 1800-2200. Extraction of ions was delayed by 800 ns. The spectrum was obtained by averaging several scans and was calibrated using external molecular weight standards.

Precise masses (±0.01%) of the native protein and peptides were determined by electrospray ionization mass spectrometry (ESI-MS) using a Perkin-Elmer Sciex API 300 LC/MS/MS system. Typically, RP-HPLC fractions were used directly for analysis; alternatively, samples were prepared by dissolving desalted, lyophilized samples in 1:1:1 ACN:methanol:water (v/v/v) containing 1% acetic acid. The samples were delivered either by direct infusion or flow injection. Ionspray, orifice and ring voltages were set at 4600 V, 50 V and 350 V, respectively. Nitrogen was used as the nebulizer and curtain gas. An LC-10AD Shimadzu
Liquid Chromatograph was used for solvent delivery [40% (v/v) ACN in 0.1% TFA]. The software Biomultiview (Perkin-Elmer Sciex) was used to analyze and deconvolute the raw mass spectrum.

Amino terminal sequencing

Amino terminal sequencing of the native and pyridylethylated protein as well as peptides was performed by automated Edman degradation using a Perkin-Elmer Applied Biosystems 494 pulsed-liquid phase protein sequencer (Procise) with an on-line 785A PTH-amino acid analyzer.

NMR spectroscopy

The NMR sample was prepared by dissolving 3.1 mg of the lyophilized peptide in 0.350 mL of 90% H2O/10% D2O in a 5 mm Shigemi (Allison Park, PA, USA) NMR tube, resulting in a final protein concentration of 1.7 mM and pH 3.1. NMR experiments were performed on a Bruker (Karlsruhe, Germany) AVANCE-600 DRX spectrometer using a 5-mm ¹H inverse probe operating at temperatures of 10, 20, 25, 30 and 35 °C. Two-dimensional (2D) NMR spectra were acquired in phase-sensitive mode using time-proportional phase detection (18). The homonuclear 2D spectra recorded were double-quantum filtered (DQF-) COSY (19) with a fast recycle time (20), TOCSY (21) with a spin-lock period of 60 ms, and NOESY (22) with a mixing time of 200 ms. Solvent-signal suppression was achieved either by presaturation or by using the WATERGATE (23) pulse sequence. H-D exchange experiments were carried out by reconstituting the freeze-dried sample with D2O, acquiring a series of 1D spectra for 15 min, and then acquiring two 1 h TOCSY spectra. All spectra were processed using XWIN-NMR software (Bruker) and were analyzed using the program XEASY (24).

Structural calculations

The final structure was obtained by using restraints consisting of 503 non-redundant NOE-derived distances, 18 hydrogen bonds, 9 φ dihedral angles and 4 disulfide bonds (see Table 1). The NOESY spectrum recorded at 25 °C with a mixing time of 200 ms provided the NOE
constraints. The H-D exchange experiments and the preliminary calculated structures were used in deducing hydrogen-bonding pairs. The hydrogen-bonding constraints were assigned upper distance limits of 2.2 Å for NHi to Oj and 3.2 Å for Ni to Oj. The disulfide bond configuration was determined from the characteristic NOE interactions between the α and β protons of paired cysteine residues. The standard simulated annealing procedure in DYANA was employed to obtain preliminary structures prior to refinement. An iterative cycle of calculations, structure analysis, manual assignment, and constraint revision was implemented to improve the quality of the calculated structures. The final DYANA calculation yielded 3000 structures, 60 of which (those with the lowest NOE violations) were selected for refinement by using the standard simulated annealing script in CNS (25). In this refinement process, the high-temperature dynamics and cooling cycle were performed in Cartesian space. The 20 structures with the lowest overall energy were considered as representative of nawaprin. Secondary structures in nawaprin were determined using MOLMOL (26).
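The two-stage selection described above (3000 DYANA structures, the 60 with the lowest NOE violations, then the 20 with the lowest energy after refinement) amounts to a simple filter. The following is only an illustrative sketch: the `Structure` record, its field names and the random toy scores are assumptions, and `refine` is a hypothetical placeholder rather than a call into DYANA or CNS.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Structure:
    ident: int
    noe_violation: float  # toy score standing in for summed NOE violations
    energy: float         # toy score standing in for overall energy


def refine(s: Structure, rng: random.Random) -> Structure:
    """Hypothetical stand-in for refinement: assigns a new toy energy."""
    return replace(s, energy=rng.uniform(-500.0, 0.0))


def select_ensemble(structures, n_refine=60, n_final=20, seed=0):
    rng = random.Random(seed)
    # Stage 1: keep the structures with the lowest NOE violations.
    best_noe = sorted(structures, key=lambda s: s.noe_violation)[:n_refine]
    # Stage 2: refine, then keep the lowest-energy structures.
    refined = [refine(s, rng) for s in best_noe]
    return sorted(refined, key=lambda s: s.energy)[:n_final]


# Toy pool of 3000 structures with random violation scores.
pool_rng = random.Random(1)
pool = [Structure(i, pool_rng.uniform(0.0, 10.0), 0.0) for i in range(3000)]
ensemble = select_ensemble(pool)
print(len(ensemble))
```

The point of the sketch is the ordering of the two criteria: restraint agreement decides which structures are refined at all, and energy is applied only to that refined subset.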

RESULTS

Purification of a novel protein from Naja nigricollis venom

Gel filtration of the crude venom of Naja nigricollis on a Superdex 30 column yielded eight major peaks (Figure 1A). Since our interest lay in isolating small novel peptides (4000 to 9000 Da), we used MALDI-TOF MS to search for polypeptides with masses distinctly different from those of the well-established toxin families (data not shown). Peak 4 had a mass of ~5290 Da. Its molecular size was smaller than that of three-finger toxins and serine proteinase inhibitors but larger than that of atrial natriuretic peptides (8). Thus, based on its mass, we had identified a polypeptide that did not belong to any known family of snake venom proteins. Proteins in peak 4 were further separated on a cation exchange column, UNO S6 (Figure 1B). The protein of interest did not bind to the column; it was eluted in the unbound fraction. The protein was further purified on a reverse-phase column, Jupiter C18 (Figure 1C). The major peak in the HPLC chromatogram had a molecular weight of 5288.50 ± 0.08 by ESI-MS (Figure 1D). The overall yield of the protein varied from batch to batch of venom samples, between 0.09% and 0.51% (n = 8).

Determination of the amino acid sequence

N-terminal sequencing of the native protein by Edman degradation identified the first 34 residues (Figure 2A). To complete the sequence, the pyridylethylated protein and the two peptides F1 and F2 purified from the formic acid digest were analyzed (data not shown). All of the 51 residues except the 48th were unequivocally identified. To confirm the sequence, the pyridylethylated protein was digested with trypsin and the tryptic peptides were then purified (data not shown). The carboxy terminal peptide was identified based on its mass and sequence, which allowed us to identify the 48th residue as Thr. Hence, this novel protein contains 51 residues, including eight cysteine residues (Figure 2A).
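With eight cysteines in the sequence, the mass shift expected on reduction and pyridylethylation follows from simple bookkeeping, which is what the comparison of calculated masses in this section relies on. The sketch below is an illustration under stated assumptions: average atomic masses, two hydrogens gained per reduced disulfide, and one 4-vinylpyridine unit (C7H7N) added per thiol. The function name and constants are ours, and 5288.12 Da is the calculated native mass quoted in this section.

```python
# Average atomic masses (Da); standard values, rounded.
H = 1.00794
C = 12.0107
N = 14.0067

VINYLPYRIDINE = 7 * C + 7 * H + N  # C7H7N, about 105.14 Da


def pyridylethylated_mass(native_mass, n_cys, n_disulfides):
    """Estimate the mass after full reduction and S-pyridylethylation.

    Assumes every cysteine ends up modified and that each disulfide
    bond gains two hydrogens when it is reduced.
    """
    reduced = native_mass + 2 * n_disulfides * H
    return reduced + n_cys * VINYLPYRIDINE


estimate = pyridylethylated_mass(5288.12, n_cys=8, n_disulfides=4)
print(round(estimate, 2))
```

Under these assumptions the estimate falls within about 0.1 Da of the 6137.37 Da quoted for the pyridylethylated protein; the residual may simply reflect the atomic mass table used.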
The calculated molecular weights of the native and pyridylethylated proteins were 5288.12 (with the assumption that all eight cysteine residues are involved in disulfide bond formation) and 6137.37, respectively; these matched the masses determined by ESI-MS (Table 1). A BLAST search for sequence homology indicated that this protein belongs to the family of whey acidic proteins (WAP) (Figure 2B). Since then, we have isolated and purified two other peptides from snake venoms which show similar mass and amino acid sequence. Although all the cysteine residues are conserved in these proteins, the intercysteine segments are distinctly different². Because of their homology with WAPs,
we have named this new family of snake venom proteins waprins (WAP-related proteins) and the protein from Naja nigricollis venom nawaprin (Naja waprin). WAPs were the first members of this family to be isolated. They are small secretory proteins widely distributed in the whey of many species (27-29). They contain two 4-disulfide core domains. Waprins are structurally closer to the epididymal secretory protein members of the WAP family (Figure 2B). These proteins are specifically expressed in the vas deferens and the distal epididymis (30). Trappins (Transglutaminase substrate and WAP domain containing proteins) of the WAP family are 'trapped' in the tissues through covalent crosslinking (31-33). They contain an amino terminal transglutaminase substrate domain [also called the cementoin domain (34)] with a variable number of hexapeptide repeats with the consensus sequence GQDPVK. Trappins anchor the biologically active WAP motifs at appropriate sites in the extracellular matrix through this domain.

Elafin/SKALP (skin-derived antileukoproteinase), SPAI-2 (Na+/K+-ATPase inhibitor) and porcine WAP-3 are some of the members of the trappin family, although elafin and SPAI were first isolated as soluble proteins (35, 36). In contrast, some of the WAP proteins such as SLPI and human seminal plasma inhibitor (HUSI-1) do not have the cementoin domain and are produced as secreted proteins (37-39). WAPs and other family members have one to three similar domains, whereas waprins contain a single four-disulfide core domain and are found in snake venoms in soluble form.

Determination of solution structure

The NOESY spectrum of nawaprin obtained at 25 °C and pH 3.1 showed a wide dispersion of amide proton signals, indicating β-sheet secondary structure (see Figure 3). The analysis of the spectra was, however, not straightforward, as it was made difficult by a number of complicating factors. These included the presence of seven proline residues, excessive line broadening for a number of peaks, and an extra set of small peaks suggesting the presence of minor conformations of the peptide in solution. The seven proline residues in nawaprin presented a major difficulty in resonance assignment since they led to peak overlap in the relevant regions of the spectra. All proline residues displayed strong dαδ(i-1,i) connectivities, suggesting that they were mainly in the trans conformation (40), but still the existence of minor cis conformations could not be completely
discounted. For example, Pro31 showed an additional dαα(i-1,i) cross-peak, although its intensity was very weak. This could explain the presence of minor sets of peaks in the spectra which could not be readily assigned to a specific residue in the sequence. The presence of minor conformation(s) in solution was confirmed by RP-HPLC of the re-purified sample, wherein two minor peaks, whose intensities were ~5% of the large (major) peak, were detected. Besides these complicating factors, broad peaks were also observed for many backbone amide protons, which may suggest intermediate chemical exchange of protons with the aqueous solvent, slow conformational averaging, or flexibility in the molecule. Moreover, the backbone amide peaks of Lys21, Leu23, Cys41, Met44 and Thr45 were split, suggesting slow-to-medium conformational exchange. In the final analysis, the difficulties encountered in the resonance assignments were resolved by performing several homonuclear 2D experiments at different temperatures and pH.

The 34% sequence similarity between elafin and nawaprin, and the eight conserved cysteine residues, suggest equivalent cysteine pairing patterns (Figure 2C). This was confirmed by NMR based on the characteristic NOEs between α and β protons of the bonded cysteine pairs. NOE connectivities that were observed included: Cys30-Hα-Cys46-Hβ, Cys30-Hβ-Cys46-Hα, Cys30-Hβ-Cys46-Hβ, Cys7-Hβ-Cys37-Hα, Cys7-Hβ-Cys37-Hβ, and Cys24-Hα-Cys36-Hβ. The NOE cross peaks linking Cys20-Cys41 were not observed, probably due to rather broad lines in the corresponding part of the spectrum. However, this pairing was easily established by elimination (since the other three disulfide pairings were then known) and later during preliminary structure calculations, since considerable long-range NOE connectivities were also observed among the protons of their neighbouring residues (Cys41, Phe43, Thr22, Lys21).

Structure description

The structure of nawaprin in solution was characterized by the presence of both well- and poorly-defined regions, the extents of which were comparable in magnitude. Figure 4A shows the ensemble of the best 20 structures superimposed over the backbone atoms of the 'well-defined' residues, 2-8, 22-38, and 44-51, of the mean structure. It is clear that although large sections of the nawaprin molecule were well-ordered, some regions were apparently
disordered. The mean global backbone root-mean-square deviation (RMSD) with respect to the mean structure was 1.81 Å when all residues were superimposed; this was reduced to 0.32 Å when only the well-defined residues 2-8, 22-38 and 44-51 were considered (see Table 2). While the N- and C-termini of nawaprin were relatively well-defined, there was substantial disorder in the upper regions defined by residues 9-21, called the 'outer loop', and 39-43, called the 'inner loop'; this suggests that these two regions of the molecule have higher flexibility than the rest. This apparent structural disorder reflected in the NMR spectra was mainly caused by the dearth of long-range NOEs that would provide information on 'connections' between the two loops. The few NOE connectivities that were observed were those between the protons of Ile19-Cys41 and Lys21-Phe43. Unlike many disulfide cross-linked polypeptides, nawaprin is not compact but is rather flat and disc-like (see Figures 4B and 4C). The backbone configuration is essentially spiral in shape, characterized by outer and inner circular segments that are connected by disulfide bonds. The inner segment incorporates a small twisted antiparallel β-sheet (a β-hairpin) at residues 35-37 and 45-47, while the outer segment is devoid of any defined secondary structures except for some β-turns, which are situated at residues 26-29, 27-30, and 31-34. Note that in seven out of the 20 'best' structures in the ensemble, a continuous 3₁₀ helix spanning residues 26-30 was found instead of the usual β-turn(s). Three of the disulfide bridges are clustered together at the 'base' of the molecule, anchoring the lower inner loop to the two ends of the outer loop; the fourth disulfide bridge, defined by Cys20-Cys41, holds the tips of the two loops together.
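The dependence of the reported RMSD on which residues are included can be illustrated with a small calculation. This is a minimal sketch, not the procedure used for Table 2: it assumes one coordinate per residue, assumes the models are already superimposed (no fitting step is performed), and uses a made-up two-model ensemble; the function names are ours.

```python
import math


def expand(ranges):
    """Turn [(2, 8), (22, 38)] into a list of residue numbers."""
    return [r for lo, hi in ranges for r in range(lo, hi + 1)]


def mean_rmsd_to_mean(models, residues):
    """Average RMSD of each model to the mean structure.

    `models` is a list of dicts mapping residue number -> (x, y, z).
    Coordinates are assumed to be superimposed already.
    """
    n = len(models)
    mean = {
        r: tuple(sum(m[r][k] for m in models) / n for k in range(3))
        for r in residues
    }
    rmsds = []
    for m in models:
        sq = sum(
            (m[r][k] - mean[r][k]) ** 2 for r in residues for k in range(3)
        )
        rmsds.append(math.sqrt(sq / len(residues)))
    return sum(rmsds) / n


# Toy ensemble: residue 1 is rigid, residue 2 differs by 2 Å between models.
models = [
    {1: (0.0, 0.0, 0.0), 2: (0.0, 0.0, 0.0)},
    {1: (0.0, 0.0, 0.0), 2: (2.0, 0.0, 0.0)},
]
all_residues = mean_rmsd_to_mean(models, [1, 2])
rigid_only = mean_rmsd_to_mean(models, [1])
well_defined = expand([(2, 8), (22, 38), (44, 51)])
print(all_residues, rigid_only, len(well_defined))
```

In the toy data, restricting the calculation to the rigid residue drops the RMSD to zero, which mirrors in miniature why the value for the well-defined residues is so much smaller than the value over all residues.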

DISCUSSION

We have described the purification and three-dimensional structure determination of nawaprin from N. nigricollis, the first snake-toxin member of a new family of proteins. Nawaprin and other waprins from snake venoms² are small proteins with ~50 amino acid residues that have a four-disulfide core domain structure, making them members of the WAP family of proteins (Figure 2).

Comparison with elafin and other proteins

A DALI algorithm (41) search for similar structures in the PDB databank revealed that the overall fold of nawaprin has significant similarity to that of elafin. This is expected given a 'respectable' sequence similarity of 34% and, more importantly, the basically identical disulfide bonding pattern (Figure 2). The three-dimensional structures of nawaprin and elafin, superimposed over the backbones of the 47 equivalent residues found by DALI, are shown in Figures 5A and 5B. An RMSD of 3.2 Å was obtained when the backbone of residues 2-9, 11-41, and 43-50 of nawaprin was superimposed onto the backbone of residues 11-18, 19-49, and 50-57 of elafin. It is clear that the tertiary folds of the two peptides are very similar. Even the locations of the secondary structures, the small antiparallel β-sheet in the inner segment and the small turn or helix in the outer segment, are basically identical (see Figure 5B). Since the tertiary structure of elafin is similar to those of other protease inhibitors such as the carboxy terminal half of human seminal plasma inhibitor HUSI-1 (42), also known as SLPI, and SPAI-1 (43), nawaprin may be regarded as belonging to the same class of enzyme inhibitor proteins that have an equivalent tertiary fold. There are only a few subtle differences in the secondary structures of nawaprin and elafin. In nawaprin, the two β-strands are each composed of three residues, 35-37 and 45-47, one residue shorter than in elafin. This perceived difference in their structures may not be real and may actually be due to uncertainties in the analysis of the spectra of nawaprin. In the elafin structure determined by NMR (12), the tip of the outer segment, which is the binding site, shows some apparent disorder suggesting higher mobility, while the entire inner loop, containing the twisted β-sheet, is well-defined.
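The residue bookkeeping behind the 47 equivalent positions can be checked directly. The sketch below reads the nawaprin ranges as 2-9, 11-41 and 43-50, an interpretation under which both sides contain 47 residues, and pairs them in order with the elafin ranges; the helper name and the in-order pairing are our assumptions, not output from DALI.

```python
def expand(spec):
    """Expand a string such as "2-9,11-41" into a list of residue numbers."""
    residues = []
    for part in spec.split(","):
        lo, hi = (int(x) for x in part.strip().split("-"))
        residues.extend(range(lo, hi + 1))
    return residues


nawaprin = expand("2-9,11-41,43-50")
elafin = expand("11-18,19-49,50-57")

# Pair residues in order; this assumes the ranges are listed in matching order.
pairs = list(zip(nawaprin, elafin))
print(len(nawaprin), len(elafin))
print(pairs[0], pairs[-1])
```

Each of the three segments has the same length on both sides (8, 31 and 8 residues), so the pairing contains no gaps within a segment; the skipped nawaprin residues 10 and 42 fall between segments.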
Nawaprin appears to be more flexible than elafin as shown by the large degree of apparent disorder in the tips of the
outer and inner loops. Such characteristic disorder at the tip of the inner loop of nawaprin may be due to the fact that this inner loop is tethered by the Cys20-Cys41 disulfide bond to a very disordered outer loop. There is a possibility that this large continuous region has higher mobility, as evidenced by the lack of long-range NOE connectivities in this part of the nawaprin molecule. NMR relaxation experiments could be used to probe the mobility of this region of the molecule.

Functional Implications

The tip of the outer circular segment in elafin, defined by residues 20-26, is important for its activity, as it is the primary binding segment that interacts with the active-site pocket of porcine pancreatic elastase (PPE) (13). This binding segment in elafin is composed of at least seven residues, LIRCAML (boxed parts of loops 1 and 2 in Figure 2C), six of which are hydrophobic; the presence of several hydrophobic residues in this region is known to be crucial for the activities of elafin and other proteinase inhibitors such as SLPI (12). This region also incorporates a disulfide bond that connects the outer segment to the inner core of the inhibitor. In the PPE-elafin complex (13), the primary binding loop (outer loop) in elafin is actually in an extended β-strand conformation, forming an antiparallel β-sheet with PPE through a series of hydrogen bonds. In free solution, however, this outer loop segment is disordered (12). Nawaprin in solution also has an apparently disordered outer loop segment similar to that in elafin. However, based on the sequence alignment in Figure 2C alone, nawaprin does not have a fragment analogous to the primary binding segment defined by residues 20-26. The DALI algorithm finds that this primary binding segment in elafin, which is composed of LIRCAML, is topologically similar to the segment defined by residues 12-18 in nawaprin, which is composed of MPIPPLG. Although these two segments are both hydrophobic, they are not sequentially similar to each other. Furthermore, the relative positions of the cysteine pairs that connect the tips of the outer and inner loops are also different in the two molecules. Figures 5C and 5D show the electrostatic potential surfaces of nawaprin and elafin. One can clearly see that the charge distributions in the two molecules are different. Although the upper halves of the two molecules, which incorporate the inner and outer loops, contain a number of hydrophobic residues, one side of nawaprin has a more hydrophobic upper part
than the corresponding region in elafin. In addition, there is a large continuous negative patch in nawaprin, defined by Glu2, Asp9, Asp27 and the C-terminal Pro51, which is absent in elafin. In fact, part of this region in elafin is positively charged as it includes two lysine residues, Lys12 and Lys43. The difference in the nature of the side-chains of the two molecules therefore suggests that nawaprin may not be a protease inhibitor, although its overall fold is very similar to that of elafin. The modest sequence similarity of 34%, together with the fact that both nawaprin and elafin incorporate several proline residues (including two consecutive proline residues in the external segments), strongly suggests that these two polypeptides have evolved from a common ancestral molecule.

Physiological role of WAPs and related proteins

WAP is a major protein constituent in whey and it is suggested to be the major food source for the young (44). Its secretion varies with different phases of lactation (44) and it possibly plays a role in hair and nail growth (45). Although the physiological functions of epididymal proteins such as HE4, CE4 and BE20 (30, 46, 47) in the male genital tract are not clear, they may act as decapacitation factors that bind to sperm and are released in the female reproductive tract (48). Proteinase inhibitors of the WAP family play an important physiological role in regulating the activity of various proteinases. Generally, these inhibitors prevent the invasion of bacteria and other microbes. In addition, some of them play a specific role in host defense. For example, SLPI and elastase maintain the balance between proepithelin and epithelins and thus regulate innate immunity and wound healing (49). SLPI also acts as a potent antimicrobial agent, a function that appears to be independent of its anti-proteinase activity (50). Mouse SWAM1 and SWAM2 have potent antibacterial activity, but they fail to inhibit elastase or cathepsin G (51). A caltrin-like protein secreted by guinea pig seminal vesicles inhibits Ca2+ uptake by spermatozoa (52). SPAI-1 inhibits Na+,K+-ATPase (36), but not proteinase (53). Therefore, based on their occurrence in snake venoms, waprins may play a part in the offensive armamentarium of this complex mixture of proteins.

Accelerated evolution of WAPs and related proteins

The structures of the complexes of SLPI with α-chymotrypsin (42) and of elafin with pancreatic elastase (13) show that the outer segments TYGQCLML and LIRCAML (boxed parts of loops 1 and 2 in Figure 2C) are the primary binding segments of these inhibitors to their respective proteinases. In both complexes the scissile bonds (LM and AM, respectively), which determine the proteinase specificity, are intact. A comparison of amino acid sequences indicates that the loop 2 region is the most variable one in WAPs (54-56). This reflects the ability of WAPs to target not only different proteinases, but also other enzymes (for example, Na+,K+-ATPase), receptors, or ion channels. Loops 1 and 2 of nawaprin are distinctly different from those of all other WAPs in both the number and the chemical nature of the constituent residues. Thus we believe that it will not be a proteinase inhibitor, but a ligand for a specific enzyme/receptor/ion channel. Similar variation of functionalities in the loops within the same structural fold is also seen in other proteinase inhibitors (57) as well as toxins (9). It is also important to note that in mini proteins other surface loops could also play a role in the 'bait region' or 'functional site' (9). In nature, this diversification could be achieved through gene duplication and accelerated evolution of WAP genes. Recent studies have shown that a locus on human chromosome 20 contains 14 genes encoding WAPs and related proteins, suggesting the evolution of WAP gene(s) by repeated duplications (56). Further, the region in exon 2 encoding the reactive site shows only 60-77% nucleotide identity compared to 97-98% identity in other regions (54). This suggests accelerated evolution of WAP genes. In a similar fashion to the evolution of WAP proteins, several molecular scaffolds have been used during the evolution of 'cocktails' of the toxins in snake venoms.
The selected genes are duplicated several times and the core of each protein scaffold is conserved while the loops and surfaces are altered through mutations. As in the case of WAPs, some exons of the toxin genes of snake venom mutate more rapidly than their introns, thus speeding up the generation of new toxins (58). Therefore, members of a protein family share structurally important core residues including cysteine residues. However, the intercysteine loops show considerable differences in the sequence. This results in toxins with distinctly different molecular surfaces and hence different ability to interact with target receptor/acceptor proteins. Hence they display differences in their biological properties (9).
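The contrast between a conserved cysteine framework and variable intercysteine loops can be checked directly on the aligned sequences of Figure 2C. The following Python sketch is only an illustration, not part of the original analysis. It assumes that the five rows below are a faithful reading of the figure text (the row boundaries were inferred from the cysteine alignment), and that the segments between consecutive cysteines correspond, in order, to the loops numbered 1 to 6 in the figure. The function names are hypothetical.

```python
# Aligned rows transcribed from Figure 2C ('-' marks an alignment gap).
# Row boundaries are an assumption inferred from the cysteine alignment.
ALIGNMENT = {
    "Nawaprin": "NEKSGSCPDMSMPIPPLGICKTL-----CNSDSGCPNVQKCCKNGCGFMTCTTPVP",
    "Elafin": "-TKPGSCPIILI------RCAMLNPPNRCLKDTDCPGIKKCCEGSCG-MACFVPQ",
    "SLPI (Domain 1)": "--KAGVCPPKKS-----AQCLRYKKP-ECQSDWQCPGKKRCCPDTCG-IKCLDPV",
    "SPAI-2": "LSKRGHCPRILF------RCPLSNPSNKCWRDYDCPGVKKCCEGFCG-KDCLYPK",
    "SLPI (Domain 2)": "--KPGKCPVTY------GQCLMLNPPNFCEMDGQCKRDLKCCMGMCG-KSCVSPV-",
}


def conserved_cysteine_columns(rows):
    """Return the 1-based alignment columns in which every row has a cysteine."""
    # zip stops at the shortest row, which is long enough to cover all cysteines.
    return [i for i, col in enumerate(zip(*rows), start=1) if set(col) == {"C"}]


def intercysteine_loop_lengths(aligned):
    """Return the lengths of the non-empty segments between consecutive cysteines."""
    segments = aligned.replace("-", "").split("C")
    # Drop the N- and C-terminal tails; the adjacent Cys-Cys pair yields an
    # empty segment, which is skipped.
    return [len(s) for s in segments[1:-1] if s]


cys_columns = conserved_cysteine_columns(ALIGNMENT.values())
loop_lengths = {name: intercysteine_loop_lengths(seq) for name, seq in ALIGNMENT.items()}

print("Conserved cysteine columns:", cys_columns)
for name, lengths in loop_lengths.items():
    print(f"{name:16s} loop lengths: {lengths}")
```

Under these assumptions, all eight cysteines fall in shared alignment columns, while the first two intercysteine segments of nawaprin (12 and 3 residues) differ in length from those of elafin (6 and 8 residues), in line with the argument above.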

So far, we have been able to isolate only a single waprin from each snake venom. However, three waprins showed a conserved molecular framework with significantly different intercysteine loops. It will be interesting to search for other snake venom proteins with the WAP motif.

In summary, we have isolated and characterized a new structural family of snake venom proteins, waprins. They contain a four-disulfide core structure and they resemble the WAP structural fold. Furthermore, analysis of the structures indicates that waprins could have a range of different biological properties.

REFERENCES

1. Bailey, G. S. (1998) Enzymes from Snake Venom, Alaken, Fort Collins, CO

2. Kini, R. M. (1997) Venom Phospholipase A2 Enzymes: Structure, Function and Mechanism, John Wiley and Sons, Chichester, UK

3. Harvey, A. L. (1991) Snake Toxins, Pergamon Press, New York, NY

4. Tu, A. T. (1991) Reptile Venoms and Toxins, Marcel Dekker, New York, NY

5. Braganca, B. M., and Sambray, Y. M. (1967) Nature 216, 1210-1211

6. Takasaki, C., Suzuki, J., and Tamiya, N. (1990) Toxicon 28, 319-327

7. Ogawa, T., Oda, N., Nakashima, K., Sasaki, H., Hattori, M., Sakaki, Y., Kihara, H., and Ohno, M. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 8557-8561

8. Mebs, D., and Claus, I. (1991) in Snake Toxins (Harvey, A. L., ed) pp. 425-447, Pergamon Press, New York, NY

9. Kini, R. M. (2002) Clin. Exp. Pharmacol. Physiol. 29, 815-822

10. McLane, M. A., Marcinkiewicz, C., Vijay-Kumar, S., Wierzbicka-Patynowski, I., and Niewiarowski, S. (1998) Proc. Soc. Exp. Biol. Med. 219, 109-119

11. Yamazaki, Y., Hyodo, F., and Morita, T. (2003) Arch. Biochem. Biophys. 412, 133-141

12. Francart, C., Dauchez, M., Alix, A. J., and Lippens, G. (1997) J. Mol. Biol. 268, 666-677

13. Tsunemi, M., Matsuura, Y., Sakakibara, S., and Katsube, Y. (1996) Biochemistry 35, 11570-11576

14. Schalkwijk, J., Chang, A., Janssen, P., De Jongh, G. J., and Mier, P. D. (1990) Brit. J. Dermatol. 122, 631-641

15. Wiedow, O., Schroder, J. M., Gregory, H., Young, J. A., and Christophers, E. (1990) J. Biol. Chem. 265, 14791-14795

16. Joseph, J. S., Chung, M. C. M., Jeyaseelan, K., and Kini, R. M. (1999) Blood 94, 621-631

17. Inglis, A. S. (1983) Meth. Enzymol. 91, 324-332

18. Marion, D., and Wuthrich, K. (1983) Biochem. Biophys. Res. Commun. 113, 967-974

19. Rance, M., Sorensen, O. W., Bodenhausen, G., Wagner, G., Ernst, R. R., and Wuthrich, K. (1983) Biochem. Biophys. Res. Commun. 117, 479-485

20. Derome, A. E., and Williamson, M. P. (1990) J. Magn. Reson. 88, 177-185

21. Bax, A., and Davis, D. G. (1985) J. Magn. Reson. 65, 355-360

22. Kumar, A., Ernst, R. R., and Wuthrich, K. (1980) Biochem. Biophys. Res. Commun. 95, 1-6

23. Piotto, M., Saudek, V., and Sklenar, V. (1992) J. Biomol. NMR 2, 661-665

24. Bartels, C., Xia, T. H., Billeter, M., Guntert, P., and Wuthrich, K. (1995) J. Biomol. NMR 6, 1-10

25. Brunger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J. S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T., and Warren, G. L. (1998) Acta Crystallogr. Section D 54, 905-921

26. Koradi, R., Billeter, M., and Wuthrich, K. (1996) J. Mol. Graphics 14, 51-55

27. Henninghausen, L. G., and Sippel, A. E. (1982) Nucleic Acids Res. 10, 2677-2684

28. Campbell, S. M., Rosen, J. M., Henninghausen, L. G., Strech-Jurk, U., and Sippel, A. E. (1984) Nucleic Acids Res. 12, 8685-8697

29. Beg, O. U., Von Bahr-Lindstrom, H., Zaidi, Z. H., and Jornvall, H. (1986) Eur. J. Biochem. 159, 195-201

30. Kirchhoff, C., Osterhoff, C., Habben, I., and Ivell, R. (1990) Int. J. Androl. 13, 155-167

31. Zeeuwen, P. I. J. M., Hendriks, W., de Jong, W. W., and Schalkwijk, J. (1997) J. Biol. Chem. 272, 20471-20478

32. Furukawa, M., Suzuki, Y., Ghoneim, M. A., Tachibana, S., and Hirose, S. (1996) J. Biol. Chem. 271, 29517-29520

33. Schalkwijk, J., Wiedow, O., and Hirose, S. (1999) Biochem. J. 340, 569-577

34. Nara, K., Ito, S., Ito, T., Suzuki, Y., Ghoneim, M. A., Tachibana, S., and Hirose, S. (1994) J. Biochem. 115, 441-448

35. Schalkwijk, J., Chang, A., Janssen, P., de Jong, G. J., and Mier, P. D. (1990) Brit. J. Dermatol. 122, 631-641

36. Araki, K., Kuroki, J., Ito, O., Kuwada, M., and Tachibana, S. (1989) Biochem. Biophys. Res. Commun. 164, 496-502

37. Stetler, G., Brewer, M. T., and Thompson, R. C. (1986) Nucleic Acids Res. 14, 7883-7896

38. Si-Tahar, M., Merlin, D., Sitaraman, S., and Madara, J. L. (2000) Gastroenterology 118, 1061-1071

39. Eisenberg, S. P., Hale, K. K., Heimdal, P., and Thompson, R. C. (1988) Biol. Chem. Hoppe-Seyler 369 Suppl, 79-82

40. Wuthrich, K. (1986) NMR of Proteins and Nucleic Acids, Wiley, New York, NY

41. Holm, L., and Sander, C. (1993) J. Mol. Biol. 233, 123-138

42. Grutter, M. G., Fendrich, G., Huber, R., and Bode, W. (1988) EMBO J. 7, 345-351

43. Kozaki, T., Kawakami, Y., Tachibana, S., Hatanak, H., and Inagaki, F. (1994) Pep. Chem. 405-408

44. Simpson, K. J., Ranganathan, S., Fisher, J. A., Janssens, P. A., Shaw, D. C., and Nicholas, K. R. (2000) J. Biol. Chem. 275, 23074-23081

45. Renfree, M. B., Meier, P., Teng, C., and Battaglia, F. C. (1981) Biol. Neonate 40, 29-37

46. Ellerbrock, K., Pera, I., Hartung, S., and Ivell, R. (1994) Int. J. Androl. 17, 314-323

47. Fan, H.-Y., Miao, S.-Y., Wang, L.-F., and Koide, S. S. (1999) Arch. Androl. 42, 63-69

48. Kirchhoff, C., Habben, I., Ivell, R., and Krull, N. (1991) Biol. Reprod. 45, 350-357

49. Zhu, J., Nathan, C., Jin, W., Sim, D., Ashcroft, G. S., Wahl, S. M., Lacomis, L., Erdjument-Bromage, H., Tempst, P., Wright, C. D., and Ding, A. (2002) Cell 111, 867-878

50. Tomee, J. F. C., Koeter, G. H., Hiemstra, P. S., and Kauffman, H. F. (1998) Thorax 53, 114-116

51. Hagiwara, K., Kikuchi, T., Endo, Y., Huqun, Usui, K., Takahashi, M., Shibata, N., Kusakabe, T., Xin, H., Hoshi, S., Miki, M., Inooka, N., Tokue, Y., and Nukiwa, T. (2003) J. Immunol. 170, 1973-1979

52. Coronel, C. E., San Agustin, J., and Lardy, H. A. (1990) J. Biol. Chem. 265, 6854-6859

53. Araki, K., Kuwada, M., Ito, O., Kuroki, J., and Tachibana, S. (1990) Biochem. Biophys. Res. Commun. 172, 42-46

54. Tamechika, I., Itakura, M., Saruta, Y., Furukawa, M., Kato, A., Tachibana, S., and Hirose, S. (1996) J. Biol. Chem. 271, 7012-7018

55. Furutani, Y., Kato, A., Yasue, H., Alexander, L. J., Beattie, C. W., and Hirose, S. (1998) J. Biochem. 124, 491-502

56. Clauss, A., Lilja, H., and Lundwall, A. (2002) Biochem. J. 368, 233-242

57. Hill, R. E., and Hastie, N. D. (1987) Nature 326, 96-99

58. Nakashima, K., Ogawa, T., Oda, N., Hattori, M., Sakaki, Y., Kihara, H., and Ohno, M. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 5964-5968

FOOTNOTES

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. This work was supported by research grants from the Defence Science and Technology Agency of Singapore, Academic Research Grants and the Australian Research Council. The atomic coordinates for the solution structure of this protein are deposited in the Protein Data Bank of the Research Collaboratory for Structural Bioinformatics under the accession code 1UDK.

1 The abbreviations used are: ACN, acetonitrile; ESI-MS, electrospray ionization mass spectrometry; FPLC, fast protein liquid chromatography; HUSI-1, human seminal plasma inhibitor; MALDI-TOF MS, matrix assisted laser desorption/ionization time of flight mass spectrometry; PPE, porcine pancreatic elastase; RMSD, root-mean-square deviation; RP-HPLC, reverse phase high performance liquid chromatography; SLPI, secretory leukocyte proteinase inhibitor; SPAI, Na+,K+-ATPase inhibitor; TFA, trifluoroacetic acid; WAP, whey acidic protein.

2 Fry, B. G., Hock, S. T. and Kini, R. M., unpublished observations.

FIGURE LEGENDS

Figure 1. Purification of nawaprin. (A) Gel-filtration chromatography of Naja nigricollis venom (150 mg) on a Superdex 30 FPLC column (1.6/60). The column was equilibrated and eluted with 50 mM Tris/HCl buffer, pH 7.4, at a flow rate of 1.5 ml/min. The horizontal solid bar indicates the pooled fractions containing protein with a mass of ~5290. (B) The peak (indicated by the horizontal bar) from gel-filtration chromatography was further separated on a Uno S column. The column was equilibrated with 50 mM Tris/HCl buffer, pH 7.4, at a flow rate of 2 ml/min. The bound proteins were eluted by a linear gradient of buffer B containing 1 M NaCl. (C) The unbound fractions (shown by the horizontal bar) from the cation-exchange chromatography were purified on a reverse-phase Jupiter C18 (1 cm x 25 cm) column equilibrated with 0.1% (v/v) TFA, and the bound proteins were eluted with a linear gradient of 80% (v/v) acetonitrile in 0.1% (v/v) TFA at a flow rate of 2 ml/min. The peak containing nawaprin is indicated. (D) Electrospray-ionisation MS of nawaprin. The spectrum shows ions with three and four charges, corresponding to a single, homogeneous peptide of molecular mass 5288.5 (inset).

Figure 2. Amino acid sequence and structural similarity of nawaprin with WAPs. (A) Determination of the amino acid sequence of nawaprin. The complete amino acid sequence was determined by amino-terminal sequencing of the native protein, the pyridylethylated protein, and peptides obtained by formic acid and trypsin digestion. (B) The amino acid sequence of nawaprin aligned with epididymal proteins of the WAP family to show maximum similarity. Identical residues are shown in black boxes while conserved residues are in shaded boxes. (C) Structural similarity of nawaprin with functionally characterized WAPs. The disulphide-bonding pattern is shown by the lines connecting corresponding cysteine pairs. Identical residues are shown in black boxes while conserved residues are in shaded boxes. Intercysteine loops are numbered and the ‘primary’ binding segments in elafin and SLPI are shown in horizontal boxes.

Figure 3. Fingerprint region of the NOESY spectrum of nawaprin at 25 ºC and pH 3.1. Shown are the sequential NH-CαH connectivities for residues 26-30, 31-39 and 45-48. Intraresidue connectivities are labelled with their corresponding residue number.

Figure 4. Nawaprin structure. (A) Ensemble of the ‘best’ 20 nawaprin structures superimposed to show the best fit over the backbone atoms N, C, Cα of residues 2-8, 22-38 and 44-51 of the mean structure. (B) and (C) Ribbon diagrams of the structure closest to the mean, showing secondary structures and disulfide connectivities (in yellow). The two views are related by a ~90º rotation about the vertical axis. The figures were generated using MOLMOL (26).

Figure 5. Comparison of nawaprin and elafin structures. (A) Nawaprin and elafin structures superimposed over the backbone atoms of residues 2-9, 11-41, and 43-50 of nawaprin, and residues 11-18, 19-49, and 50-57 of elafin. Nawaprin and elafin (1fle) are shown in black and light grey, respectively. The first 10 residues in elafin are not shown. (B) Same as in (A), but drawn as a ribbon diagram. (C) Molecular surface of nawaprin and (D) elafin, highlighted to show electrostatic potential. Surfaces with positive, negative and neutral electrostatic potentials are drawn in blue, red and white, respectively. The two views are related by a 180° rotation about the ‘virtual’ vertical axis. The brackets in (D) indicate the primary binding site in elafin.

Table 1. Theoretical and experimentally determined masses of nawaprin and its peptides

                                   Molecular mass
                                   Calculated†    Observed‡
Native nawaprin                    5288.21        5288.5 ± 0.08
Pyridylethylated nawaprin          6137.37        6737.5 ± 0.2
Formic acid peptide F1 (28-51)     3001.66        3002.5 ± 0.8
Formic acid peptide F2 (10-51)     5114.28        5113.5 ± 1.1
Tryptic peptide T3 (39-51)         1537.84        1535.5 ± 0.8

† Masses were calculated from the amino acid sequences.
‡ Masses were determined by ESI/MS or MALDI-TOF.
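As an illustration of how a calculated mass such as those in Table 1 can be obtained from a sequence, the Python sketch below sums average residue masses for the nawaprin sequence of Figure 2A. It is not the authors' procedure. The residue masses are standard average values rather than values taken from the paper, the native form is assumed to carry four disulfide bonds (a loss of two hydrogens each), and pyridylethylation is assumed to add C7H7N (about 105.14 Da) to each cysteine of the reduced protein. All names are hypothetical.

```python
# Standard average residue masses in Da (assumed values, not from the paper).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153          # added once per chain for the free termini
HYDROGEN = 1.00794       # two are lost for every disulfide bond formed
PROTON = 1.00728         # charge carrier in positive-ion ESI
PYRIDYLETHYL = 105.1372  # assumed gain per cysteine (C7H7N)

NAWAPRIN = "NEKSGSCPDMSMPIPPLGICKTLCNSDSGCPNVQKCCKNGCGFMTCTTPVP"


def reduced_mass(seq):
    """Average mass of the linear chain with all cysteines as free thiols."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def native_mass(seq, disulfides):
    """Average mass after forming the given number of disulfide bonds."""
    return reduced_mass(seq) - 2 * disulfides * HYDROGEN


def pyridylethylated_mass(seq):
    """Average mass with every cysteine reduced and pyridylethylated."""
    return reduced_mass(seq) + seq.count("C") * PYRIDYLETHYL


def mz(mass, charge):
    """m/z of the [M + zH]z+ ion for a neutral mass and a charge state z."""
    return (mass + charge * PROTON) / charge


native = native_mass(NAWAPRIN, disulfides=4)
alkylated = pyridylethylated_mass(NAWAPRIN)

print(f"native: {native:.2f}  pyridylethylated: {alkylated:.2f}")
print(f"3+ ion: {mz(5288.5, 3):.1f}  4+ ion: {mz(5288.5, 4):.1f}")
```

With these assumed constants the native and pyridylethylated values come out within about 0.1 Da of the calculated entries in Table 1 (5288.21 and 6137.37). For the observed mass of 5288.5, the same arithmetic places the triply and quadruply charged ions near m/z 1764 and 1323.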

Table 2. Structural statistics for the ensemble of 20 nawaprin structures

Quantity                                     Value

Distance restraints
  intraresidue (i–j = 0)                     206
  sequential (|i–j| = 1)                     143
  medium-range (|i–j| ≤ 5)                   24
  long-range (|i–j| > 5)                     130
  hydrogen-bonds                             18
  total                                      521

Dihedral-angle restraints
  φ                                          9

Atomic RMSD with the mean structure (Å)
  backbone atoms (1-51)                      1.81 ± 0.58
  heavy atoms (1-51)                         2.23 ± 0.48
  backbone atoms (2-8, 22-38, 44-51)         0.32 ± 0.07
  heavy atoms (2-8, 22-38, 44-51)            0.68 ± 0.11
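The RMSD rows of Table 2 summarize, as mean ± standard deviation over the ensemble, how far each structure lies from the mean coordinates. A minimal Python sketch of that bookkeeping follows. It assumes the structures have already been superimposed (the least-squares fitting step is not shown) and uses a sample standard deviation, which the table does not specify. The coordinates are made up for illustration, and the function names are hypothetical rather than those of MOLMOL or any other package.

```python
import math


def mean_structure(ensemble):
    """Average each atom's coordinates over all structures in the ensemble."""
    n = len(ensemble)
    n_atoms = len(ensemble[0])
    return [
        tuple(sum(structure[a][k] for structure in ensemble) / n for k in range(3))
        for a in range(n_atoms)
    ]


def rmsd(coords, reference):
    """Root-mean-square deviation between two equally ordered atom lists."""
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords, reference))
    return math.sqrt(squared / len(coords))


def rmsd_to_mean(ensemble):
    """Return (mean, sample SD) of each structure's RMSD to the mean structure."""
    reference = mean_structure(ensemble)
    values = [rmsd(structure, reference) for structure in ensemble]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, math.sqrt(variance)


# Toy ensemble: two pre-superimposed "structures" of two atoms each (made up).
toy_ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
toy_mean, toy_sd = rmsd_to_mean(toy_ensemble)
print(f"RMSD to mean: {toy_mean:.2f} ± {toy_sd:.2f}")
```

Restricting the atom lists to a subset of residues before calling `rmsd_to_mean` would mirror the distinction in Table 2 between the full chain (1-51) and the well-defined regions (2-8, 22-38, 44-51).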

[Figure 1. Panels A-C: chromatograms with axes Absorbance (280 nm), B% and Time (min); the nawaprin peak is labelled in panel C. Panel D: mass spectrum with axes Counts/sec and m/z, peak labels 1323 and 1764, and an inset (cps versus Mass) labelled 5288.5.]

Figure 2

A
Nawaprin                  NEKSGSCPDMSMPIPPLGICKTLCNSDSGCPNVQKCCKNGCGFMTCTTPVP
Native protein            NEKSGSXPDMSMPIPPLGIXKTLXNSDSGXPNVQ
Pyridylethylated protein  NEKSGSCPDMSMPIPPLGICKTL
Formic acid F1                                       SGCPNVQKCCKNGCGFMTCTXPV
Formic acid F2                     MSMPIPPLGICKTLCNSDSGCPNVQKCCKNGCGFMTCTX
Tryptic peptide T3                                              NGCGFMTCTTPVP

B
Nawaprin                  NEKSGSCPDMSMPIPPLGICKTLCNSDSGCPNVQKCCKNGCGFMTCTTPVP
Man HE4                   NDKEGSCPQVNINFPQLGLCRDQCQVDTQCPGQMKCCRNGCGKVSCTVP-
Dog CE4 (Domain 2)        NEKEGSCPQVNTDFPQLGLCQDQCQVDSHCPGLLKCCYNGCGKVSCVTPI
Rabbit BE20 (Domain 2)    NEKEGSCP--SIDFPQLGICQDLCQVDSQCPGKMKCCLNGCGKVSCVTP-
Dog CE4 (Domain 1)        -EKTGVCPQLQADLN----CTQECVSDAQCADNLKCCQAGCATI-CHLP-
Rabbit BE20 (Domain 1)    -DKPGVCPQLSADLN----CTQDCRADQDCAENLKCCRAGCSAI-CSIP-
Tam WAP (Domain 1)        -EKAGYCPDFRQVLLDRRDCKQLCNDDASCPQNMRCCQRGCSWL-C-----

C
Intercysteine loops: Loop 1, Loop 2, Loop 3, Loop 4, Loop 5, Loop 6
Nawaprin                  NEKSGSCPDMSMPIPPLGICKTL-----CNSDSGCPNVQKCCKNGCGFMTCTTPVP
Elafin (SKALP)            -TKPGSCPIILI------RCAMLNPPNRCLKDTDCPGIKKCCEGSCG-MACFVPQ
SLPI (Domain 1)           --KAGVCPPKKS-----AQCLRYKKP-ECQSDWQCPGKKRCCPDTCG-IKCLDPV
SPAI-2                    LSKRGHCPRILF------RCPLSNPSNKCWRDYDCPGVKKCCEGFCG-KDCLYPK
SLPI (Domain 2)           --KPGKCPVTY------GQCLMLNPPNFCEMDGQCKRDLKCCMGMCG-KSCVSPV-

[Figure 3. NOESY fingerprint region; axes ω1 (ppm), 4.00-5.00, and ω2 (ppm), 10.00-7.50; cross-peaks labelled by residue number.]
