
Proc. Natl. Acad. Sci. USA Vol. 90, pp. 10783-10787, November 1993 Developmental Biology

The precursor region of a protein active in sperm-egg fusion contains a metalloprotease and a disintegrin domain: Structural, functional, and evolutionary implications (PH-30/spermatogenesis/snake venom/astacin/cell adhesion)

TYRA G. WOLFSBERG†‡, J. FERNANDO BAZAN‡§, CARL P. BLOBEL†‡¶, DIANA G. MYLES‖, PAUL PRIMAKOFF‖, AND JUDITH M. WHITE†‡**

Departments of †Pharmacology and ‡Biochemistry and Biophysics, University of California, San Francisco, CA 94143; and ‖Department of Physiology, University of Connecticut Health Center, Farmington, CT 06030

Communicated by Bruce M. Alberts, August 4, 1993

ABSTRACT PH-30, a sperm surface protein involved in sperm-egg fusion, is composed of two subunits, α and β, which are synthesized as precursors and processed, during sperm development, to yield the mature forms. The mature PH-30 α/β complex resembles certain viral fusion proteins in membrane topology and predicted binding and fusion functions. Furthermore, the mature subunits are similar in sequence to each other and to a family of disintegrin domain-containing snake venom proteins. We report here the sequences of the PH-30 α and β precursor regions. Their domain organizations are similar to each other and to precursors of snake venom metalloproteases and disintegrins. The α precursor region contains, from amino to carboxyl terminus, pro, metalloprotease, and disintegrin domains. The β precursor region contains pro and metalloprotease domains. Residues diagnostic of a catalytically active metalloprotease are present in the α, but not the β, precursor region. We propose that the active sites of the PH-30 α and snake venom metalloproteases are structurally similar to that of astacin. PH-30, acting through its metalloprotease and/or disintegrin domains, could be involved in sperm development as well as sperm-egg binding and fusion. Phylogenetic analysis indicates that PH-30 stems from a multidomain ancestral protein.

PH-30, a guinea pig sperm surface protein, is a candidate sperm-egg membrane binding and fusion protein (1-5). The PH-30 subunits found on fertilization-competent sperm, mature α and mature β, share membrane topologies and other characteristics with viral binding and fusion proteins. The β subunit contains a potential receptor binding domain, a disintegrin domain, related to soluble integrin ligands found in snake venom. The α subunit contains a potential fusion peptide. In addition, the two subunits share sequence similarity. Snake venom disintegrins derive from precursors that also contain zinc-dependent metalloprotease domains (6, 7). Interestingly, PH-30 α and β are present on testicular spermatogenic cells as larger precursors, termed here pro-α and pro-β (2). Here we show that the precursor regions of PH-30 α and β (the regions amino-terminal to the mature proteins and found on developing, but not fertilization-competent, sperm) share further amino acid identity with each other as well as with this family of metalloprotease and disintegrin domain-containing snake venom proteins.††

MATERIALS AND METHODS

Cloning. A portion of the α precursor region sequence was obtained from a PCR product generated by the nested RACE (rapid amplification of cDNA ends) protocol (3). The sequences of the remainder of the α precursor region and the entire β precursor region were determined from clones of α and β isolated at high stringency (3) from a guinea pig whole-testis cDNA library (8).

Northern Analysis. RNA was isolated from adult male guinea pig tissues (9), electrophoresed in a formaldehyde/agarose gel, and transferred and cross-linked to a Hybond-N nylon membrane (Amersham). High-stringency prehybridization and hybridization with PH-30 α and β 32P-labeled DNA probes were carried out at 65°C in 5x standard saline citrate (SSC)/5x Denhardt's solution/0.1% SDS containing salmon sperm DNA at 0.2 mg/ml. The membrane was washed, 10 min per wash, in 2x SSC/0.1% SDS once at room temperature and twice at 65°C and then in 0.2x SSC/0.1% SDS twice at 65°C. Hybridization with a mouse β-actin probe was carried out in the same solution at 55°C, with identical wash conditions.
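The hybridization and wash solutions above are specified only by their final concentrations. As a rough illustration of how such a recipe might be turned into pipetting volumes, the following Python sketch applies the usual dilution relation C1·V1 = C2·V2. The stock concentrations (20x SSC, 50x Denhardt's, 10% SDS, 10 mg/ml salmon sperm DNA), the 100-ml batch size, and all function and variable names are assumptions chosen for illustration; the paper does not state them.

```python
# Hypothetical stock concentrations; the paper gives only final concentrations.
ASSUMED_STOCKS = {
    "SSC": 20.0,                # x
    "Denhardt's": 50.0,         # x
    "SDS": 10.0,                # percent
    "salmon sperm DNA": 10.0,   # mg/ml
}

# Final concentrations taken from the Northern Analysis paragraph.
HYBRIDIZATION = {"SSC": 5.0, "Denhardt's": 5.0, "SDS": 0.1, "salmon sperm DNA": 0.2}
LOW_STRINGENCY_WASH = {"SSC": 2.0, "SDS": 0.1}
HIGH_STRINGENCY_WASH = {"SSC": 0.2, "SDS": 0.1}


def dilution_recipe(targets, stocks, final_volume_ml):
    """Return the volume (ml) of each stock, plus water, for one batch.

    Uses C1 * V1 = C2 * V2, so V1 = C2 * V2 / C1 for every component.
    """
    volumes = {}
    for name, final_conc in targets.items():
        stock_conc = stocks[name]
        if final_conc > stock_conc:
            raise ValueError(f"{name}: stock is more dilute than the target")
        volumes[name] = final_conc * final_volume_ml / stock_conc
    water = final_volume_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stock volumes exceed the final volume")
    volumes["water"] = water
    return volumes


if __name__ == "__main__":
    for label, targets in [
        ("hybridization", HYBRIDIZATION),
        ("wash 1", LOW_STRINGENCY_WASH),
        ("wash 2", HIGH_STRINGENCY_WASH),
    ]:
        print(label, dilution_recipe(targets, ASSUMED_STOCKS, 100.0))
```

With different stock solutions, only the `ASSUMED_STOCKS` table would need to change.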

RESULTS AND DISCUSSION

The amino acid sequences of the PH-30 α and β precursor regions were deduced from cDNA sequences and are shown in Fig. 1. Following their signal sequences, the α and β precursor regions contain sequences similar to those in the prodomains of disintegrin domain-containing snake venom proteins, and then sequences which align with the snake venom zinc-dependent metalloprotease domain (Figs. 1 and 2). α contains the consensus active-site residues for a metalloprotease (see below); β does not. Following the metalloprotease domain, both proteins contain a disintegrin domain (Figs. 1 and 2). The cleavage site which generates mature α falls within the disintegrin domain (3) (arrows, Figs. 1 and 2). The cleavage site which generates mature β lies at the amino terminus of the disintegrin domain (3) (arrow before position 383, Figs. 1 and 2). The sequence alignment of mature PH-30 α and β with the snake venom proteins continues through the cysteine-rich domain (Figs. 1 and 2). No snake venom proteins include either the epidermal growth factor repeat or the transmembrane and cytoplasmic segments of α and β (Figs. 1 and 2).

Additional mammalian genes encode proteins with domain organizations identical to those of PH-30 α and β (Figs. 1 and 2). EAP I, cloned from rat and monkey, is an androgen-regulated protein located on the apical surface of epididymal epithelial cells (13). Cyritestin is a mouse testis cDNA (GenBank accession no. X64227).
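The domain organization described above can be summarized as a small data structure. The Python sketch below is only an illustration of the prose: the domain labels, the `Subunit` class, and the `split_at_cleavage` helper are hypothetical names, and residue coordinates are deliberately omitted because they are not given in this passage. The signal sequence, which precedes the prodomain, is left out of the list.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Amino- to carboxyl-terminal order after the signal sequence.
DOMAINS = (
    "pro",
    "metalloprotease",
    "disintegrin",
    "cysteine-rich",
    "EGF repeat",
    "transmembrane",
    "cytoplasmic",
)

# Segments stated to be absent from all snake venom relatives.
ABSENT_FROM_SNAKE_VENOM = {"EGF repeat", "transmembrane", "cytoplasmic"}


@dataclass(frozen=True)
class Subunit:
    name: str
    has_active_site_consensus: bool  # metalloprotease consensus residues
    cleavage_domain: str             # domain at which processing occurs
    cleavage_inside_domain: bool     # True: site lies within that domain


ALPHA = Subunit("PH-30 alpha", True, "disintegrin", True)
BETA = Subunit("PH-30 beta", False, "disintegrin", False)


def split_at_cleavage(subunit: Subunit) -> Tuple[List[str], List[str]]:
    """Split the domain list into (precursor region, mature protein)."""
    i = DOMAINS.index(subunit.cleavage_domain)
    precursor = list(DOMAINS[:i])
    mature = list(DOMAINS[i:])
    if subunit.cleavage_inside_domain:
        # The cleaved domain contributes a fragment to both products.
        precursor.append(f"{subunit.cleavage_domain} (N-terminal part)")
        mature[0] = f"{subunit.cleavage_domain} (C-terminal part)"
    return precursor, mature


if __name__ == "__main__":
    for subunit in (ALPHA, BETA):
        precursor, mature = split_at_cleavage(subunit)
        print(subunit.name, "| precursor region:", precursor, "| mature:", mature)
```

Under these assumptions, the α precursor region comes out as pro, metalloprotease, and part of the disintegrin domain, and the β precursor region as pro and metalloprotease only, which matches the summary given in the abstract.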

§Present address: Department of Molecular Biology, DNAX, Palo Alto, CA 94304.
¶Present address: Department of Cellular Biochemistry and Biophysics, Sloan-Kettering Institute, New York, NY 10021.
**To whom reprint requests should be addressed.
††The sequences reported in this paper have been deposited in the GenBank data base (accession nos. Z11719 and Z11720).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

[Figure: sequence alignment including PH-30 α, PH-30 β, cyritestin, EAP I, jararhagin, and trigramin; labeled regions include the metalloprotease domain and the disintegrin domain.]

[...] Swiss-Prot (release 24) data base. Prodomain. PH-30 α and β are 3[?]% identical (excluding gaps) over this region. The first amino acid residue shown for α and β is that of the first in-frame methionine. α contains [...] potential start methionines (underlined), none of which is followed by an obvious signal sequence (10). The methionine at position 7 is encoded by an AUG codon in the most optimal context for translation initiation (11). The putative start methionine of β (underlined), encoded by an AUG codon in a good context to initiate translation (11), is followed by a potential signal sequence (10). Potential sites for signal-sequence cleavage are marked with arrows. Stars mark cysteine residues which may be involved in a matrix metalloprotease-like "cysteine switch" (12) activation of the metalloprotease domain. Metalloprotease domain. PH-30 α and β are 27% identical (excluding gaps) over this region. The consensus snake venom metalloprotease active-site sequence,
