Signaling Hypergraphs: A New Representation for Biological Signaling Pathways

Author: Coleen Jennings
Anna Ritz
CS 6824: Hypergraphs
Virginia Tech

March 3, 2014

Biological Signaling Pathways

Cellular Signaling
  Cellular communication
  “Message passing” mechanism of the cell
  Messages pass between proteins to regulate gene transcription

Wnt Signaling Pathway
  Embryonic development
  Associated with cancer
  Initiated by the Wnt protein

http://en.wikipedia.org/wiki/Central_dogma_of_molecular_biology

Outline

1. The Wnt Signaling Pathway
   Cartoon Representations
   Database Representations

2. Mathematical Representations of Signaling Pathways
   Signaling Pathways as Graphs
   Signaling Pathways as Hypergraphs

3. An Application: The Active Sub-Hypergraph Problem
   Signaling Hypergraphs
   Problem Formulation
   Integer Linear Program
   EGFR Signaling Results

In the Absence of Wnt Signaling

[Figure: Fzd sits at the cell surface. Axin, GSK, and APC form the "Destruction Complex", which binds B-Catenin; B-Catenin is phosphorylated (p) and then degraded.]

In the Presence of Wnt Signaling

[Figure: Wnt is present at the cell surface and at Fzd. Axin, GSK, and APC are shown alongside several copies of B-Catenin, and B-Catenin regulates gene transcription.]

Wnt Signaling

[Figures: published diagrams of the Wnt signaling pathway.]

C. Y. Logan and R. Nusse, The Wnt Signaling Pathway in Development and Disease. Annu. Rev. Cell Dev. Biol. 2004.
http://www.biolegend.com

We have lots of data available about these signaling pathways. Can we develop computational methods to analyze them?


NetPath Signaling Database

[Figure: pathway diagram from NetPath, with the Destruction Complex marked.]
www.netpath.org/netslim

KEGG Signaling Database

[Figure: pathway diagram from KEGG.]
www.kegg.jp

Reactome Database

[Figures: pathway diagrams from Reactome.]
www.reactome.org


Graph Representation of Signaling Pathways

Create a directed graph
  Nodes are proteins
  Edges are interactions
  Edge directionality is determined by the type of protein interaction

[Figure: a signaling pathway drawn as a directed graph, with the Destruction Complex marked; a second view labels receptor (Rec) and TR nodes.]

Graph Representation of the Human Interactome

Directed graph constructed from multiple public databases describing all protein interactions in human cells. The graph may be weighted.
  11,266 nodes
  129,900 edges
    16,792 directed interactions
    56,554 bidirected interactions (the totals are consistent with each bidirected interaction counting as two directed edges: 16,792 + 2 × 56,554 = 129,900)

Each edge in the human interactome is supported by the literature.

Problem Setup (Graphs)

[Figure: a graph with receptor (Rec) nodes and TR nodes highlighted.]

Automatically Constructing Signaling Pathways (Graphs)

Given: Interactome (graph), receptors (nodes), TRs (nodes).
Goal: Identify nodes/edges that are likely to be in the signaling pathway.
What We’re Really Doing: Connecting two sets of nodes in a graph.

Lots of ways to do this:
  Shortest paths
  Minimum spanning tree
  Prize-collecting Steiner tree
  Network flow
  Random walks
  ...

[Figure: a graph with receptor (Rec) nodes and TR nodes highlighted.]
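As one concrete instance of "connecting two sets of nodes", the following minimal Python sketch runs a breadth-first search from all receptors at once and recovers a shortest directed path to each reachable TR. The toy edge list, the node names, and the function name `shortest_paths_to_targets` are illustrative assumptions, not data or code from the talk.

```python
from collections import deque

def shortest_paths_to_targets(edges, sources, targets):
    """Multi-source BFS on an unweighted directed graph.

    Returns {target: path} for every target reachable from some source,
    where path is a list of nodes that starts at a source.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)

    parent = {s: None for s in sources}  # doubles as the visited set
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in parent:
                parent[v] = u
                queue.append(v)

    paths = {}
    for t in targets:
        if t in parent:
            path, node = [], t
            while node is not None:
                path.append(node)
                node = parent[node]
            paths[t] = path[::-1]
    return paths

# Hypothetical toy interactome (not from the talk).
toy_edges = [("Rec1", "A"), ("A", "B"), ("B", "TR1"),
             ("Rec2", "C"), ("C", "TR1"), ("C", "TR2")]
paths = shortest_paths_to_targets(toy_edges, ["Rec1", "Rec2"],
                                  ["TR1", "TR2", "TR3"])
print(paths)
```

In this toy graph TR3 is unreachable, so it is simply absent from the result. The other listed techniques (Steiner trees, network flow, random walks) trade this simplicity for solutions that weigh many paths at once.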


Protein Complexes

[Figure: a protein complex of WNT3A, FZD5, LRP6, and DVL shown in three ways: biological representation, graph, and hypergraph.]

Complex Assembly

[Figure: assembly of a complex involving WNT3A, FZD5, LRP6, DVL, GSK3, Axin1, and APC, shown in three ways: biological representation, graph, and hypergraph.]

Protein Regulation

[Figure: regulation of β-catenin by GSK3, Axin1, and APC, shown in three ways: biological representation, graph, and hypergraph.]
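The difference between the graph and hypergraph views of a complex can be sketched in a few lines of Python. The slides show only pictures, so the pairwise-edge convention for the graph view is an assumption made here for illustration.

```python
from itertools import combinations

# Members of the complex named on the "Protein Complexes" slide.
complex_members = ["WNT3A", "FZD5", "LRP6", "DVL"]

# Graph view (assumed convention): one undirected edge per pair of members.
clique_edges = list(combinations(sorted(complex_members), 2))

# Hypergraph view: the whole complex is a single hypernode, i.e. a set of nodes.
hypernode = frozenset(complex_members)

print(len(clique_edges))  # number of pairwise edges needed for 4 proteins
print(sorted(hypernode))
```

The pairwise edges alone cannot tell one four-protein complex apart from six separate binary interactions, whereas the hypernode records the grouping explicitly.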

Wnt Signaling Pathway as a Hypergraph

[Figure: the Wnt signaling pathway drawn as a hypergraph with hyperedges numbered 1 to 7. Hypernodes are built from WNT3A, FZD5, LRP6, DVL, GSK3, Axin1, APC, β-catenin, PIP, SMAD7, and NICD. The figure also marks links to TGF-β Signaling and Notch Signaling, and the outcomes "Degraded by Proteasome" and "Transcriptional Regulation".]


Signaling Hypergraphs (Mathematical Definition)

A signaling hypergraph H = (V, U, E) is a tuple of three sets:

The node set V is the collection of all elements (proteins, small molecules, biological processes) in the signaling pathway.

A hypernode u ⊆ V is a subset of elements that act together as a unit (e.g., a protein complex). A hypernode may consist of a single element of V. U is the set of hypernodes.

A signaling hyperedge is a tuple (E, F, R+, R−), composed of the tail, the head, and the positive and negative regulators.
  E, F, R+, R− are subsets of U.
  F ≠ ∅
  E ∪ R+ ∪ R− ≠ ∅

The third component of H is the set of signaling hyperedges. Note that the tail E of an individual hyperedge and the hyperedge set E in H = (V, U, E) are different objects that share a letter here.
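The definition maps naturally onto a small data structure. The sketch below is one possible Python encoding, not the author's implementation: the class name `Hyperedge`, its field names, and the helper `hn` are all hypothetical. It enforces the two non-emptiness conditions from the definition.

```python
from dataclasses import dataclass
from typing import FrozenSet

Hypernode = FrozenSet[str]  # a hypernode is a subset of the node set V

@dataclass(frozen=True)
class Hyperedge:
    tail: FrozenSet[Hypernode]                    # E in the definition
    head: FrozenSet[Hypernode]                    # F
    pos_regs: FrozenSet[Hypernode] = frozenset()  # R+
    neg_regs: FrozenSet[Hypernode] = frozenset()  # R-

    def __post_init__(self):
        # F must be non-empty.
        if not self.head:
            raise ValueError("head F must be non-empty")
        # The union of E, R+ and R- must be non-empty.
        if not (self.tail | self.pos_regs | self.neg_regs):
            raise ValueError("tail and regulators must not all be empty")

def hn(*nodes: str) -> Hypernode:
    """Hypothetical helper: build a hypernode from node names."""
    return frozenset(nodes)

# STAT3 activation: promoted by EGFR, inhibited by GRB2.
edge = Hyperedge(
    tail=frozenset({hn("STAT3[cy]")}),
    head=frozenset({hn("STAT3[cy]+")}),
    pos_regs=frozenset({hn("EGFR")}),
    neg_regs=frozenset({hn("GRB2")}),
)
print(edge)
```

Using frozen sets keeps hypernodes and hyperedges hashable, so they can themselves be stored in sets such as U and the hyperedge set.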

An Example: STAT Activation and Heterodimer Formation

An EGFR complex activates STAT1 and STAT3 in the cytoplasm
GRB2 can inhibit the activation of STAT3
Activated STAT1 and STAT3 form heterodimer STAT1:STAT3
STAT1:STAT3 translocates to the nucleus
One of the downstream effects of the heterodimer formation is cell migration.

In the labels below, [cy] and [n] mark the cytoplasm and the nucleus, and + marks the activated form.

[Figure: signaling hypergraph of the five reactions, with hypernodes STAT1[cy], STAT3[cy], EGFR, GRB2, STAT1[cy]+, STAT3[cy]+, {STAT1[cy]+, STAT3[cy]+}, {STAT1[n]+, STAT3[n]+}, and Cell Migration.]

E                          F                          R+     R−
STAT1[cy]                  STAT1[cy]+                 EGFR
STAT3[cy]                  STAT3[cy]+                 EGFR   GRB2
STAT1[cy]+, STAT3[cy]+     {STAT1[cy]+, STAT3[cy]+}
{STAT1[cy]+, STAT3[cy]+}   {STAT1[n]+, STAT3[n]+}
{STAT1[n]+, STAT3[n]+}     Cell Migration
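To make the roles of tail, head, and regulators concrete, the five hyperedges of this example can be written down and "fired" forward from an initial set of active hypernodes. This is only a simulation sketch under an assumed firing rule (a hyperedge fires when its tail and positive regulators are active and no negative regulator is active). It is not the ASH optimization discussed later, and the names `hn`, `HYPEREDGES`, and `propagate` are hypothetical.

```python
def hn(*nodes):
    """Build a hypernode (a frozen set of node names)."""
    return frozenset(nodes)

# Each hyperedge: (tail E, head F, positive regulators R+, negative regulators R-)
HYPEREDGES = [
    ({hn("STAT1[cy]")}, {hn("STAT1[cy]+")}, {hn("EGFR")}, set()),
    ({hn("STAT3[cy]")}, {hn("STAT3[cy]+")}, {hn("EGFR")}, {hn("GRB2")}),
    ({hn("STAT1[cy]+"), hn("STAT3[cy]+")},
     {hn("STAT1[cy]+", "STAT3[cy]+")}, set(), set()),
    ({hn("STAT1[cy]+", "STAT3[cy]+")},
     {hn("STAT1[n]+", "STAT3[n]+")}, set(), set()),
    ({hn("STAT1[n]+", "STAT3[n]+")}, {hn("Cell Migration")}, set(), set()),
]

def propagate(initially_active):
    """Repeatedly fire hyperedges until no new hypernode becomes active."""
    active = set(initially_active)
    changed = True
    while changed:
        changed = False
        for tail, head, pos, neg in HYPEREDGES:
            fires = tail <= active and pos <= active and not (neg & active)
            if fires and not head <= active:
                active |= head
                changed = True
    return active

start = {hn("STAT1[cy]"), hn("STAT3[cy]"), hn("EGFR")}
without_grb2 = propagate(start)
with_grb2 = propagate(start | {hn("GRB2")})

print(hn("Cell Migration") in without_grb2)
print(hn("Cell Migration") in with_grb2)
```

With GRB2 active, STAT3 is never activated, so the heterodimer cannot form and the signal does not reach cell migration.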


The Active Sub-Hypergraph (ASH) Problem

A “twist” on the pathway reconstruction problem: signaling pathway databases often include biological processes such as “cell proliferation” and “cell migration.”

If we know that cells are proliferating, can we compute which proteins, complexes, and reactions in the signaling pathway must be active?
Can we determine which other downstream effects may be caused as a side effect of these activities?
In general, can we compute succinct solutions to both questions after setting a set of downstream processes to be active?

Informally, what parts of the signaling pathway must be active in order for a given set of downstream responses to be stimulated?

A set S of source hypernodes and a set T of sink hypernodes define where the signal starts and ends.

Alpha (α) Variables

α variables denote whether each component of the signaling hypergraph is present (active) or absent. For H = (V, U, E),
  Node activity: α_v ∈ {0, 1} for all v ∈ V
  Hypernode activity: α_u ∈ {0, 1} for all u ∈ U
  Hyperedge activity: α_e ∈ {0, 1} for all e ∈ E

[Figure: the hypergraph and hyperedge table (E, F, R+, R−) of the STAT activation example, repeated.]

Problem Formulation

Given:
  Signaling hypergraph H = (V, U, E),
  Set S of source hypernodes,
  Set T of sink hypernodes,
  Set I of elements for which the α values are fixed to 0 or 1.

Goal: Find an assignment of the α values with the fewest active hyperedges, subject to (1) activity constraints, (2) connectivity constraints, and (3) input constraints.


Integer Linear Programs

ILPs are mathematical optimization programs; for a vector x of n variables, an ILP can be written as

  maximize     c^T x             (linear objective function)
  subject to   Ax ≤ b            (linear constraints)
  with bounds  x ≥ 0, x ∈ Z^n    (integer variables)

where c and b are vectors and A is a matrix.

Solving ILPs is NP-hard (0-1 Integer Programming is one of Karp’s 21 NP-complete problems).
We use solvers (CPLEX, lpsolve) to find optimal solutions to ILPs.
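To see what "maximize c^T x subject to Ax ≤ b" asks for, the sketch below solves a tiny 0-1 ILP by trying every binary vector. This brute-force enumeration is only for illustration and takes time exponential in the number of variables; solvers such as CPLEX use far more sophisticated methods. The instance and the function name `solve_binary_ilp` are hypothetical.

```python
from itertools import product

def solve_binary_ilp(c, A, b):
    """Maximize c.x subject to A x <= b with x in {0,1}^n, by enumeration."""
    best_x, best_val = None, None
    for x in product((0, 1), repeat=len(c)):
        feasible = all(
            sum(a_ij * x_j for a_ij, x_j in zip(row, x)) <= b_i
            for row, b_i in zip(A, b)
        )
        if not feasible:
            continue
        val = sum(c_j * x_j for c_j, x_j in zip(c, x))
        if best_val is None or val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Hypothetical instance:
#   maximize   5*x1 + 4*x2 + 3*x3
#   subject to 2*x1 + 3*x2 + 1*x3 <= 5
#              4*x1 + 1*x2 + 2*x3 <= 6
c = [5, 4, 3]
A = [[2, 3, 1], [4, 1, 2]]
b = [5, 6]
best_x, best_val = solve_binary_ilp(c, A, b)
print(best_x, best_val)
```

Setting all three variables to 1 violates the first constraint, so the best feasible choice drops x3.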

The ASH Problem as an ILP

Goal: Find an assignment of the α values that contains the fewest active hyperedges, subject to (1) activity constraints, (2) connectivity constraints, and (3) input constraints.

The ILP Objective Function (the fewest active hyperedges):

  min Σ_{e ∈ E} α_e

The ILP Constraints: (1) activity constraints, (2) connectivity constraints, and (3) input constraints, each described next.

Activity Constraints

[Figure: signaling hypergraph representation of the STAT activation example.]

A hypernode u ∈ U is active only if all of its nodes are active. Suppose u contains v1, v2, v3. Then

  α_{v1} + α_{v2} + α_{v3} ≥ 3 α_u

We use similar reasoning to linearize the next activity constraint: a hyperedge e ∈ E is active if all of the hypernodes in the tail E, the head F, and the positive regulators R+ are active and if none of the hypernodes in the negative regulators R− are active.
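The linear inequality for the three-node hypernode can be checked exhaustively against the logical statement it encodes. The short Python sketch below enumerates all 0/1 assignments; the function names are illustrative.

```python
from itertools import product

def satisfies_linear(alpha_u, alpha_vs):
    # alpha_v1 + alpha_v2 + alpha_v3 >= 3 * alpha_u
    return sum(alpha_vs) >= len(alpha_vs) * alpha_u

def satisfies_logical(alpha_u, alpha_vs):
    # "u is active only if all of its nodes are active"
    return alpha_u == 0 or all(a == 1 for a in alpha_vs)

mismatches = [
    (alpha_u, alpha_vs)
    for alpha_u in (0, 1)
    for alpha_vs in product((0, 1), repeat=3)
    if satisfies_linear(alpha_u, alpha_vs) != satisfies_logical(alpha_u, alpha_vs)
]
print(mismatches)
```

When α_u = 0 the inequality holds trivially, and when α_u = 1 it forces every node variable to 1, so the two formulations agree on all 16 assignments. A hyperedge constraint could plausibly be handled the same way by counting (1 − α) for each negative regulator, though that exact form is an assumption rather than something stated in the talk.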

Connectivity & Input Constraints

[Figure: signaling hypergraph representation of the STAT activation example.]

Connectivity Constraints: Active hypernodes (that are not in S or T) must have at least one incoming hyperedge and at least one outgoing hyperedge.

Input Constraints: Let I be the set of elements in H for which the activity is predetermined. For each i ∈ I, set the α_i variable to the predetermined activity value.


Epidermal Growth Factor Receptor (EGFR) Signaling

[Figure: the EGFR signaling network.]

Citri et al. Nature Reviews Molecular Cell Biology 7, 505–516 (July 2006) | doi:10.1038/nrm1962

EGFR Signaling Hypergraph H = (V, U, E)

National Cancer Institute’s Pathway Interaction Database (NCI-PID)

Node Set: 471 nodes in V.

Hypernode Set: 522 hypernodes in U.
  171 (33%) contain multiple nodes from V (complexes).

Hyperedge Set: 376 hyperedges in E.
  216 (57%) involve a positive or negative regulator.
  104 (28%) are unregulated hyperedges.
  56 (15%) directly regulate a biological process.

S is the set of all hypernodes with no incoming hyperedges. T is the set of all hypernodes with no outgoing hyperedges.
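The rule for choosing S and T is easy to compute once the hyperedges are listed. The sketch below applies it to the small STAT example rather than to the NCI-PID data, which is not reproduced here. It assumes that a hyperedge is incoming to the hypernodes in its head and outgoing from the hypernodes in its tail and regulator sets; the slides do not spell out how regulators are treated. Hypernodes are abbreviated as plain strings, and labels such as "STAT1:STAT3[cy]+" are shorthand introduced for this illustration.

```python
def sources_and_sinks(hyperedges):
    """Return (S, T) for hyperedges given as (tail, head, pos_regs, neg_regs).

    S: hypernodes with no incoming hyperedge.
    T: hypernodes with no outgoing hyperedge.
    """
    all_nodes, has_incoming, has_outgoing = set(), set(), set()
    for tail, head, pos, neg in hyperedges:
        all_nodes |= tail | head | pos | neg
        has_incoming |= head
        # Assumption: regulators count as having an outgoing hyperedge.
        has_outgoing |= tail | pos | neg
    return all_nodes - has_incoming, all_nodes - has_outgoing

stat_example = [
    ({"STAT1[cy]"}, {"STAT1[cy]+"}, {"EGFR"}, set()),
    ({"STAT3[cy]"}, {"STAT3[cy]+"}, {"EGFR"}, {"GRB2"}),
    ({"STAT1[cy]+", "STAT3[cy]+"}, {"STAT1:STAT3[cy]+"}, set(), set()),
    ({"STAT1:STAT3[cy]+"}, {"STAT1:STAT3[n]+"}, set(), set()),
    ({"STAT1:STAT3[n]+"}, {"Cell Migration"}, set(), set()),
]
S, T = sources_and_sinks(stat_example)
print(sorted(S))
print(sorted(T))
```

Under this convention the unactivated STAT proteins and the two regulators are sources, and the biological process is the only sink.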

Solving the ASH Problem for Cell Proliferation

Let I = {“cell proliferation”} be a single biological process with α value 1. Given H, S, T, and I, solve the ASH Problem.

[Figure: (A) Optimal and (B) Sub-Optimal Solutions. The two sub-hypergraphs involve EGF, BTC, HBEGF, EGFR, ErbB2, ErbB4, and the HSP90 dimer, and both end at cell proliferation. Blue hypernodes are in S and purple hypernodes are in T.]

Pathway-Wide Analysis of Biological Processes

For each of the 24 biological processes:
  Let I be the single biological process with α value 1.
  Given H, S, T, and I, solve the ASH Problem.

For biological processes A and B, compute the asymmetric Jaccard Index of the hyperedges from A’s and B’s optimal solutions.

  Jaccard Index (JI) = |A’s Hyperedges ∩ B’s Hyperedges| / |A’s Hyperedges|

Are biological processes with high JI values related?
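The asymmetric index is a one-line set computation. The following sketch uses made-up hyperedge identifiers purely for illustration; the function name `asymmetric_jaccard` is hypothetical.

```python
def asymmetric_jaccard(a_edges, b_edges):
    """|A ∩ B| / |A| for the hyperedge sets of two optimal solutions."""
    a_edges, b_edges = set(a_edges), set(b_edges)
    if not a_edges:
        raise ValueError("A's solution contains no hyperedges")
    return len(a_edges & b_edges) / len(a_edges)

# Hypothetical solutions: A uses 3 hyperedges, B uses 6, and they share 2.
a = {"e1", "e2", "e3"}
b = {"e2", "e3", "e4", "e5", "e6", "e7"}
ji_ab = asymmetric_jaccard(a, b)
ji_ba = asymmetric_jaccard(b, a)
print(ji_ab, ji_ba)
```

Because the denominator is the size of A's solution only, JI(A, B) and JI(B, A) generally differ: a small solution contained in a large one scores high in one direction and low in the other.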

Pathway-Wide Analysis of Biological Processes

[Figure: heatmap of pairwise JI values, on a color scale from 0.0 to 1.0. Both axes list the same biological processes, each labeled with a number in parentheses: heart-development(8), axon-guidance(9), neural-crest-cell-migrati(8), mammary-gland-morphogenes(9), heart-morphogenesis(9), nervous-system-developmen(9), clathrin-coat-assembly(31), cortical-actin-cytoskelet(30), ruffle-organization(31), cytoskeleton-organization(29), myelination(8), dendrite-morphogenesis(5), chemotaxis(5), neuron-projection-morphog(3), translational-initiation(8), tight-junction-assembly(4), cell-migration(3), lamellipodium-assembly(3).]

Lung Cancer Cell Lines

The Short Story:
  Over 60% of non-small-cell lung carcinomas (NSCLC) express EGFR
  EGFR inhibitors have been developed (Gefitinib)
  A small portion of EGFR-expressing NSCLC cases respond to the EGFR inhibitors

Phosphoproteomics data from four NSCLC cell lines
  Proteins present in Gefitinib-sensitive and Gefitinib-resistant cell lines: EGFR, FAK, GAB1, PKCD, SHC
  Proteins present in Gefitinib-resistant cell lines only: FYN, SRC
  Proteins present in Gefitinib-sensitive cell lines only: CBL, ErbB2, ErbB3, TLN1, SHP2

Guo et al., Signaling Networks Assembled by Oncogenic EGFR and c-Met. PNAS 2007.

Pathway Analysis of Gefitinib-Resistant and Gefitinib-Sensitive Cell Lines

[Figure: sub-hypergraphs for the Gefitinib-Sensitive and Gefitinib-Resistant cell lines. Labeled elements include EGFR, EGF, CBL[cy], CBL[pm], CBL+[pm], CIN85, PIP5K1C, LRIG1, SH3GL2, TLN1, EPS15[cy], EPS15[pm], SYNJ1, EPN1[pm], AMPH, DMN1, GTP, and an EGFR/EGF complex at the [endosome], with cell migration as a downstream process.]