Signaling Hypergraphs: A New Representation for Biological Signaling Pathways
Anna Ritz
CS 6824: Hypergraphs
Virginia Tech
March 3, 2014

Biological Signaling Pathways
Cellular Signaling
  Cellular communication
  “Message passing” mechanism of the cell
  Messages pass between proteins to regulate gene transcription
Wnt Signaling Pathway
  Embryonic development
  Associated with cancer
  Initiated by the Wnt protein
[Figure source: http://en.wikipedia.org/wiki/Central_dogma_of_molecular_biology]

Outline
1 The Wnt Signaling Pathway
    Cartoon Representations
    Database Representations
2 Mathematical Representations of Signaling Pathways
    Signaling Pathways as Graphs
    Signaling Pathways as Hypergraphs
3 An Application: The Active Sub-Hypergraph Problem
    Signaling Hypergraphs
    Problem Formulation
    Integer Linear Program
    EGFR Signaling Results

In the Absence of Wnt Signaling
[Figure: Fzd at the cell surface with no Wnt bound. Axin, GSK, and APC form the "Destruction Complex" with B-Catenin; B-Catenin is phosphorylated (p) and then degraded.]

In the Presence of Wnt Signaling
[Figure: Wnt binds Fzd at the cell surface. Axin and GSK are drawn next to Fzd, apart from APC; B-Catenin accumulates and regulates gene transcription.]

Wnt Signaling
[Figures: published diagrams of the Wnt signaling pathway.]
Sources: C. Y. Logan and R. Nusse, The Wnt Signaling Pathway in Development and Disease. Annu. Rev. Cell Dev. Biol. 2004; http://www.biolegend.com

We have lots of data available about these signaling pathways. Can we develop computational methods to analyze them?

NetPath Signaling Database
[Figure: NetPath pathway diagram with the Destruction Complex marked. Source: www.netpath.org/netslim]

KEGG Signaling Database
[Figure: KEGG pathway diagram. Source: www.kegg.jp]

Reactome Database
[Figures: Reactome pathway diagrams. Source: www.reactome.org]

Graph Representation of Signaling Pathways
Create a directed graph
  Nodes are proteins
  Edges are interactions
  Edge directionality determined by the type of protein interaction
[Figure: pathway drawn as a directed graph, with receptor (Rec) nodes, the Destruction Complex, and TR nodes marked.]

Graph Representation of the Human Interactome
Directed graph constructed from multiple public databases describing all protein interactions in human cells. Graph may be weighted.
  11,266 nodes
  129,900 edges
    16,792 directed interactions
    56,554 bidirected interactions (each counted as two edges: 16,792 + 2 × 56,554 = 129,900)
Each edge in the human interactome is supported by the literature.

Problem Setup (Graphs)
[Figure: the interactome graph with receptor (Rec) nodes and TR nodes marked.]

Automatically Constructing Signaling Pathways (Graphs)
Given: Interactome (graph), receptors (nodes), TRs (nodes).
Goal: Identify nodes/edges that are likely to be in the signaling pathway.
What We’re Really Doing: Connecting two sets of nodes in a graph.
Lots of ways to do this:
  Shortest paths
  Minimum spanning tree
  Prize-collecting Steiner tree
  Network flow
  Random walks
  ...

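As a minimal sketch of the first option in the list (shortest paths), the snippet below runs a breadth-first search from all receptors at once and stops at the first TR it reaches. The function name, the adjacency-dictionary format, and the toy edges are assumptions made for illustration; they are not taken from the slides or from any interactome database.

```python
from collections import deque


def shortest_path_to_targets(graph, sources, targets):
    """Return one shortest directed path from any source to any target, or None.

    graph: dict mapping a node to an iterable of its successor nodes.
    """
    targets = set(targets)
    parent = {s: None for s in sources}  # doubles as the visited set
    queue = deque(parent)
    while queue:
        node = queue.popleft()
        if node in targets:
            # Walk the parent pointers back to a source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for succ in graph.get(node, ()):
            if succ not in parent:
                parent[succ] = node
                queue.append(succ)
    return None


# Toy interactome (hypothetical edges, for illustration only).
toy_graph = {
    "Rec1": ["A", "B"],
    "Rec2": ["B"],
    "A": ["C"],
    "B": ["TR1"],
    "C": ["TR2"],
}
path = shortest_path_to_targets(toy_graph, ["Rec1", "Rec2"], ["TR1", "TR2"])
print(path)
```

Breadth-first search treats every edge as having unit length; a weighted interactome would call for Dijkstra's algorithm instead.
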
Protein Complexes
[Figure: a complex of WNT3A, FZD5, LRP6, and DVL shown in three forms: Biological Representation, Graph, and Hypergraph.]

Complex Assembly
[Figure: assembly of a complex from WNT3A, FZD5, LRP6, DVL, GSK3, Axin1, and APC, shown in three forms: Biological Representation, Graph, and Hypergraph.]

Protein Regulation
[Figure: regulation involving GSK3, Axin1, APC, and β-catenin, shown in three forms: Biological Representation, Graph, and Hypergraph.]

Wnt Signaling Pathway as a Hypergraph
[Figure: the Wnt signaling pathway drawn as a hypergraph with hyperedges numbered 1 to 7. Elements: WNT3A, FZD5, LRP6, DVL, PIP, GSK3, Axin1, APC, β-catenin, SMAD7, NICD. Annotations: TGF-β Signaling, Notch Signaling, Degraded by Proteasome, Transcriptional Regulation.]

Signaling Hypergraphs (Mathematical Definition)
A signaling hypergraph H = (V, U, E) is a tuple of three sets:
  The node set V is the collection of all elements (proteins, small molecules, biological processes) in the signaling pathway.
  U is the set of hypernodes. A hypernode u ⊆ V is a subset of elements that act together as a unit (e.g., a protein complex). A hypernode may consist of a single element.
  E is the set of signaling hyperedges. A signaling hyperedge is a tuple (E, F, R+, R−), composed of the tail, the head, and the positive and negative regulators.
    E, F, R+, R− are subsets of U.
    F ≠ ∅
    E ∪ R+ ∪ R− ≠ ∅
Note: inside a hyperedge tuple, E denotes the tail; elsewhere, E denotes the set of all hyperedges.

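The definition can be written down as a small data structure. The sketch below is one possible encoding, not the author's implementation: the class names, field names, and the `hn` helper are hypothetical, hypernodes are stored as frozensets of node names, and `validate` checks only the conditions stated in the definition (hypernodes are subsets of V, hyperedge parts are subsets of U, F is non-empty, and E ∪ R+ ∪ R− is non-empty).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hyperedge:
    """A signaling hyperedge (E, F, R+, R-); each part is a set of hypernodes."""
    tail: frozenset                    # E
    head: frozenset                    # F
    pos_regs: frozenset = frozenset()  # R+
    neg_regs: frozenset = frozenset()  # R-


@dataclass
class SignalingHypergraph:
    nodes: set                                      # V
    hypernodes: set                                 # U (frozensets of nodes)
    hyperedges: list = field(default_factory=list)  # the hyperedge set

    def validate(self):
        """Raise ValueError if a condition from the definition is violated."""
        for u in self.hypernodes:
            if not u <= self.nodes:
                raise ValueError(f"hypernode {set(u)} is not a subset of V")
        for e in self.hyperedges:
            for part in (e.tail, e.head, e.pos_regs, e.neg_regs):
                if not part <= self.hypernodes:
                    raise ValueError("hyperedge refers to a hypernode outside U")
            if not e.head:
                raise ValueError("head F must be non-empty")
            if not (e.tail | e.pos_regs | e.neg_regs):
                raise ValueError("E, R+, and R- cannot all be empty")
        return True


def hn(*names):
    """Hypothetical helper: build a hypernode from node names."""
    return frozenset(names)


# One hyperedge in the style of the STAT example: EGFR positively
# regulates the activation of cytoplasmic STAT1.
stat1, stat1_active, egfr = hn("STAT1[cy]"), hn("STAT1[cy]+"), hn("EGFR")
H = SignalingHypergraph(
    nodes={"STAT1[cy]", "STAT1[cy]+", "EGFR"},
    hypernodes={stat1, stat1_active, egfr},
    hyperedges=[
        Hyperedge(
            tail=frozenset({stat1}),
            head=frozenset({stat1_active}),
            pos_regs=frozenset({egfr}),
        )
    ],
)
is_valid = H.validate()
```

Frozensets are used so that hypernodes can themselves be members of sets, which mirrors the definition's "sets of subsets of V".
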
An Example: STAT Activation and Heterodimer Formation
  An EGFR complex activates STAT1 and STAT3 in the cytoplasm
  GRB2 can inhibit the activation of STAT3
  Activated STAT1 and STAT3 form heterodimer STAT1:STAT3
  STAT1:STAT3 translocates to the nucleus
  One of the downstream effects of the heterodimer formation is cell migration.
Hyperedges of the example ([cy]: cytoplasm, [n]: nucleus, +: activated):
  E (tail)                  | F (head)                 | R+   | R−
  STAT1[cy]                 | STAT1[cy]+               | EGFR | ∅
  STAT3[cy]                 | STAT3[cy]+               | EGFR | GRB2
  STAT1[cy]+, STAT3[cy]+    | {STAT1[cy]+, STAT3[cy]+} | ∅    | ∅
  {STAT1[cy]+, STAT3[cy]+}  | {STAT1[n]+, STAT3[n]+}   | ∅    | ∅
  {STAT1[n]+, STAT3[n]+}    | Cell Migration           | ∅    | ∅
[Figure: the corresponding hypergraph drawing.]

The Active Sub-Hypergraph (ASH) Problem
A “twist” on the pathway reconstruction problem: signaling pathway databases often include biological processes such as “cell proliferation” and “cell migration.”
  If we know that cells are proliferating, can we compute which proteins, complexes, and reactions in the signaling pathway must be active?
  Can we determine which other downstream effects may be caused as a side effect of these activities?
  In general, can we compute succinct solutions to both questions after setting a set of downstream processes to be active?
Informally, what parts of the signaling pathway must be active in order for a given set of downstream responses to be stimulated?
A set S of source hypernodes and a set T of sink hypernodes define where the signal starts and ends.

Alpha (α) Variables
α variables denote whether each component of the signaling hypergraph is present (active) or absent. For H = (V, U, E),
  Node activity α_v ∈ {0, 1} for all v ∈ V
  Hypernode activity α_u ∈ {0, 1} for all u ∈ U
  Hyperedge activity α_e ∈ {0, 1} for all e ∈ E
[Figure: the hypergraph and (E, F, R+, R−) hyperedge table of the STAT activation example.]

Problem Formulation
Given:
  Signaling hypergraph H = (V, U, E),
  Set S of source hypernodes,
  Set T of sink hypernodes,
  Set I of elements for which the α values are fixed to 0 or 1.
Goal: Find an assignment of the α values that contains the fewest active hyperedges subject to (1) activity constraints, (2) connectivity constraints, and (3) input constraints.

Integer Linear Programs
ILPs are mathematical optimization programs; for a vector x of variables, an ILP can be written as
  maximize     c^T x            (linear objective function)
  subject to   Ax ≤ b           (linear constraints)
  with bounds  x ≥ 0, x ∈ Z^n   (integer variables)
where c and b are vectors and A is a matrix.
Solving ILPs is NP-hard (0-1 Integer Programming is one of Karp’s 21 NP-complete problems). We use solvers (CPLEX, lpsolve) to find optimal solutions to ILPs.

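To make the general form concrete, here is a small sketch that solves a 0-1 ILP by exhaustive enumeration. It is a stand-in for illustration only: real solvers such as those named above do not work this way, the search is exponential in the number of variables, and the function name and toy instance are assumptions.

```python
from itertools import product


def solve_binary_ilp(c, A, b):
    """Maximize c^T x subject to A x <= b with every x_i in {0, 1}.

    Exhaustive search over all 2^n assignments; only usable for tiny n.
    Returns (best_x, best_value), or (None, None) if no x is feasible.
    """
    best_x, best_val = None, None
    for x in product((0, 1), repeat=len(c)):
        feasible = all(
            sum(a_ij * x_j for a_ij, x_j in zip(row, x)) <= b_i
            for row, b_i in zip(A, b)
        )
        if not feasible:
            continue
        val = sum(c_j * x_j for c_j, x_j in zip(c, x))
        if best_val is None or val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


# Toy instance: maximize 3*x1 + 2*x2 + 4*x3 subject to
#   2*x1 + 1*x2 + 3*x3 <= 4
#   1*x1 + 1*x2        <= 1
c = [3, 2, 4]
A = [[2, 1, 3], [1, 1, 0]]
b = [4, 1]
x_opt, value = solve_binary_ilp(c, A, b)
print(x_opt, value)
```

A minimization problem such as ASH fits the same form by negating the objective vector c.
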
The ASH Problem as an ILP
The ILP Objective Function: find an assignment of the α values that contains the fewest active hyperedges:
  min Σ_{e ∈ E} α_e
The ILP Constraints: the assignment is subject to (1) activity constraints, (2) connectivity constraints, and (3) input constraints. (Next slides)

Activity Constraints
[Figure: signaling hypergraph representation of the STAT activation example.]
A hypernode u ∈ U is active only if all of its nodes are active. Suppose u contains v1, v2, v3:
  α_v1 + α_v2 + α_v3 ≥ 3 α_u
We use similar reasoning to linearize the next activity constraint: A hyperedge e ∈ E is active if all of the hypernodes in the tail E, the head F, and the positive regulators R+ are active and if none of the hypernodes in the negative regulators R− are active.

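The three-node inequality generalizes to a hypernode of any size. The hyperedge constraint below is only a sketch of what "similar reasoning" could look like: it is not taken from the slides, it models the condition as necessary (an active hyperedge forces its tail, head, and positive regulators to be active and its negative regulators to be inactive), and the subscripted names E_e, F_e, R^+_e, R^-_e for the parts of hyperedge e are introduced here.

```latex
% Hypernode activity: u can be active only if every node in u is active.
\sum_{v \in u} \alpha_v \;\ge\; |u| \, \alpha_u
\qquad \forall u \in U

% Hyperedge activity (assumed analogue) for e = (E_e, F_e, R^+_e, R^-_e):
% each negative regulator contributes (1 - \alpha_u), which is 1 when inactive.
\sum_{u \in E_e \cup F_e \cup R^+_e} \alpha_u
\;+\; \sum_{u \in R^-_e} (1 - \alpha_u)
\;\ge\; \bigl( |E_e \cup F_e \cup R^+_e| + |R^-_e| \bigr) \, \alpha_e
\qquad \forall e \in E
```

When α_e = 0 both inequalities hold trivially; when α_e = 1 the left side can reach the right side only if every term equals 1.
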
Connectivity & Input Constraints
[Figure: signaling hypergraph representation of the STAT activation example.]
Connectivity Constraints: Active hypernodes (that are not in S or T) must have at least one incoming hyperedge and at least one outgoing hyperedge.
Input Constraints: Let I be the set of elements in H for which the activity is predetermined. For each i ∈ I, set the α_i variable to the predetermined activity value.

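One way these two constraint families could be written as linear (in)equalities is sketched below. This is an assumed formulation, not the one from the slides: in(u) and out(u) are names introduced here, and whether regulator membership should also count as "outgoing" is left open.

```latex
% in(u):  hyperedges whose head F contains u   (assumed reading of "incoming")
% out(u): hyperedges whose tail E contains u   (assumed reading of "outgoing")
\alpha_u \;\le\; \sum_{e \in \mathrm{in}(u)} \alpha_e
\quad\text{and}\quad
\alpha_u \;\le\; \sum_{e \in \mathrm{out}(u)} \alpha_e
\qquad \forall u \in U \setminus (S \cup T)

% Input constraints: a_i \in \{0, 1\} is the predetermined value
% (the symbol a_i is introduced here for illustration).
\alpha_i = a_i \qquad \forall i \in I
```

If α_u = 1, each sum must contain at least one active hyperedge; if α_u = 0, the inequalities impose nothing.
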
Epidermal Growth Factor Receptor (EGFR) Signaling
[Figure source: Citri et al. Nature Reviews Molecular Cell Biology 7, 505–516 (July 2006) | doi:10.1038/nrm1962]

EGFR Signaling Hypergraph H = (V, U, E)
Source: National Cancer Institute’s Pathway Interaction Database (NCI-PID)
Node Set: 471 nodes in V.
Hypernode Set: 522 hypernodes in U.
  171 (33%) contain multiple nodes from V (complexes).
Hyperedge Set: 376 hyperedges in E.
  216 (57%) involve a positive or negative regulator.
  104 (28%) are unregulated hyperedges.
  56 (15%) directly regulate a biological process.
S is the set of all hypernodes with no incoming hyperedges. T is the set of all hypernodes with no outgoing hyperedges.

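The choice of S and T can be computed directly from the hyperedge list. The sketch below is illustrative only: hyperedges are plain 4-tuples of sets, hypernodes are represented by name strings, the dimer names are shortened labels, and treating regulators as having an "outgoing" hyperedge is an assumption (without it, a pure regulator such as EGFR would land in both S and T).

```python
def sources_and_sinks(hypernodes, hyperedges):
    """Return (S, T): hypernodes with no incoming / no outgoing hyperedges.

    hyperedges: iterable of (tail, head, pos_regs, neg_regs) tuples,
    where each part is a set of hypernode names.
    """
    has_incoming, has_outgoing = set(), set()
    for tail, head, pos_regs, neg_regs in hyperedges:
        has_incoming |= set(head)
        # Assumption: regulators count as outgoing participants, like tails.
        has_outgoing |= set(tail) | set(pos_regs) | set(neg_regs)
    S = set(hypernodes) - has_incoming
    T = set(hypernodes) - has_outgoing
    return S, T


# The STAT activation example, with shortened names for the two dimers.
edges = [
    ({"STAT1[cy]"}, {"STAT1[cy]+"}, {"EGFR"}, set()),
    ({"STAT3[cy]"}, {"STAT3[cy]+"}, {"EGFR"}, {"GRB2"}),
    ({"STAT1[cy]+", "STAT3[cy]+"}, {"dimer[cy]"}, set(), set()),
    ({"dimer[cy]"}, {"dimer[n]"}, set(), set()),
    ({"dimer[n]"}, {"Cell Migration"}, set(), set()),
]
all_hypernodes = set()
for parts in edges:
    for part in parts:
        all_hypernodes |= part

S, T = sources_and_sinks(all_hypernodes, edges)
print(sorted(S), sorted(T))
```
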
Solving the ASH Problem for Cell Proliferation
Let I = {“cell proliferation”} be a single biological process with α value 1. Given H, S, T, and I, solve the ASH Problem.
[Figure: (A) Optimal and (B) Sub-Optimal Solutions. Labeled hypernodes include EGF, BTC, HBEGF, EGFR, ErbB2, ErbB4, HSP90 dimer, and cell proliferation. Blue hypernodes are in S and purple hypernodes are in T.]

Pathway-Wide Analysis of Biological Processes
For each of the 24 biological processes:
  Let I be the single biological process with α value 1.
  Given H, S, T, and I, solve the ASH Problem.
For biological processes A and B, compute the asymmetric Jaccard Index of the hyperedges from A’s and B’s optimal solutions.
Jaccard Index (JI): |A’s Hyperedges ∩ B’s Hyperedges| / |A’s Hyperedges|
Are biological processes with high JI values related?

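The asymmetric index is a one-line set computation. In the sketch below the function name and the hyperedge identifiers are made up for illustration; the formula itself is the one stated above.

```python
def asymmetric_jaccard(a_edges, b_edges):
    """|A ∩ B| / |A| for two collections of hyperedge identifiers."""
    a_edges, b_edges = set(a_edges), set(b_edges)
    if not a_edges:
        raise ValueError("A's solution contains no hyperedges")
    return len(a_edges & b_edges) / len(a_edges)


# Hypothetical optimal solutions for two biological processes.
sol_a = {"e1", "e2", "e3"}
sol_b = {"e2", "e3", "e4", "e5", "e6"}

ji_ab = asymmetric_jaccard(sol_a, sol_b)  # 2 shared out of A's 3
ji_ba = asymmetric_jaccard(sol_b, sol_a)  # 2 shared out of B's 5
print(ji_ab, ji_ba)
```

Because the denominator is |A| rather than |A ∪ B|, the value depends on the order of the arguments: a small solution contained in a large one scores 1.0 in one direction and much less in the other.
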
Pathway-Wide Analysis of Biological Processes
[Figure: heatmap of pairwise Jaccard Index values between biological processes, color scale from 0.0 to 1.0. Axis labels as printed (some truncated), each with its parenthesized value: heart-development(8), axon-guidance(9), neural-crest-cell-migrati(8), mammary-gland-morphogenes(9), heart-morphogenesis(9), nervous-system-developmen(9), clathrin-coat-assembly(31), cortical-actin-cytoskelet(30), ruffle-organization(31), cytoskeleton-organization(29), myelination(8), dendrite-morphogenesis(5), chemotaxis(5), neuron-projection-morphog(3), translational-initiation(8), tight-junction-assembly(4), cell-migration(3), lamellipodium-assembly(3).]

Lung Cancer Cell Lines
The Short Story:
  Over 60% of non-small-cell lung carcinomas (NSCLC) express EGFR
  EGFR inhibitors have been developed (Gefitinib)
  A small portion of EGFR-expressing NSCLC cases respond to the EGFR inhibitors
Phosphoproteomics data from four NSCLC cell lines
  Proteins present in Gefitinib-sensitive and Gefitinib-resistant cell lines: EGFR, FAK, GAB1, PKCD, SHC
  Proteins present in Gefitinib-resistant cell lines only: FYN, SRC
  Proteins present in Gefitinib-sensitive cell lines only: CBL, ErbB2, ErbB3, TLN1, SHP2
Source: Guo et al., Signaling Networks Assembled by Oncogenic EGFR and c-Met. PNAS 2007.

Pathway Analysis of Gefitinib-Resistant and Gefitinib-Sensitive Cell Lines
[Figure: hypergraph solutions labeled Gefitinib-Sensitive and Gefitinib-Resistant. Labeled elements include EGF, EGFR, CBL, CIN85, SH3GL2, LRIG1, PIP5K1C, TLN1, EPS15, EPN1, SYNJ1, AMPH, DMN1, GTP, and cell migration, with compartment tags [cy], [pm], and [endosome].]