Social Network Analysis

Author: Morgan Barton
Social Network Analysis: Basic Concepts, Methods & Theory
University of Cologne
Johannes Putzke

Folie: 1

Agenda
- Introduction
- Basic Concepts
- Mathematical Notation
- Network Statistics

Folie: 2

Textbooks
- Hanneman & Riddle (2005): Introduction to Social Network Methods, available at http://faculty.ucr.edu/~hanneman/nettext/
- Wasserman & Faust (1994): Social Network Analysis – Methods and Applications, Cambridge: Cambridge University Press.

Folie: 3

Introduction

Folie: 4

Basic Concepts

What is a network?


Folie: 5

What is a Network?
- Actors / nodes / vertices / points
  - Computers / telephones
  - Persons / employees
  - Companies / business units
  - Articles / books
  - Can have properties (attributes)
- Ties / edges / arcs / lines / links
  - connect a pair of actors
  - types of social relations: friendship, acquaintance, kinship, advice, hindrance, sex
  - allow different kinds of flows: messages, money, diseases

Folie: 8

What is a Social Network? - Relations among People

[Figure: sociogram of relations among people; the nodes are labeled with first names.]

Folie: 9

What is a Network? - Relations among Institutions
- as institutions
  - owned by, have partnership / joint venture
  - purchases from, sells to
  - competes with, supports
- through stakeholders
  - board interlocks
  - previously worked for

[Figure: network of institutions whose ties are labeled with percentages. Image by MIT OpenCourseWare.]

Folie: 10

Why study social networks?

Folie: 11

Example 2) Homophily Theory
- Birds of a feather flock together
- See McPherson, Smith-Lovin & Cook (2001)
- age / gender → network

Cross-tabulation by gender:

          Male   Female
Male       123       68
Female      95      164

Cross-tabulation by age group:

         0-13   14-29   30-44   45-65    >65
0-13      212      63     117      72     91
14-29      83     372      75      67     84
30-44     105      98     321     214    117
45-65      62      72     232     412    148
>65        90      77     124     153    366

Folie: 12

Managerial Relevance – Social Network…

[Figure: sociogram of the informal social network among employees; the nodes are labeled with first names. Image by MIT OpenCourseWare.]

Folie: 13

…vs. Organization Chart

[Figure: formal organization chart of an Exploration & Production unit (Senior Vice President: Burke; departments include Exploration, G&G, Drilling, Petrophysical, Production and Reservoir), shown next to the informal social network of the same employees. Image by MIT OpenCourseWare.]

Source: http://www.robcross.org/sna.htm

Folie: 14

SNA – A Recent Trend in Social Sciences Research
- Keyword search for "social" + "network" in 14 literature databases

[Figure: number of hits per year in abstracts and in titles, 1965 to 2010; vertical axis from 0 to 8000.]

Source: Knoke, David (2007) Introduction to Social Network Analysis

Folie: 15

SNA – A Recent Trend in IS Research

[Figure: number of articles per year (vertical axis from 0 to 200) in EBSCO, ACM and ScienceDirect.]

Folie: 16

How to analyze Social Networks?

Folie: 17

Example: Centrality Measures
- Who is the most prominent?
  - Who knows the most actors? (Degree Centrality)
  - Who has the shortest distance to the other actors? (Closeness Centrality)
  - Who controls knowledge flows? (Betweenness Centrality)
  - ...

Folie: 20

Basic Concepts

Folie: 21

Dyads, Triads and Relations
- actor
- dyad
- triad
- relation: collection of specific ties among members of a group (e.g. friendship, kinship)

Folie: 22

Strength of a Tie
- Social network
  - finite set of actors and relation(s) defined on them
  - depicted in a graph / sociogram
  - labeled graph (e.g. nodes labeled "Anna, female, 27" and "Ken, male, 34")
- Strength of a tie
  - dichotomous vs. valued
  - depicted in a valued graph (e.g. ties valued 5 and 2) or a signed graph (+/-)

Folie: 23

Strength of a Tie
- nondirectional vs. directional
  - depicted in directed graphs (digraphs)
  - nodes connected by arcs
  - vocabulary: a node is adjacent to / adjacent from another node and incident to an arc
- 3 isomorphism classes of dyads
  - null dyad
  - mutual / reciprocal / symmetrical dyad
  - asymmetric / antisymmetric dyad
- converse of a digraph: reverse the direction of all arcs

Folie: 24

Walks, Trails, Paths
- (Directed) Walk (W)
  - sequence of nodes and lines starting and ending with (different) nodes (called origin and terminus)
  - nodes and lines can be included more than once
- Inverse of a (directed) walk (W^-1)
  - the walk in opposite order
- Length of a walk
  - How many lines occur in the walk? (A line that is traversed twice counts twice; in weighted graphs, add the line weights.)
- (Directed) Trail
  - walk in which all lines are distinct
- (Directed) Path
  - walk in which all nodes and all lines are distinct
- Every path is a trail and every trail is a walk

Folie: 25

Walks, Trails and Paths - Repetition

[Figure: example graph with nodes n1 to n6 and lines l1 to l7; origin n1 and terminus n3 are marked.]

- W = n1 l1 n2 l2 n3 l4 n5 l6 n6 → Path
- W = n1 l1 n2 l2 n3 l4 n5 l4 n3 → Walk
- W = n1 l1 n2 l2 n3 l4 n5 l5 n4 l3 n3 → Trail

Folie: 26
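The definitions of walk, trail and path can be checked mechanically. The following is a minimal sketch in Python; the function name `classify` and the encoding of a walk as an alternating list of node and line labels are assumptions made for illustration. It is applied to the three example sequences from the repetition slide.

```python
def classify(sequence):
    """Classify an alternating node/line sequence as 'path', 'trail' or 'walk'."""
    nodes = sequence[0::2]   # n1, n2, ... sit at the even positions
    lines = sequence[1::2]   # l1, l2, ... sit at the odd positions
    if len(set(lines)) < len(lines):
        return "walk"        # at least one line is used more than once
    if len(set(nodes)) < len(nodes):
        return "trail"       # all lines distinct, but a node repeats
    return "path"            # all nodes and all lines distinct


def length(sequence):
    """Length of a walk = number of line occurrences (repeats count again)."""
    return len(sequence) // 2


w1 = "n1 l1 n2 l2 n3 l4 n5 l6 n6".split()
w2 = "n1 l1 n2 l2 n3 l4 n5 l4 n3".split()
w3 = "n1 l1 n2 l2 n3 l4 n5 l5 n4 l3 n3".split()

for w in (w1, w2, w3):
    print(classify(w), "of length", length(w))
```

The function returns the most specific label that applies, which mirrors the statement that every path is a trail and every trail is a walk.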


Reachability, Distances and Diameter
- Reachability: two nodes ni and nj are reachable if there is a path between them
- Geodesic: shortest path between two nodes
- (Geodesic) distance d(i,j): length of the geodesic (also called "degrees of separation")

Folie: 27

Mathematical Notation and Fundamentals

Folie: 28

Three different notational schemes
1. Graph theoretic
2. Sociometric
3. Algebraic

Folie: 29

1. Graph Theoretic Notation
- N: set of actors {n1, n2, …, ng}
- ni → nj: there is a tie between the ordered pair <ni, nj>
- ni ↛ nj: there is no tie
- (ni, nj): nondirectional relation; <ni, nj>: directional relation
- g(g-1): number of ordered pairs in a directional network
- g(g-1)/2: number of unordered pairs in a nondirectional network
- L: collection of pairs with ties {l1, l2, …, lL}
- G: graph described by the sets (N, L)
- Simple graph: has no reflexive ties (loops)

Folie: 30

2. Sociometric Notation - From Graphs to (Adjacency/Socio)-Matrices

[Figure: two example graphs on the nodes I to VI, one binary and undirected, one valued and directed.]

Binary, undirected (symmetrical matrix):

        I   II  III   IV    V   VI
I       -    1    0    0    0    0
II      1    -    1    0    0    0
III     0    1    -    1    1    0
IV      0    0    1    -    1    1
V       0    0    1    1    -    1
VI      0    0    0    1    1    -

Valued, directed:

        I   II  III   IV    V   VI
I       -    2    0    0    0    0
II      0    -    4    0    0    0
III     0    3    -    5    4    0
IV      0    0    5    -    0    0
V       0    0    0    2    -    3
VI      0    0    0    1    4    -

Folie: 31

2. Sociometric Notation
- X: g × g sociomatrix on a single relation; g × g × R super-sociomatrix on R relations
- X_r: sociomatrix on relation r
- x_ij(r): value of the tie from ni to nj (on relation X_r), where i ≠ j

Folie: 32

2. Sociometric Notation – From Matrices to Adjacency Lists and Arc Lists

Adjacency matrix:

        I   II  III   IV    V   VI
I       -    1    0    0    0    0
II      1    -    1    0    0    0
III     0    1    -    1    1    0
IV      0    0    1    -    1    1
V       0    0    1    1    -    1
VI      0    0    0    1    1    -

Adjacency list:

I     II
II    I III
III   II IV V
IV    III V VI
V     III IV VI
VI    IV V

Arc list:

I II
II I
II III
III II
III IV
III V
IV III
IV V
IV VI
V III
V IV
V VI
VI IV
VI V

Folie: 33
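The conversion from a sociomatrix to an adjacency list and an arc list can be sketched in a few lines of Python. The helper names `adjacency_list` and `arc_list` are assumptions made for illustration; the matrix is the binary, undirected example on the nodes I to VI, with the diagonal coded as 0.

```python
labels = ["I", "II", "III", "IV", "V", "VI"]

# Binary, undirected sociomatrix of the example graph (diagonal coded as 0).
X = [
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]


def adjacency_list(matrix, names):
    """Map every node to the list of nodes it has a tie to."""
    return {
        names[i]: [names[j] for j, x in enumerate(row) if x]
        for i, row in enumerate(matrix)
    }


def arc_list(matrix, names):
    """List every non-zero cell (i, j) as one arc from node i to node j."""
    return [
        (names[i], names[j])
        for i, row in enumerate(matrix)
        for j, x in enumerate(row)
        if x
    ]


for node, neighbours in adjacency_list(X, labels).items():
    print(node, " ".join(neighbours))
print(arc_list(X, labels))
```

Because the relation is undirected, every line appears twice in the arc list, once per direction.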

Network Statistics

Folie: 34

Different Levels of Analysis
- Actor-level
- Dyad-level
- Triad-level
- Subset-level (cliques / subgraphs)
- Group (i.e. global) level

Folie: 35

Measures at the Actor-Level: Measures of Prominence: Centrality and Prestige

Folie: 36

Degree Centrality
- Who knows the most actors? (Degree Centrality)
- Who has the shortest distance to the other actors?
- Who controls knowledge flows?
- ...

Folie: 37

Degree Centrality I
- Indegree $d_I(n_i)$
  - Popularity, status, deference, degree prestige
  - $C_{DI}(n_i) = d_I(n_i) = \sum_{j=1}^{g} x_{ji} = x_{+i}$
- Outdegree $d_O(n_i)$
  - Expansiveness
  - $C_{DO}(n_i) = d_O(n_i) = \sum_{j=1}^{g} x_{ij} = x_{i+}$
- Total degree ≡ 2 × number of edges
- Marginals of adjacency matrix: the row sums $x_{i+}$ are the outdegrees, the column sums $x_{+i}$ are the indegrees

[Figure: example digraph on the nodes I to VI with its binary adjacency matrix and marginals.]

Folie: 38

Degree Centrality II
- Interpretation: opportunity to (be) influence(d)
- Classification of nodes
  - Isolates: $d_I(n_i) = d_O(n_i) = 0$
  - Transmitters: $d_I(n_i) = 0$ and $d_O(n_i) > 0$
  - Receivers: $d_I(n_i) > 0$ and $d_O(n_i) = 0$
  - Carriers / Ordinaries: $d_I(n_i) > 0$ and $d_O(n_i) > 0$
- Standardization of $C_D$ to allow comparison across networks of different sizes: divide by its maximum value
  - $C'_D(n_i) = \frac{d(n_i)}{g-1}$

[Figure: example digraph on the nodes I to VII.]

Folie: 39
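Indegree, outdegree, the node classification and the standardized degree can be computed directly from an arc list. The sketch below is illustrative only: the arcs are read off the valued, directed example sociomatrix (ignoring the tie values), and the names `node_type` and `std_out` are assumptions, not part of the lecture.

```python
labels = ["I", "II", "III", "IV", "V", "VI"]

# Arcs (sender, receiver) taken from the valued, directed example sociomatrix;
# the tie values are ignored here, only presence or absence of an arc matters.
arcs = [
    ("I", "II"), ("II", "III"), ("III", "II"), ("III", "IV"), ("III", "V"),
    ("IV", "III"), ("V", "IV"), ("V", "VI"), ("VI", "IV"), ("VI", "V"),
]

g = len(labels)
outdeg = {n: 0 for n in labels}   # row marginals x_{i+}
indeg = {n: 0 for n in labels}    # column marginals x_{+i}
for sender, receiver in arcs:
    outdeg[sender] += 1
    indeg[receiver] += 1


def node_type(n):
    """Classify a node by its indegree and outdegree."""
    if indeg[n] == 0 and outdeg[n] == 0:
        return "isolate"
    if indeg[n] == 0:
        return "transmitter"
    if outdeg[n] == 0:
        return "receiver"
    return "carrier"


# Standardized outdegree: divide by the maximum possible value g - 1.
std_out = {n: outdeg[n] / (g - 1) for n in labels}

for n in labels:
    print(n, indeg[n], outdeg[n], node_type(n), round(std_out[n], 2))
```

In this example node I only sends ties and is therefore a transmitter, while all other nodes are carriers.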

Closeness Centrality
- Who knows the most actors?
- Who has the shortest distance to the other actors? (Closeness Centrality)
- Who controls knowledge flows?
- ...

Folie: 40

Closeness Centrality
- Index of expected arrival time
- $C_C(n_i) = \frac{1}{\sum_{j=1}^{g} d(n_i, n_j)}$
- Reciprocal of the marginals of the geodesic distance matrix
- Standardize by multiplying by (g-1)
- Problem: only defined for connected graphs

Geodesic distance matrix of the example graph, with row sums:

        I   II  III   IV    V   VI   Sum
I       -    1    2    3    3    4    13
II      1    -    1    2    2    3     9
III     2    1    -    1    1    2     7
IV      3    2    1    -    1    1     8
V       3    2    1    1    -    1     8
VI      4    3    2    1    1    -    11

Folie: 41
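Closeness centrality can be sketched with a breadth-first search that returns the geodesic distances from one node. The code below is a minimal illustration, assuming the undirected example graph on the nodes I to VI; the function name `bfs_distances` is hypothetical.

```python
from collections import deque

labels = ["I", "II", "III", "IV", "V", "VI"]
edges = [("I", "II"), ("II", "III"), ("III", "IV"), ("III", "V"),
         ("IV", "V"), ("IV", "VI"), ("V", "VI")]

adj = {n: [] for n in labels}
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)


def bfs_distances(adj, source):
    """Geodesic distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


g = len(labels)
row_sum = {n: sum(bfs_distances(adj, n).values()) for n in labels}
closeness = {n: 1 / row_sum[n] for n in labels}             # C_C(n_i)
closeness_std = {n: (g - 1) / row_sum[n] for n in labels}   # C'_C(n_i)

for n in labels:
    print(n, row_sum[n], round(closeness_std[n], 3))
```

The row sums reproduce the marginals of the geodesic distance matrix; node III has the smallest sum and hence the highest closeness. The sketch assumes a connected graph, since closeness is not defined otherwise.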

Proximity Prestige

$P_P(n_i) = \frac{I_i/(g-1)}{\sum_{j=1}^{g} d(n_j, n_i) / I_i}$

- $I_i/(g-1)$: number of actors in the influence domain of $n_i$, normed by the maximum possible number of actors in the influence domain
- $\sum d(n_j, n_i) / I_i$: average distance of these actors to $n_i$

[Figure: example network with 19 numbered nodes.]

Folie: 42
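Proximity prestige can be sketched by searching the digraph backwards from a node, which yields its influence domain together with the distances d(nj, ni). The code below is an illustration under stated assumptions: the arcs come from the directed example sociomatrix on the nodes I to VI, the function name `proximity_prestige` is hypothetical, and a node with an empty influence domain is assigned the value 0.

```python
from collections import deque

labels = ["I", "II", "III", "IV", "V", "VI"]
arcs = [
    ("I", "II"), ("II", "III"), ("III", "II"), ("III", "IV"), ("III", "V"),
    ("IV", "III"), ("V", "IV"), ("V", "VI"), ("VI", "IV"), ("VI", "V"),
]

# predecessors[n] holds all nodes that send an arc to n.
predecessors = {n: [] for n in labels}
for sender, receiver in arcs:
    predecessors[receiver].append(sender)


def proximity_prestige(node):
    """(I_i / (g - 1)) divided by the average distance of the influence domain."""
    dist = {node: 0}
    queue = deque([node])
    while queue:                       # BFS against the direction of the arcs
        u = queue.popleft()
        for v in predecessors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1  # dist[v] = d(v, node)
                queue.append(v)
    domain_size = len(dist) - 1        # I_i: actors that can reach the node
    if domain_size == 0:
        return 0.0                     # assumption: empty influence domain -> 0
    average_distance = sum(dist.values()) / domain_size
    return (domain_size / (len(labels) - 1)) / average_distance


for n in labels:
    print(n, round(proximity_prestige(n), 3))
```

Node I receives no ties, so its influence domain is empty; node II can be reached by all five other actors at an average distance of 2, which gives a proximity prestige of 0.5.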

Eccentricity / Association Number
- Largest geodesic distance between a node and any other node
- $\max_j d(i,j)$

[Figure: example network with 19 numbered nodes.]

Folie: 43

Betweenness Centrality
- Who knows the most actors?
- Who has the shortest distance to the other actors?
- Who controls knowledge flows? (Betweenness Centrality)
- ...

[Figure: example network with 19 numbered nodes.]

Folie: 44

Betweenness Centrality
- How many geodesic linkings between two actors j and k contain actor i?
- $g_{jk}(n_i)/g_{jk}$: probability that the distinct actor $n_i$ is "involved" in the communication between two actors $n_j$ and $n_k$
- $C_B(n_i) = \sum_{j<k} \frac{g_{jk}(n_i)}{g_{jk}}$
- standardized by dividing by (g-1)(g-2)/2

[Figure: example network with 19 numbered nodes.]

Folie: 45
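The betweenness formula can be sketched by counting geodesics with a breadth-first search. This is a straightforward, non-optimized illustration on the undirected example graph with the nodes I to VI, not a reference implementation; the function names `shortest_path_counts` and `betweenness` are assumptions.

```python
from collections import deque
from itertools import combinations

labels = ["I", "II", "III", "IV", "V", "VI"]
edges = [("I", "II"), ("II", "III"), ("III", "IV"), ("III", "V"),
         ("IV", "V"), ("IV", "VI"), ("V", "VI")]

adj = {n: [] for n in labels}
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)


def shortest_path_counts(source):
    """Return (dist, sigma): geodesic distances and number of geodesics from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]   # every geodesic to u extends to v
    return dist, sigma


info = {n: shortest_path_counts(n) for n in labels}


def betweenness(i):
    """Sum over pairs j < k of g_jk(n_i) / g_jk."""
    total = 0.0
    others = [n for n in labels if n != i]
    for j, k in combinations(others, 2):
        dist_j, sigma_j = info[j]
        dist_i, sigma_i = info[i]
        if k not in dist_j or i not in dist_j or k not in dist_i:
            continue                   # pair not connected via i
        if dist_j[i] + dist_i[k] == dist_j[k]:
            total += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return total


g = len(labels)
raw = {n: betweenness(n) for n in labels}
standardized = {n: raw[n] / ((g - 1) * (g - 2) / 2) for n in labels}
print(raw)
```

Node III lies on every geodesic between {I, II} and {IV, V, VI} and therefore scores highest; nodes IV and V each carry half of the geodesics that end in VI.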

Several other Centrality Measures
- …beyond the scope of this lecture
- Status or Rank Prestige, Eigenvector Centrality
  - also reflects the status or prestige of the people to whom the actor is linked
  - appropriate to identify hubs (actors adjacent to many peripheral nodes) and bridges (actors adjacent to few central actors)
  - note: this is not the more common meaning of "bridge"
- Information Centrality
  - see Wasserman & Faust (1994), p. 192 ff.
- Random Walk Centrality
  - see Newman (2005)

Folie: 46

Condor – Betweenness Centrality

Folie: 47

(Actor) Contribution Index

$\text{contribution index} = \frac{\text{messages\_sent} - \text{messages\_received}}{\text{messages\_sent} + \text{messages\_received}}$

[Figure: actors plotted by contribution index, ranging from Sender (+1) to Receiver (-1), and by contribution frequency. Role labels in the plot: Communicator, Ambassador, Creator, Guru, Knowledge Expert, Collaborator, Expediter, "Connector", "Gatekeeper", "Maven", "Salesman". Annotations in the plot: links to external networks; coordinates and organizes tasks; serves as the ultimate source of explicit knowledge; provides the overall vision and guidance.]

Folie: 48
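As a small worked example, the contribution index can be computed from message counts. The function name `contribution_index` and the sample counts are hypothetical, and returning 0 for an actor without any messages is an assumption of this sketch, not something the slide specifies.

```python
def contribution_index(messages_sent, messages_received):
    """(sent - received) / (sent + received); ranges from -1 to +1."""
    total = messages_sent + messages_received
    if total == 0:
        return 0.0   # assumption: an actor without messages is treated as balanced
    return (messages_sent - messages_received) / total


print(contribution_index(40, 0))    # pure sender
print(contribution_index(0, 25))    # pure receiver
print(contribution_index(30, 10))   # sends more than receives
```

A pure sender ends up at +1, a pure receiver at -1, and an actor with balanced communication near 0.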

Measures at the Group-(Global-)Level and Subgroup-Level

Folie: 49

Diameter of a Graph and Average Geodesic Distance
- Diameter
  - Largest geodesic distance between any pair of nodes
- Average Geodesic Distance
  - How fast can information get transmitted?

[Figure: example network with 19 numbered nodes.]

Folie: 50
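Eccentricity, diameter and average geodesic distance all follow from the matrix of geodesic distances. The sketch below assumes the undirected example graph on the nodes I to VI and a connected graph; the helper name `bfs_distances` is hypothetical.

```python
from collections import deque

labels = ["I", "II", "III", "IV", "V", "VI"]
edges = [("I", "II"), ("II", "III"), ("III", "IV"), ("III", "V"),
         ("IV", "V"), ("IV", "VI"), ("V", "VI")]

adj = {n: [] for n in labels}
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)


def bfs_distances(source):
    """Geodesic distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


dist = {n: bfs_distances(n) for n in labels}

# Eccentricity: largest geodesic distance from a node to any other node.
eccentricity = {n: max(dist[n].values()) for n in labels}

# Diameter: largest geodesic distance between any pair of nodes.
diameter = max(eccentricity.values())

# Average geodesic distance over all ordered pairs of distinct nodes.
g = len(labels)
average_distance = sum(sum(d.values()) for d in dist.values()) / (g * (g - 1))

print(eccentricity, diameter, round(average_distance, 3))
```

The diameter is simply the largest eccentricity, here the distance of 4 between the nodes I and VI.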

Density
- Proportion of ties in a graph

[Figure: two example graphs, one with high density (44%) and one with low density (14%).]

Folie: 51

Density
- In an undirected graph: proportion of ties
  - $\Delta = \frac{L}{g(g-1)/2} = \frac{L}{\binom{g}{2}}$
- In a valued directed graph: average strength of the arcs
  - $\Delta = \frac{\sum_{i=1}^{g} \sum_{j=1}^{g} x_{ij}}{g(g-1)}$

[Figure: the binary, undirected and the valued, directed example graphs on the nodes I to VI.]

Folie: 52
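Both density formulas can be evaluated on the two example graphs. The sketch below assumes the edge list of the binary, undirected graph and the valued, directed sociomatrix on the nodes I to VI, with the diagonal coded as 0; the variable names are illustrative.

```python
g = 6  # number of nodes I to VI

# Binary, undirected example graph: list of lines.
edges = [("I", "II"), ("II", "III"), ("III", "IV"), ("III", "V"),
         ("IV", "V"), ("IV", "VI"), ("V", "VI")]

# Valued, directed example sociomatrix (diagonal coded as 0).
X_valued = [
    [0, 2, 0, 0, 0, 0],
    [0, 0, 4, 0, 0, 0],
    [0, 3, 0, 5, 4, 0],
    [0, 0, 5, 0, 0, 0],
    [0, 0, 0, 2, 0, 3],
    [0, 0, 0, 1, 4, 0],
]

# Undirected graph: observed lines divided by the g(g-1)/2 possible lines.
density_undirected = len(edges) / (g * (g - 1) / 2)

# Valued digraph: sum of all tie values divided by the g(g-1) possible arcs.
density_valued = sum(sum(row) for row in X_valued) / (g * (g - 1))

print(round(density_undirected, 3), round(density_valued, 3))
```

With 7 of 15 possible lines the undirected graph has a density of about 0.47; the valued digraph has an average arc strength of 33/30 = 1.1.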

Group Centralization I
- How equal are the individual actors' centrality values?
- $C_A(n_i)$: actor centrality index
- $C_A(n^*) = \max_i C_A(n_i)$
- $\sum_{i=1}^{g} [C_A(n^*) - C_A(n_i)]$: sum of the differences between the largest value and the observed values
- General centralization index:
  $C_A = \frac{\sum_{i=1}^{g} [C_A(n^*) - C_A(n_i)]}{\max \sum_{i=1}^{g} [C_A(n^*) - C_A(n_i)]}$

Folie: 53

Group Centralization II
- Degree centralization:
  $C_D = \frac{\sum_{i=1}^{g} [C_D(n^*) - C_D(n_i)]}{(g-1)(g-2)}$
- Closeness centralization:
  $C_C = \frac{\sum_{i=1}^{g} [C'_C(n^*) - C'_C(n_i)]}{(g-1)(g-2)/(2g-3)}$
- Betweenness centralization:
  $C_B = \frac{\sum_{i=1}^{g} [C_B(n^*) - C_B(n_i)]}{(g-1)^2 (g-2)/2} = \frac{\sum_{i=1}^{g} [C'_B(n^*) - C'_B(n_i)]}{g-1}$

Folie: 54
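As a worked example, degree centralization divides the summed differences to the largest degree by (g-1)(g-2), the value this sum takes in a star graph. The sketch below assumes the undirected example graph on the nodes I to VI and a hypothetical helper `degree_centralization`.

```python
def degree_centralization(nodes, edges):
    """Sum of (largest degree - degree) over all nodes, divided by (g-1)(g-2)."""
    degree = {n: 0 for n in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    g = len(nodes)
    largest = max(degree.values())
    return sum(largest - d for d in degree.values()) / ((g - 1) * (g - 2))


labels = ["I", "II", "III", "IV", "V", "VI"]
edges = [("I", "II"), ("II", "III"), ("III", "IV"), ("III", "V"),
         ("IV", "V"), ("IV", "VI"), ("V", "VI")]

# A star on the same six nodes, used as the most centralized comparison case.
star = [("I", n) for n in labels[1:]]

print(degree_centralization(labels, edges))  # example graph
print(degree_centralization(labels, star))   # star graph
```

The example graph has degrees 1, 2, 3, 3, 3, 2, so the differences sum to 4 and the index is 4/20 = 0.2; the star graph reaches the maximum of 1.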

Condor – Group Centralization

Folie: 55

Subgroup Cohesion
- average strength of the ties within the subgroup divided by the average strength of the ties from subgroup members to outsiders
- a value > 1 means that the ties within the subgroup are stronger

$\frac{\sum_{i \in N_s} \sum_{j \in N_s} x_{ij} \,/\, [g_s (g_s - 1)]}{\sum_{i \in N_s} \sum_{j \notin N_s} x_{ij} \,/\, [g_s (g - g_s)]}$

[Figure: the valued, directed example graph on the nodes I to VI.]

Folie: 56
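The cohesion ratio can be evaluated on the valued, directed example sociomatrix. In the sketch below the subgroup {IV, V, VI} is chosen purely for illustration, the function name `subgroup_cohesion` is an assumption, and the code assumes that at least one tie leaves the subgroup so that the denominator is non-zero.

```python
labels = ["I", "II", "III", "IV", "V", "VI"]

# Valued, directed example sociomatrix (diagonal coded as 0).
X = [
    [0, 2, 0, 0, 0, 0],
    [0, 0, 4, 0, 0, 0],
    [0, 3, 0, 5, 4, 0],
    [0, 0, 5, 0, 0, 0],
    [0, 0, 0, 2, 0, 3],
    [0, 0, 0, 1, 4, 0],
]


def subgroup_cohesion(matrix, names, subgroup):
    """Average tie strength inside the subgroup / average strength of ties to outsiders."""
    inside = [names.index(n) for n in subgroup]
    outside = [i for i in range(len(names)) if i not in inside]
    g, gs = len(names), len(inside)
    within = sum(matrix[i][j] for i in inside for j in inside if i != j)
    to_outsiders = sum(matrix[i][j] for i in inside for j in outside)
    average_within = within / (gs * (gs - 1))
    average_outside = to_outsiders / (gs * (g - gs))
    return average_within / average_outside   # assumes some tie leaves the subgroup


print(subgroup_cohesion(X, labels, ["IV", "V", "VI"]))
```

For this subgroup the ties within sum to 10 over 6 possible arcs and the ties to outsiders sum to 5 over 9 possible arcs, so the ratio is 3: ties inside the subgroup are on average three times as strong.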

Connectivity of Graphs and Cohesive Subgroups

Folie: 57

Connectivity of Graphs

Folie: 58

Connected Graphs, Components, Cutpoints and Bridges
- Connectedness
  - A graph is connected if there is a path between every pair of nodes
- Components
  - Connected subgraphs in a graph
  - A connected graph has 1 component
  - Two disconnected graphs can still be one social network!

Folie: 59

Connected Graphs, Components, Cutpoints and Bridges
- Connectivity of pairs of nodes and graphs
  - Weakly connected: joined by a semipath
  - Unilaterally connected: path from ni to nj or from nj to ni
  - Strongly connected: path from ni to nj and from nj to ni; the two paths may contain different nodes
  - Recursively connected: the nodes are strongly connected and both paths use the same nodes and arcs in reverse order

[Figure: small example digraphs on the nodes n1 to n6 illustrating the types of connectivity.]

Folie: 60

Connected Graphs, Components, Cutpoints and Bridges
- Cutpoints
  - nj is a cutpoint if the number of components in the graph that contains node nj is fewer than the number of components in the subgraph that results from deleting nj from the graph
- Cutsets (of size k)
  - k-node cut
- Bridges / line cuts
  - Number of components…that contain line lk

[Figure: example network with 19 numbered nodes.]

Folie: 61
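Cutpoints and bridges can be found by brute force: delete one node or one line at a time and check whether the number of components grows. This is a simple sketch for small graphs, assuming the undirected example graph on the nodes I to VI and a hypothetical helper `count_components`; it is not an efficient algorithm.

```python
def count_components(nodes, edges):
    """Number of connected components of an undirected graph."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, count = set(), 0
    for start in nodes:
        if start in seen:
            continue
        count += 1                 # a new component starts here
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return count


labels = ["I", "II", "III", "IV", "V", "VI"]
edges = [("I", "II"), ("II", "III"), ("III", "IV"), ("III", "V"),
         ("IV", "V"), ("IV", "VI"), ("V", "VI")]
base = count_components(labels, edges)

# Cutpoint: deleting the node (and its lines) increases the number of components.
cutpoints = [
    n for n in labels
    if count_components([m for m in labels if m != n],
                        [e for e in edges if n not in e]) > base
]

# Bridge: deleting the line increases the number of components.
bridges = [
    e for e in edges
    if count_components(labels, [f for f in edges if f != e]) > base
]

print(base, cutpoints, bridges)
```

In the example graph the nodes II and III are cutpoints and the lines I-II and II-III are bridges, because the triangle-like part formed by III, IV, V and VI offers alternative routes.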

Node- and Line Connectivity
- How vulnerable is a graph to removal of nodes or lines?
- Point connectivity / node connectivity
  - Minimum number k for which the graph has a k-node cut
  - For any value 0 ? [...]

(Reachability)
- two nodes can be connected by paths of length ≤ (g-1)

- Calculate $X^{[\Sigma]} = X + X^2 + X^3 + \dots + X^{g-1}$
- $X^{[\Sigma]}$ shows the total number of walks from ni to nj

[Figure: the binary, undirected example graph on the nodes I to VI.]

X^2:

        I   II  III   IV    V   VI
I       1    0    1    0    0    0
II      0    2    0    1    1    0
III     1    0    3    1    1    2
IV      0    1    1    3    2    1
V       0    1    1    2    3    1
VI      0    0    2    1    1    2

Folie: 82

From Graphs to (Geodesic Distance)-Matrices (Reachability) – Geodesic Distance
- observe the power matrices
- the first power p for which the (i,j) element is non-zero gives the length of the shortest path
- $d(i,j) = \min \{ p : x_{ij}^{[p]} > 0 \}$

[Figure: the binary, undirected example graph on the nodes I to VI.]

X:

        I   II  III   IV    V   VI
I       0    1    0    0    0    0
II      1    0    1    0    0    0
III     0    1    0    1    1    0
IV      0    0    1    0    1    1
V       0    0    1    1    0    1
VI      0    0    0    1    1    0

X^2:

        I   II  III   IV    V   VI
I       1    0    1    0    0    0
II      0    2    0    1    1    0
III     1    0    3    1    1    2
IV      0    1    1    3    2    1
V       0    1    1    2    3    1
VI      0    0    2    1    1    2

Folie: 83

From Graphs to (Geodesic Distance)-Matrices (Reachability) – Geodesic Distance

[Figure: the binary, undirected example graph on the nodes I to VI.]

Geodesic distance matrix (binary, undirected graph):

        I   II  III   IV    V   VI
I       -    1    2    3    3    4
II      1    -    1    2    2    3
III     2    1    -    1    1    2
IV      3    2    1    -    1    1
V       3    2    1    1    -    1
VI      4    3    2    1    1    -

Folie: 84
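The power-matrix procedure can be sketched in plain Python: multiply the sociomatrix by itself repeatedly and record, for every pair, the first power at which the cell becomes non-zero. The function names `matmul` and `geodesic_distances` are assumptions; the matrix is the binary, undirected example graph on the nodes I to VI.

```python
X = [
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def geodesic_distances(matrix):
    """d(i, j) = first power p for which cell (i, j) of matrix^p is non-zero."""
    g = len(matrix)
    D = [[0 if i == j else None for j in range(g)] for i in range(g)]
    power = matrix
    for p in range(1, g):              # paths have length at most g - 1
        for i in range(g):
            for j in range(g):
                if D[i][j] is None and power[i][j] > 0:
                    D[i][j] = p
        power = matmul(power, matrix)
    return D                           # None marks unreachable pairs


X2 = matmul(X, X)                      # number of walks of length 2
D = geodesic_distances(X)
for row in D:
    print(row)
```

The diagonal of `X2` contains the degrees of the nodes, since every line can be walked back and forth, and `D` reproduces the geodesic distance matrix of the example graph with 0 instead of "-" on the diagonal.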

MIT OpenCourseWare http://ocw.mit.edu

15.599 Workshop in IT: Collaborative Innovation Networks Fall 2011

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.