This page reproduces the content of http://www.slideshare.net/plotti/social-network-analysis-part-ii.

- Social Network Analysis

Fundamental Concepts in Social Network Analysis (Part 2)

Katarina Stanoevska-Slabeva, Miriam Meckel, Thomas Plotkowiak - Agenda

1. Intro

2. Measuring Networks

– Embedding Measures (Ties)

– Positions and Roles (Nodes)

– Group Concepts

3. Network Mechanisms

4. Network Theories

© Thomas Plotkowiak 2010 - Introduction

Knoke information exchange network

In 1978, Knoke & Wood collected data from workers at 95 organizations in Indianapolis. Respondents indicated with which other organizations their own organization had any of 13 different types of relationships.

The example used here is the exchange of information among ten organizations that were involved in the local political economy of social welfare services in a Midwestern city.

© Thomas Plotkowiak 2010 - 2. Network Measures

2.1 Network Measures for Actors

Embedding Measures - Embedding Measures

• Reciprocity (Dyad Census)
• Transitivity (Triad Census)
• Clustering
• Density
• Group-external and group-internal Ties
• Other Network Mechanisms

© Thomas Plotkowiak 2010 - Reciprocity

• With symmetric data two actors are either connected or not.
• With directed data there are four possible dyadic relationships:
– A and B are not connected
– A sends to B
– B sends to A
– A and B send to each other.

© Thomas Plotkowiak 2010 - Reciprocity II

• What is the reciprocity in this network?
– Answer 1: pairs that have reciprocated ties / all possible pairs
• AB of {AB, AC, BC} = 0.33
– Answer 2: pairs that have reciprocated ties / existing pairs
• AB of {AB, BC} = 0.5
– Answer 3: reciprocated directed ties / all possible directed ties
• {AB, BA} of {AB, BA, AC, CA, BC, CB} = 0.33
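
The three answers can be checked with a few lines of Python. This is only a sketch: the slide's figure is not part of the text, so the network below (A and B send to each other, B sends to C) is an assumption chosen to match the listed pairs, and the variable names are illustrative.

```python
from itertools import combinations

# Assumed example network: A <-> B, and B -> C (direction of BC is a guess).
nodes = ["A", "B", "C"]
arcs = {("A", "B"), ("B", "A"), ("B", "C")}

pairs = list(combinations(nodes, 2))  # all possible unordered pairs
connected = [p for p in pairs if (p[0], p[1]) in arcs or (p[1], p[0]) in arcs]
mutual = [p for p in pairs if (p[0], p[1]) in arcs and (p[1], p[0]) in arcs]

answer_1 = len(mutual) / len(pairs)            # reciprocated pairs / all possible pairs
answer_2 = len(mutual) / len(connected)        # reciprocated pairs / existing pairs
answer_3 = 2 * len(mutual) / (2 * len(pairs))  # reciprocated arcs / all possible arcs

print(round(answer_1, 2), round(answer_2, 2), round(answer_3, 2))  # 0.33 0.5 0.33
```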

© Thomas Plotkowiak 2010 - Transitivity

• With undirected data there are four possible types of triadic relations:
– No ties
– One tie
– Two ties
– Three ties
• The count of the relative prevalence of these four types of relations is called the "triad census". A population can be characterized by:
– "isolation"
– "couples only"
– "structural holes" (one actor is connected to two others, who are not connected to each other)
– or "clusters"

© Thomas Plotkowiak 2010 - Transitivity II

Directed Networks

M-A-N number:
M = # of mutual positive dyads
A = # of asymmetric dyads
N = # of null dyads
D = Down, U = Up, C = Cyclic, T = Transitive

© Thomas Plotkowiak 2010 - Triad Census Models

Linear Hierarchy Model
Every triad is 030T

Balance Model with Two Cliques (Heider Balance)
Triads either 300 or 102

Ranked Clusters Model (Hierarchy of Cliques)
Triads: 300, 102, 003, 120D, 120U, 030T, 021D, 021U

© Thomas Plotkowiak 2010 - Example

Directed information exchange network

[Figure: directed graph of the information exchange ties among organizations 1 to 10]

The exchange of information among ten organizations that were involved in the local political economy of social welfare services in a Midwestern city.

© Thomas Plotkowiak 2010 - Transitivity III

[Figure: triad of nodes A, B and C with ties numbered 1 to 3]

• How to measure transitivity?
– A) Divide the number of transitive triads found by the total number of possible triplets (for 3 nodes there are 6 possibilities).
– B) Norm the number of transitive triads by the number of cases where a single link could complete the triad: norm {AB, BC, AC} by {AB, BC, anything} (for 3 nodes there are 4 possibilities).
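
A minimal sketch of both norms, assuming ordered triples (a, b, c) are what is being counted (this matches the 6 possibilities for 3 nodes). The function name and the toy arcs are illustrative and not taken from the slides.

```python
from itertools import permutations

def transitivity(nodes, arcs):
    """Return (transitive, two_paths, ordered_triples) for a directed graph."""
    transitive = two_paths = triples = 0
    for a, b, c in permutations(nodes, 3):
        triples += 1
        if (a, b) in arcs and (b, c) in arcs:
            two_paths += 1          # AB and BC exist, so AC could complete the triad
            if (a, c) in arcs:
                transitive += 1     # AC exists as well: a transitive triple
    return transitive, two_paths, triples

nodes = ["A", "B", "C"]
arcs = {("A", "B"), ("B", "C"), ("A", "C")}
found, completable, possible = transitivity(nodes, arcs)
print(found, completable, possible)                 # 1 1 6
print(found / possible, found / completable)        # measure A, measure B
```

Measure A divides by `possible`, measure B by `completable`, so B is never smaller than A.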

© Thomas Plotkowiak 2010 - Transitivity IV

146/720

146/217

© Thomas Plotkowiak 2010 - Clustering

Most actors live in local neighborhoods and are connected to one

another. A large proportion of the total number of ties is highly

"clustered" into local neighborhoods.

VS.

© Thomas Plotkowiak 2010 - Global clustering coefficient

Closed triplet

Triplet

© Thomas Plotkowiak 2010 - Average Local Clustering coefficient

To measure how clustered the graph is, we examine the local neighborhood of an actor (all actors who are directly connected to ego) and calculate the density in this neighborhood (leaving out ego). After doing this for all actors, we can characterize the degree of clustering as an average of all the neighborhoods.

C = 1

C = 1/3

C = 0
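
The calculation can be sketched for undirected ties as follows. The toy network is hypothetical: ego "e" has three neighbors, of which only one pair is tied, which gives the C = 1/3 case.

```python
from itertools import combinations

def local_clustering(adj, ego):
    """Density of ties among ego's neighbors (ego itself is left out)."""
    nbrs = adj[ego]
    if len(nbrs) < 2:
        return 0.0                                   # no pair of neighbors to connect
    possible = len(nbrs) * (len(nbrs) - 1) / 2
    present = sum(1 for u, v in combinations(sorted(nbrs), 2) if v in adj[u])
    return present / possible

def average_clustering(adj):
    """Mean of the local clustering coefficients over all actors."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)

# Hypothetical example: ego "e" with neighbors a, b, c; only a and b are tied.
adj = {
    "e": {"a", "b", "c"},
    "a": {"e", "b"},
    "b": {"e", "a"},
    "c": {"e"},
}
print(round(local_clustering(adj, "e"), 3))   # 0.333
print(round(average_clustering(adj), 3))      # 0.583
```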

© Thomas Plotkowiak 2010 - Individual local clustering coefficient

(in this case for directed ties)

Clustering can also be examined for each actor:

– Notice that actor 6 has three neighbors and hence only 3 possible ties among them. Of these only one is present, so actor 6 is not highly clustered.
– Actor 8 has 6 neighbors and hence 15 pairs of neighbors, and is highly clustered.

2 edges out of 6 edges

© Thomas Plotkowiak 2010 - Density for groups

Instead of calculating the density of the whole network (last

lecture), we can calculate the density of partitions of the network.

Governmental agencies

Non-governmental generalist

Welfare specialists

A social structure in which individuals were highly clustered

would display a pattern of high densities on the diagonal, and

low densities elsewhere.

© Thomas Plotkowiak 2010 - Density for groups II

• Group 1 has dense in- and out-ties to one another and to the other populations.
• Group 2 has out-ties among themselves and with group 1, and has high densities of in-ties with all three subpopulations.

The density in the 1,1 block is .6667. That is, of the six possible directed ties among actors 1, 3, and 5, four are actually present.

The extent to which those blocks characterize all the individuals within them can be assessed by looking at the standard deviations. The standard deviations measure the lack of homogeneity within the partition, or the extent to which the actors vary.
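
The .6667 figure can be reproduced from the adjacency matrix shown in the "Measuring Similarity" slide of this deck. The sketch below copies that matrix (rows send to columns, diagonal set to 0); the function name `block_density` is illustrative, and only the 1,1 block is computed because the full partition is not listed in the text.

```python
# Knoke information exchange matrix as listed in the deck (diagonal set to 0).
A = [
    [0, 1, 0, 0, 1, 0, 1, 0, 1, 0],  # 1 Coun
    [1, 0, 1, 1, 1, 0, 1, 1, 1, 0],  # 2 Comm
    [0, 1, 0, 1, 1, 1, 1, 0, 0, 1],  # 3 Educ
    [1, 1, 0, 0, 1, 0, 1, 0, 0, 0],  # 4 Indu
    [1, 1, 1, 1, 0, 0, 1, 1, 1, 1],  # 5 Mayr
    [0, 0, 1, 0, 0, 0, 1, 0, 1, 0],  # 6 WRO
    [0, 1, 0, 1, 1, 0, 0, 0, 0, 0],  # 7 News
    [1, 1, 0, 1, 1, 0, 1, 0, 1, 0],  # 8 UWay
    [0, 1, 0, 0, 1, 0, 1, 0, 0, 0],  # 9 Welf
    [1, 1, 1, 0, 1, 0, 1, 0, 0, 0],  # 10 West
]

def block_density(matrix, senders, receivers):
    """Share of possible directed ties from senders to receivers that are present."""
    cells = [(i, j) for i in senders for j in receivers if i != j]
    return sum(matrix[i][j] for i, j in cells) / len(cells)

group_1 = [0, 2, 4]  # actors 1, 3 and 5 as 0-based indices
print(round(block_density(A, group_1, group_1), 4))  # 0.6667
```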

© Thomas Plotkowiak 2010 - E-I Index

•

The E-I (external – internal) index takes the number of

ties of group members to outsiders, subtracts the number of

ties to other group members, and divides by the total number

of ties.

(1-4)/7 = -3/7

(1-2)/7 = -1/7

© Thomas Plotkowiak 2010 - E-I Index II

• The resulting E-I index ranges from -1 (all ties internal) to +1 (all ties external). The direction of ties is ignored.

•

The E-I index can be applied at three levels:

– entire population

– each group

– each individual

Notice: The relative size of subpopulations (e.g. 10 vs. 1000) has dramatic consequences for the degree of internal and external contacts, even when individuals may choose contacts at random.

© Thomas Plotkowiak 2010 - E-I Index for groups

Notice that the data has

been symmetrized

© Thomas Plotkowiak 2010 - E-I Index for the entire population

Notice that the data has

been symmetrized

Internal: 7*2/64 = 22%
External: 25*2/64 = 78%
E-I: (50-14)/64 = 56%
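
As a sketch, the index can be computed from a list of undirected ties and a group label per actor. The toy data and the function name `ei_index` are hypothetical; the last line only repeats the slide's population-level arithmetic.

```python
def ei_index(edges, group):
    """E-I index for undirected ties: (external - internal) / total."""
    external = sum(1 for u, v in edges if group[u] != group[v])
    internal = len(edges) - external
    return (external - internal) / len(edges)

# Hypothetical toy data: two groups, five ties, two of them crossing groups.
group = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2}
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("c", "d"), ("a", "e")]
print(ei_index(edges, group))   # -0.2

# Population-level figures from the slide: 50 external and 14 internal tie ends.
print((50 - 14) / 64)           # 0.5625
```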

© Thomas Plotkowiak 2010 - Permutation Tests

To assess whether the E-I index value is significantly different than what would be expected by random mixing, a permutation test is performed.

Notice: Under random distribution, the E-I index would be expected to have a value of .467, which is not much different from .563, especially given the standard error of .078 (given the result, the difference of .10 could be just by chance).
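
One way to sketch such a test is to shuffle the group labels over the actors many times and recompute the index each time. This is an assumption about the procedure, not a description of the software used for the slide; the toy data, the number of trials and all names are illustrative.

```python
import random

def ei_index(edges, group):
    external = sum(1 for u, v in edges if group[u] != group[v])
    return (2 * external - len(edges)) / len(edges)   # (external - internal) / total

def permutation_test(edges, group, trials=2000, seed=0):
    """Shuffle group labels over actors and recompute the E-I index each time."""
    rng = random.Random(seed)
    actors = list(group)
    labels = [group[a] for a in actors]
    observed = ei_index(edges, group)
    samples = []
    for _ in range(trials):
        rng.shuffle(labels)
        samples.append(ei_index(edges, dict(zip(actors, labels))))
    mean = sum(samples) / trials
    std_error = (sum((s - mean) ** 2 for s in samples) / trials) ** 0.5
    share_as_large = sum(s >= observed for s in samples) / trials
    return observed, mean, std_error, share_as_large

# Hypothetical toy data: two tight groups joined by a single tie.
group = {1: "x", 2: "x", 3: "x", 4: "y", 5: "y", 6: "y"}
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (3, 4)]
observed, mean, std_error, share_as_large = permutation_test(edges, group)
print(round(observed, 3), round(mean, 3), round(std_error, 3), share_as_large)
```

Comparing `observed` with `mean` and `std_error` mirrors the comparison of .563 with .467 and .078 above.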

© Thomas Plotkowiak 2010 - E-I Index for individuals

Notice: Several actors (4, 6, 9) tend toward closure, while others (10, 1) tend toward creating ties outside their groups.

© Thomas Plotkowiak 2010 - 2. Network Measures

2.2 Network Measures for Actors

Position & Roles - Positions & Roles

•

Structural Equivalence

•

Automorphic Equivalence

•

Regular Equivalence

•

Measuring similarity/dissimilarity

•

Visualizing similarity and distance

•

Measuring automorphic equivalence

•

Measuring regular equivalence

•

Blockmodeling

© Thomas Plotkowiak 2010 - Chinese Kinship Relations

© Thomas Plotkowiak 2010 - Positions and Roles

•

Positions: Actors that show a similar structure of relationships

and are thus similarly embedded into the network.

•

Roles: The pattern of relationships of members of same or

different positions.

•

Note: Many of the category systems used by sociologists are

based on "attributes" of individual actors that are common

across actors.

© Thomas Plotkowiak 2010 - Similarity

•

The idea of "similarity" has to be rather precisely defined

•

Nodes are similar if they fall in the same "equivalence class"

– We could come up with an equivalence class of out-degree zero, for example
• There are three particular definitions of equivalence:
– Structural Equivalence
– Automorphic Equivalence (rarely used)
– Regular Equivalence

© Thomas Plotkowiak 2010 - Structural Equivalence

• Structural Equivalence: Two structurally equivalent actors could exchange their positions in a network without changing their connections to the other actors in the network.

•

Structural equivalence is the "strongest" form of equivalence.

• Problem: Imagine two teachers in Toronto and St. Gallen. Rather than looking for connections to exactly the same persons, we would like to find connections to similar persons, but not exactly the same ones.

© Thomas Plotkowiak 2010 - Automorphic Equivalence

• Automorphic Equivalence: Two persons could exchange their positions in the network without changing the structure of the network. (Notice that after the exchange they would be partially connected to other persons than before.)
• Problem: How large should the radius be in which we analyze the structure of the network (1, 2, 3 … steps)?
• For the one-step radius we consider the NUMBER of:
– asymmetric outgoing,
– asymmetric incoming,
– symmetric in- and outgoing,
– and non-existing ties.

© Thomas Plotkowiak 2010 - 1 Step, 2 Step Equivalence

[Figure: example network for 1-step and 2-step equivalence]

© Thomas Plotkowiak 2010 - Regular Equivalence

• Regular Equivalence: Two positions are considered similar if every important aspect of the observed structure applies (or does not apply) to both positions.
• For the one-step radius we consider the EXISTENCE of:
– asymmetric outgoing,
– asymmetric incoming,
– symmetric in- and outgoing,
– and non-existing ties.

© Thomas Plotkowiak 2010

[Figure: three example networks (1, 2, 3), each with nodes A to H]
1: B and C are regularly equivalent
2: B and C are automorphically equivalent
3: B and C are structurally equivalent

© Thomas Plotkowiak 2010 - Computing Positional Similarity

Example Information exchange network

© Thomas Plotkowiak 2010 - Measuring Similarity

Adjacency Matrix

         1 Coun  2 Comm  3 Educ  4 Indu  5 Mayr  6 WRO   7 News  8 UWay  9 Welf  10 West
1 Coun   ---     1       0       0       1       0       1       0       1       0
2 Comm   1       ---     1       1       1       0       1       1       1       0
3 Educ   0       1       ---     1       1       1       1       0       0       1
4 Indu   1       1       0       ---     1       0       1       0       0       0
5 Mayr   1       1       1       1       ---     0       1       1       1       1
6 WRO    0       0       1       0       0       ---     1       0       1       0
7 News   0       1       0       1       1       0       ---     0       0       0
8 UWay   1       1       0       1       1       0       1       ---     1       0
9 Welf   0       1       0       0       1       0       1       0       ---     0
10 West  1       1       1       0       1       0       1       0       0       ---

© Thomas Plotkowiak 2010 - Measuring Similarity

Concatenated Row & Column View

1 Coun  2 Comm  3 Educ  4 Indu  5 Mayr  6 WRO   7 News  8 UWay  9 Welf  10 West
---     1       0       1       1       0       0       1       0       1
1       ---     1       1       1       0       1       1       1       1
0       1       ---     0       1       1       0       0       0       1
0       1       1       ---     1       0       1       1       0       0
1       1       1       1       ---     0       1       1       1       1
0       0       1       0       0       ---     0       0       0       0
1       1       1       1       1       1       ---     1       1       1
0       1       0       0       1       0       0       ---     0       0
1       1       0       0       1       1       0       1       ---     0
0       0       1       0       1       0       0       0       0       ---
---     1       0       0       1       0       1       0       1       0
1       ---     1       1       1       0       1       1       1       0
0       1       ---     1       1       1       1       0       0       1
1       1       0       ---     1       0       1       0       0       0
1       1       1       1       ---     0       1       1       1       1
0       0       1       0       0       ---     1       0       1       0
0       1       0       1       1       0       ---     0       0       0
1       1       0       1       1       0       1       ---     1       0
0       1       0       0       1       0       1       0       ---     0
1       1       1       0       1       0       1       0       0       ---

© Thomas Plotkowiak 2010 - Pearson correlation coefficients, covariances

and cross-products

• Pearson correlations (ranging from -1 to +1) summarize pair-wise structural equivalence.
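
A minimal sketch of the idea: each actor's profile is its out-ties followed by its in-ties, and two profiles are correlated. The handling of the diagonal is a simplification assumed here (self-ties and the tie between the two compared actors are dropped), and the toy matrix is hypothetical.

```python
from math import sqrt

def profile(matrix, i, skip):
    """Out-ties followed by in-ties of actor i, leaving out the actors in skip."""
    others = [k for k in range(len(matrix)) if k not in skip]
    return [matrix[i][k] for k in others] + [matrix[k][i] for k in others]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")      # undefined when a profile has no variation
    return cov / sqrt(var_x * var_y)

def structural_similarity(matrix, i, j):
    skip = (i, j)                # simplification: ignore self-ties and the i-j tie
    return pearson(profile(matrix, i, skip), profile(matrix, j, skip))

# Hypothetical toy network: actors 0 and 1 both send to 2 and both receive from 3.
toy = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]
print(structural_similarity(toy, 0, 1))            # 1.0
print(round(structural_similarity(toy, 0, 2), 2))  # -0.33
```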

© Thomas Plotkowiak 2010 - Pairwise Structural Equivalence

We can see, for example, that node 1 and node 9 have identical patterns of ties.

The Pearson correlation measure does not pay attention to the overall prevalence of ties (the mean of the row or column), and it does not pay attention to differences between actors in the variances of their ties. Often it is desirable to focus only on the pattern, rather than the mean and variance, as aspects of similarity between actors.

[Figure: information exchange network, nodes 1 to 10]

© Thomas Plotkowiak 2010 - Euclidean squared distances

Euclidean or squared Euclidean distances are not sensitive to the linearity of association and can be used with valued or binary data.

Other similar measures are the Jaccard or Hamming distance.
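
For binary tie profiles these measures are short to write down. The sketch below uses two hypothetical profiles; the function names are illustrative.

```python
from math import sqrt

def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def hamming(x, y):
    """Number of positions in which the two profiles differ."""
    return sum(a != b for a, b in zip(x, y))

def jaccard_distance(x, y):
    """1 minus (ties both have) / (ties at least one has), for binary profiles."""
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    return 1 - both / either if either else 0.0

x = [1, 0, 0, 1, 1]
y = [1, 0, 1, 0, 1]
print(squared_euclidean(x, y))                     # 2
print(round(sqrt(squared_euclidean(x, y)), 3))     # 1.414
print(hamming(x, y))                               # 2
print(jaccard_distance(x, y))                      # 0.5
```

For binary data the squared Euclidean distance and the Hamming distance coincide, as the output shows.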

© Thomas Plotkowiak 2010 - Going from pairs to groups of structural

equivalence

It is often useful to examine the similarities or distances to try to

locate groupings of actors (that is, larger than a pair) who are

similar. By studying the bigger patterns of which groups of actors

are similar to which others, we may also gain some insight into

"what about" the actor's positions is most critical in making them

more similar or more distant.

In the next two sections we will cover how multi-dimensional scaling and hierarchical cluster analysis can be used to identify patterns in actor-by-actor similarity/distance matrices.

Both of these tools are widely used in non-network analysis; there are large and excellent literatures on the many important complexities of using these methods. Our goal here is just to provide a very basic introduction.

© Thomas Plotkowiak 2010 - Hierarchical Clustering

•

Hierarchical Clustering:

– Initially places each case in its own cluster

– The two most similar cases are then combined

– This process is repeated until all cases are agglomerated into a single cluster (once a case has been joined it is never re-classified)
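
The three steps can be sketched as follows. The slide does not say how the distance between two clusters is defined, so single linkage (the closest pair of members) is assumed here; the distance matrix is a hypothetical example.

```python
def agglomerate(dist):
    """Single-link agglomerative clustering; returns the merge history."""
    clusters = [frozenset([i]) for i in range(len(dist))]   # each case on its own
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Assumed linkage rule: distance of the closest pair of members.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merged = clusters[a] | clusters[b]                  # joined cases stay joined
        history.append((sorted(merged), d))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return history

dist = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
for members, level in agglomerate(dist):
    print(members, level)
# [0, 1] 1
# [2, 3] 2
# [0, 1, 2, 3] 3
```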

© Thomas Plotkowiak 2010 - Multi Dimensional Scaling

• MDS represents the patterns of similarity or dissimilarity in the profiles among the actors as a "map" in a multi-dimensional space. This map lets us see how "close" actors are and whether they "cluster".

– Stress is a measure of badness of fit

– The author has to determine the meaning of the dimensions

© Thomas Plotkowiak 2010 - Finding automorphic equivalence

(for binary data)

• Brute Force Approach: All the nodes of a graph are exchanged and the distances among all pairs of actors in the new graph are compared to the original one. When the new and the old graph have the same distances among nodes, the "swapping" that was done identified an automorphic position.
• Brute force is expensive (for ten nodes, 10! = 3,628,800 permutations!)
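
A brute-force search is easy to sketch for a tiny graph. Note that this variant checks whether a relabeling preserves the set of arcs (which also preserves all distances) rather than comparing distance matrices; the three-node graph is a hypothetical example.

```python
from itertools import permutations
from math import factorial

def automorphisms(nodes, arcs):
    """All relabelings of the nodes that map the arc set onto itself."""
    found = []
    for perm in permutations(nodes):
        relabel = dict(zip(nodes, perm))
        if {(relabel[u], relabel[v]) for u, v in arcs} == arcs:
            found.append(relabel)
    return found

nodes = ["A", "B", "C"]
arcs = {("A", "B"), ("A", "C")}          # A sends to B and to C
autos = automorphisms(nodes, arcs)

# Pairs of distinct actors that some automorphism maps onto each other.
equivalent = {(u, m[u]) for m in autos for u in nodes if u != m[u]}
print(len(autos), sorted(equivalent))    # 2 [('B', 'C'), ('C', 'B')]
print(factorial(10))                     # 3628800 permutations for ten nodes
```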

© Thomas Plotkowiak 2010 - Regular Equivalence

Block Matrix

Informal Definition: Two actors are regularly equivalent if they

have similar patterns of ties to equivalent others.

Problem: Each definition of each position depends on its relations

with other positions. Where to start?

Sender

Repeater

Receiver

© Thomas Plotkowiak 2010 - Regular Equivalence

Block Matrix Block Image

•

Create a matrix so that each actor in each partition has the

same pattern of connection to actors in the other partition.

– Notice: We don’t care about ties among members of the same regular

class!

– A sends to {BCD} but none of {EFGHI}

– {BCD} does not send to A but to {EFGHI}

– {EFGHI} does not send to A or {BCD}

Block matrix:

   A    B    C    D    E    F    G    H    I
A  ---  1    1    1    0    0    0    0    0
B  0    ---  0    0    1    1    0    0    0
C  0    0    ---  0    0    0    1    0    0
D  0    0    0    ---  0    0    0    1    1
E  0    0    0    0    ---  0    0    0    0
F  0    0    0    0    0    ---  0    0    0
G  0    0    0    0    0    0    ---  0    0
H  0    0    0    0    0    0    0    ---  0
I  0    0    0    0    0    0    0    0    ---

Block image:

           A    B,C,D  E,F,G,H,I
A          ---  1      0
B,C,D      0    ---    1
E,F,G,H,I  0    0      ---
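
The block image can be derived from the block matrix by checking each off-diagonal block. In this sketch a block counts as regular (1) when every sender in it has at least one tie into the block and every receiver gets at least one tie from it; that reading of "regular" and the function name are assumptions made for illustration.

```python
# Arcs of the block matrix above (rows send to columns).
arcs = {("A", "B"), ("A", "C"), ("A", "D"),
        ("B", "E"), ("B", "F"), ("C", "G"), ("D", "H"), ("D", "I")}
classes = [["A"], ["B", "C", "D"], ["E", "F", "G", "H", "I"]]

def block_type(senders, receivers):
    """0 for a null block, 1 for a regular block, None if it is neither."""
    if not any((s, r) in arcs for s in senders for r in receivers):
        return 0
    every_sender = all(any((s, r) in arcs for r in receivers) for s in senders)
    every_receiver = all(any((s, r) in arcs for s in senders) for r in receivers)
    return 1 if every_sender and every_receiver else None

image = [["---" if a is b else block_type(a, b) for b in classes] for a in classes]
for row in image:
    print(row)
# ['---', 1, 0]
# [0, '---', 1]
# [0, 0, '---']
```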

© Thomas Plotkowiak 2010 - Algorithms for detection of Regular Equivalence

Tabu Search

• This method of blocking relies on extensive use of the computer. Tabu search tries to implement the same idea of grouping together the actors who are most similar into a block.
• Tabu search does this by searching for sets of actors who, if placed into a block, produce the smallest sum of within-block variances in the tie profiles.
• If actors in a block have similar ties, their variance around the block mean profile will be small.
• So, the partitioning that minimizes the sum of within-block variances minimizes the overall variance in tie profiles.

© Thomas Plotkowiak 2010 - Algorithms for detection of Regular Equivalence

Tabu Search Results

[Figure: information exchange network, nodes 1 to 10, grouped by block]

(2,5), for example, are pure "repeaters".

The set {6, 10, 3} sends to only two other types (not all three other types) and receives from only one other type.

© Thomas Plotkowiak 2010 - Blockmodeling

Blockmodeling is able to include all kinds of equivalences into one analysis.

Examples of blocks:
• Complete blocks (everybody is connected with each other inside the block)
• Null blocks (people in this block are not connected to anybody)
• Regular blocks (people in this block share the same regular equivalence class)

© Thomas Plotkowiak 2010 - Blockmodels

Matrix Permutation

© Thomas Plotkowiak 2010 - Blockmodels

Student Government. Discussion relation among the eleven students who were members of the student government at the University of Ljubljana in Slovenia. The students were asked to indicate with whom of their fellows they discussed matters concerning the administration of the university informally.

© Thomas Plotkowiak 2010 - General Blockmodeling with predefined partitions

© Thomas Plotkowiak 2010 - Blockmodeling based on actors-attributes

© Thomas Plotkowiak 2010 - Blockmodels

Matrix Representation

© Thomas Plotkowiak 2010 - Blockmodels

Matrix Permutation

© Thomas Plotkowiak 2010 - 2. Network Measures

2.3 Network Measures for Subgroups

Cohesive Subgroups - Cohesive Subgroups

Cohesive subgroups: We hypothesize that cohesive subgroups

are the basis for solidarity, shared norms, identity and

collective behavior. Perceived similarity, for instance,

membership of a social group, is expected to promote

interaction. We expect similar people to interact a lot, at least

more often than with dissimilar people.

© Thomas Plotkowiak 2010 - Example – Families in Haciendas (1948)

Each arc represents "frequent visits" from one family to another.

© Thomas Plotkowiak 2010 - Components

A semiwalk from vertex u to vertex v is a sequence of lines such that the end vertex of one line is the starting vertex of the next line, and the sequence starts at vertex u and ends at vertex v.

A walk is a semiwalk with the additional condition that none of its lines is an arc of which the end vertex is the arc's tail.

Note that v5 v3 v4 v5 v3 is also a walk to v3.

© Thomas Plotkowiak 2010 - Paths

A semipath is a semiwalk in which no vertex in between the first and last vertex of the semiwalk occurs more than once.

A path is a walk in which no vertex in between the first and last vertex of the walk occurs more than once.

© Thomas Plotkowiak 2010 - Connectedness

A network is (weakly) connected if each pair of vertices is connected by a semipath.

A network is strongly connected if each pair of vertices is connected by a path.

This network is not connected because v2 is isolated.

© Thomas Plotkowiak 2010 - Connected Components

A (weak) component is a maximal (weakly) connected subnetwork.

A strong component is a maximal strongly connected subnetwork.

v1, v3, v4, v5 are a weak component
v3, v4, v5 are a strong component
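
Both kinds of components can be found with a simple reachability search. The figure is not part of the text, so the arcs below are an assumption: the cycle v5 -> v3 -> v4 -> v5 follows the walk mentioned earlier, and the arc v1 -> v3 is a guess that attaches v1 weakly.

```python
def reachable(start, neighbors):
    """All vertices that can be reached from start via the given neighbor map."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbors[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def components(nodes, arcs, strong):
    succ = {v: set() for v in nodes}
    pred = {v: set() for v in nodes}
    for u, v in arcs:
        succ[u].add(v)
        pred[v].add(u)
    either = {v: succ[v] | pred[v] for v in nodes}   # arcs with direction ignored
    found = []
    for v in nodes:
        if any(v in comp for comp in found):
            continue
        if strong:   # reachable in both directions along the arcs
            comp = reachable(v, succ) & reachable(v, pred)
        else:        # reachable by a semipath
            comp = reachable(v, either)
        found.append(comp)
    return [sorted(c) for c in found]

nodes = ["v1", "v2", "v3", "v4", "v5"]
arcs = {("v1", "v3"), ("v3", "v4"), ("v4", "v5"), ("v5", "v3")}   # assumed arcs
print(components(nodes, arcs, strong=False))  # [['v1', 'v3', 'v4', 'v5'], ['v2']]
print(components(nodes, arcs, strong=True))   # [['v1'], ['v2'], ['v3', 'v4', 'v5']]
```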

© Thomas Plotkowiak 2010 - Example Strong Components

1. Net > Components > {Strong, Weak}

© Thomas Plotkowiak 2010 - Cliques and Complete Subnetworks

A clique is a maximal complete subnetwork containing three vertices or more. (Cliques can overlap.)

v2, v4, v5 is not a clique
v1, v6, v5 is a clique
v2, v3, v4, v5 is a clique

© Thomas Plotkowiak 2010 - n-Clique & n-Clan

n-Clique: a maximal subgraph in which every pair of nodes is at most distance n apart, with distances measured in the analyzed graph. A clique is an n-clique with n=1.

n-Clan: an n-clique in which every pair of nodes is also at most distance n apart within the resulting subgraph.

[Figure: example of a 2-Clique and a 2-Clan]

© Thomas Plotkowiak 2010 - n-Clans & n-Cliques

[Figure: network of nodes 1 to 6]

2-Clans: 123, 234, 345, 456, 561, 612
2-Cliques: 123, 234, 345, 456, 561, 612 and 135, 246
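
The two lists can be reproduced by brute force. The figure is missing from the text, so a six-node ring is assumed here because it yields exactly the groups listed; the minimum group size of three and all function names are also assumptions.

```python
from itertools import combinations

nodes = [1, 2, 3, 4, 5, 6]
edges = {(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)}   # assumed six-node ring
adj = {v: {u for e in edges for u in e if v in e and u != v} for v in nodes}

def distances(allowed):
    """Shortest path lengths between all pairs, using only nodes in allowed."""
    dist = {}
    for s in allowed:
        dist[s, s] = 0
        frontier = [s]
        while frontier:
            nxt = []
            for u in frontier:
                for w in adj[u] & allowed:
                    if (s, w) not in dist:
                        dist[s, w] = dist[s, u] + 1
                        nxt.append(w)
            frontier = nxt
    return dist

def within(group, dist, n):
    # Unreachable pairs are treated as farther apart than n.
    return all(dist.get((u, v), n + 1) <= n for u, v in combinations(group, 2))

def n_cliques(n):
    full = distances(set(nodes))                  # distances in the whole graph
    ok = [set(g) for r in range(3, len(nodes) + 1)
          for g in combinations(nodes, r) if within(g, full, n)]
    return [g for g in ok if not any(g < h for h in ok)]   # keep maximal sets only

def n_clans(n):
    # Distances are now measured inside the group itself.
    return [g for g in n_cliques(n) if within(sorted(g), distances(g), n)]

print(len(n_cliques(2)), len(n_clans(2)))   # 8 6
```

The two extra 2-cliques, 135 and 246, drop out as 2-clans because their members are not connected to each other inside the group.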

© Thomas Plotkowiak 2010 - k-Plexes

k-Plex: a maximal subgraph with gs nodes in which each node is connected to at least gs-k nodes.

[Figure: network of nodes 1 to 6]

2-Plexes: 1234, 2345, 3456, 4561, 5612, 6123

In general, k-Plexes are more robust than Cliques and Clans.

© Thomas Plotkowiak 2010 - Overview Subgroups

[Figure: five example networks on nodes 1 to 4; the labels below are listed in the order they appear on the slide]

Top row: 2 Components; 1 Component; 1 Component; 2 2-Clans (341, 412); 1 2-Clan (124); 2 2-Cliques (341, 412); 1 2-Clique (124)
Bottom row (two panels): 1 Component, 1 2-Clan (1234), 1 2-Clique (1234), 1 2-Plex (1234) for each panel; 1 Clique

© Thomas Plotkowiak 2010 - Overview Group Concepts

• 1-Clique, 1-Clan and 1-Plex are identical
• An n-Clan is always included in a higher order n-Clique

[Figure: nested group concepts: Component, 2-Clique, 2-Clan, 2-Plex, Clique]

© Thomas Plotkowiak 2010 - k-Cores

• A k-core is a maximal subnetwork in which each vertex has at least degree k within the subnetwork.
• Net > Components > {Strong, Weak}

© Thomas Plotkowiak 2010 - k-Cores

k-cores are nested, which means that a vertex in a 3-core is also part of a 2-core, but not all members of a 2-core belong to a 3-core.
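
A k-core can be found by repeatedly removing vertices whose degree within the remaining subnetwork is below k. The sketch below is for undirected ties; the example network (a complete group of four with a two-vertex tail) is hypothetical.

```python
def k_core(adj, k):
    """Vertices of the k-core: repeatedly drop vertices with degree below k."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if len(adj[v] & alive) < k:   # degree counted within the remaining set
                alive.remove(v)
                changed = True
    return alive

# Hypothetical network: vertices 1-4 are all tied to each other, 5 hangs on 4, 6 on 5.
adj = {
    1: {2, 3, 4},
    2: {1, 3, 4},
    3: {1, 2, 4},
    4: {1, 2, 3, 5},
    5: {4, 6},
    6: {5},
}
print(sorted(k_core(adj, 1)))  # [1, 2, 3, 4, 5, 6]
print(sorted(k_core(adj, 2)))  # [1, 2, 3, 4]
print(sorted(k_core(adj, 3)))  # [1, 2, 3, 4]
print(sorted(k_core(adj, 4)))  # []
```

The nesting is visible in the output: every vertex of the 3-core is also in the 2-core and the 1-core.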

© Thomas Plotkowiak 2010 - k-Cores Application

• k-cores help to detect cohesive subgroups by removing the lowest k-cores from the network until the network breaks up into relatively dense components.

•

Net > Partitions > Core >{Input, Output, All}

© Thomas Plotkowiak 2010
- 3. Network Mechanisms
- Network Mechanisms

• Tie Outdegree Effect
• Reciprocity
• Transitivity & Three-Cycles Effect
• Balance Effect
• In/Out Popularity Effect
• In/Out Activity Effect
• In/Out Assortativity Effect
• Covariate Similarity Effect
• Covariate Ego-Effect
• Covariate Alter-Effect
• Same Covariate Effect

© Thomas Plotkowiak 2010 - Outdegree Effect

• The most basic effect is defined by the outdegree of actor i. It represents the basic tendency to have ties at all.
• In a decision-theoretic approach this effect can be regarded as the balance of benefits and costs of an arbitrary tie.
• Most networks are sparse (i.e., they have a density well below 0.5), which can be represented by saying that for a tie to an arbitrary other actor (arbitrary meaning here that the other actor has no characteristics or tie pattern making him/her especially attractive to i), the costs will usually outweigh the benefits. Indeed, in most cases a negative parameter is obtained for the outdegree effect.

© Thomas Plotkowiak 2010 - Reciprocity Effect

•

Another quite basic effect is the tendency toward reciprocity,

represented by the number of reciprocated ties of actor i. This

is a basic feature of most social networks (cf. Wasserman and

Faust, 1994, Chapter 13)

[Figure: reciprocated tie between actors i and j]

© Thomas Plotkowiak 2010 - Transitivity and other triadic effects

• Next to reciprocity, an essential feature in most social networks is the tendency toward transitivity, or transitive closure (sometimes called clustering): friends of friends become friends, or in graph-theoretic terminology: two-paths tend to be, or to become, closed (e.g., Davis 1970, Holland and Leinhardt 1971).

[Figure: a transitive triplet and a three-cycle among actors i, j and h]

© Thomas Plotkowiak 2010 - Balance Effect

• An effect closely related to transitivity is balance (Newcomb, 1962), which is the same as structural equivalence with respect to out-ties (Burt, 1982). It is the tendency to have and create ties to other actors who make the same choices as ego.

[Figure: example with actors A, B, C and D]

© Thomas Plotkowiak 2010 - In/Out Popularity Effect

• The degree-related popularity effect is based on the indegree or outdegree of an actor. Nodes with higher indegree, or higher outdegree, are more attractive for others to send a tie to.
• That implies that high indegrees reinforce themselves, which will lead to a relatively high dispersion of the indegrees (a Matthew effect in popularity as measured by indegrees, cf. Merton, 1968 and Price, 1976).

[Figure: example with actors A, B, C and D]

© Thomas Plotkowiak 2010 - In/Out Activity Effect

• Nodes with higher indegree, or higher outdegree respectively, will have an extra propensity to form ties to others.
• The outdegree-related activity effect again is a self-reinforcing effect: when it has a positive parameter, the dispersion of outdegrees will tend to increase over time, or to be sustained if it already is high.

[Figure: example with actors A, B, C and D]

© Thomas Plotkowiak 2010 - Preferential Attachment

•

Notice: These four degree-related effects can be regarded as

the analogues in the case of directed relations of what was

called cumulative advantage by Price (1976) and preferential

attachment by Barabasi and Albert (1999) in their models for

dynamics of non-directed networks: a self-reinforcing process

of degree differentiation.

© Thomas Plotkowiak 2010 - In/Out Assortativity Effect

• Preferences of actors depend on their degrees. Depending on their own out- and in-degrees, actors can have differential preferences for ties to others who also have high or low out- and in-degrees (Morris and Kretzschmar 1995; Newman 2002).

[Figure: example with actors A to F]

© Thomas Plotkowiak 2010 - Covariate Similarity Effect

• The covariate similarity effect describes whether ties tend to occur more often between actors with similar values on a covariate (homophily effect). Tendencies to homophily constitute a fundamental characteristic of many social relations, see McPherson, Smith-Lovin, and Cook (2001).
• Example: iPad owners tend to be friends with other iPad owners.

© Thomas Plotkowiak 2010 - Covariate Ego Effect

• The covariate ego effect describes whether actors with higher values on a covariate tend to nominate more friends and hence have a higher outdegree.
• Example: Heavier smokers have more friends.

© Thomas Plotkowiak 2010 - Covariate Alter Effect

• The covariate alter effect describes whether actors with higher values on a covariate V tend to be nominated by more others and hence have higher indegrees.
• Example: Beautiful people have more friends.

© Thomas Plotkowiak 2010 - Modeling networks

1. Actor-based modeling for longitudinal data
– SIENA (analysis of repeated measures on social networks and MCMC estimation of exponential random graphs)
2. Stochastic modeling for panels
– Pnet

Objective function (estim / s.e. / p; rows with three entries are in the order Model 1 | Model 2 | Model 3):

outdegree (density): -2.46 / 0.12 / <0.0001* | -4.04 / 0.23 / <0.0001* | -1.99 / 0.13 / <0.0001*
reciprocity: 2.57 / 0.20 / <0.0001* | 2.29 / 0.22 / <0.0001* | 3.02 / 0.21 / <0.0001*
transitive triplets: 0.07 / 0.01 / <0.0001*
transitive mediated triplets: -0.03 / 0.01 / 0.0005*
transitive ties: 1.47 / 0.24 / <0.0001*
3-cycles: -0.06 / 0.02 / 0.0037*
attribute party: 1.13 / 0.15 / <0.0001* | 0.73 / 0.15 / <0.0001*
attribute gender: -0.11 / 0.15 / 0.48

© Thomas Plotkowiak 2010 - 3. Network Theories

Homophily & Assortativity

Power Laws & Preferential Attachment

The Strength of Weak Ties

Small Worlds

Social Capital - 3.1 Homophily
- Homophily

•

Homophily (i.e., love of the same) is the tendency of

individuals to associate and bond with similar others.

(Mechanisms of selection vs influence)

•

In the study of networks, assortative mixing is a bias in favor of

connections between network nodes with similar characteristics. In the

specific case of social networks, assortative mixing is also known as

homophily. The rarer disassortative mixing is a bias in favor of connections

between dissimilar nodes.

Low Homophily

High Homophily

© Thomas Plotkowiak 2010 - Homophily II

Types (acc. to McPherson et al. 2001):
– Race and Ethnicity (Marsden 1987, 88, Louch 2000, Kalleberg et al. 1996, Laumann 1973 …)
– Sex and Gender (Maccoby 1998, Eder & Hallinan 1978, Shrum et al. 1988, Huckfeldt & Sprague 1995, Brass 1985 …)
– Age (Fischer 1977, 82, Feld 1982, Blau et al. 1991, Burt 1990, 91 …)
– Religion (Laumann 1973, Verbrugge 1977, Fischer 1977, 82, Marsden 1988, Louch 2000 …)
– Education, Occupation and Social Class (Laumann 1973, Marsden 1987, Verbrugge 1977, Wright 1997, Kalmijn 1998 …)
– Network Positions (Brass 1985, Burt 1982, Friedkin 1993 …)
– Behavior (Cohen 1977, Kandel 1978, Knoke 1990 …)
– Attitudes, Abilities, Beliefs and Aspirations (Jussim & Osgood 1989, Huckfeldt & Sprague 1995, Verbrugge 1977, 83, Knoke 1990)

© Thomas Plotkowiak 2010 - Schelling's Segregation Demo

© Thomas Plotkowiak 2010 - 3.2 Power Laws & Preferential

Attachment - Power Law distribution

•

As a function of k, what fraction of pages on the Web have k

in-links?

• A natural guess: the normal, or Gaussian, distribution
• Central Limit Theorem (roughly): if we take any sequence of small independent random quantities, then in the limit their sum will be distributed according to the normal distribution

© Thomas Plotkowiak 2010 - Power Law distribution

But when people measured the Web, they found something

very different: The fraction of Web pages that have k in-links is approximately

proportional to 1/k^2

• Power law function

• Popularity exhibits extreme imbalances: there are few very popular Web

pages that have extremely many in-links

True for other domains:

• the fraction of telephone numbers that receive k calls per day: 1/k^2

• the fraction of books bought by k people: 1/k^3

• the fraction of scientific papers that receive k citations: 1/k^3

© Thomas Plotkowiak 2010 - Preferential attachment leads to power laws

•

A preferential attachment process is any of a class of

processes in which some quantity, typically some form of

wealth or credit, is distributed among a number of individuals

or objects according to how much they already have, so that

those who are already wealthy receive more than those who

are not. Notice: "Preferential attachment" (A.L. Barabasi and

R.Albert 1999) is only the most recent of many names that

have been given to such processes.

•

Notice: Preferential attachment can, under suitable

circumstances, generate power law distributions.
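
The process can be sketched as a small simulation. This is a simplified illustration rather than the exact model of any cited paper: each new node brings one tie and attaches to an existing node with probability proportional to that node's current degree. The function name and parameters are assumptions.

```python
import random
from collections import Counter

def preferential_attachment(n_nodes, seed=0):
    """Grow a network one node at a time; each new node adds one tie."""
    rng = random.Random(seed)
    endpoints = [0, 1]                 # one entry per tie end; start with the tie 0-1
    degree = Counter({0: 1, 1: 1})
    for new in range(2, n_nodes):
        # Drawing a tie end uniformly picks a node with probability
        # proportional to its degree: the rich get richer.
        old = rng.choice(endpoints)
        degree[new] += 1
        degree[old] += 1
        endpoints += [new, old]
    return degree

degree = preferential_attachment(2000)
counts = Counter(degree.values())      # how many nodes have degree k
for k in sorted(counts)[:5]:
    print(k, counts[k] / len(degree))
print("largest degree:", max(degree.values()))
```

Printing the share of nodes per degree gives a rough impression of how unevenly ties are distributed; checking for a power law properly would need a larger run and a log-log plot.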

© Thomas Plotkowiak 2010 - 3.3 Balance Theory
- Balance Theory

Fritz Heider

Fritz Heider (1946): A person (P) feels uncomfortable when he or she disagrees with his or her friend (O) on a topic (X). P feels an urge to change this imbalance. He can adjust his opinion, change his affection for O, or convince himself that O is not really opposed to X.

© Thomas Plotkowiak 2010 - Balance Theory

(a) + + + : three people are mutual friends
(b) + + - : A is a friend of B and C, but B and C are enemies
(c) - - + : two people are friends, and they have a mutual enemy in the third
(d) - - - : all enemies; motivates two of them to "team up" against the third

b and d represent unstable relationships
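
A common way to state this rule is that a triad is balanced when the product of its three signs is positive. The sketch below applies that rule to the four cases; the function name is illustrative.

```python
def is_balanced(signs):
    """A signed triad is balanced when the product of its three signs is positive."""
    product = 1
    for s in signs:
        product *= s
    return product > 0

triads = {
    "a": (+1, +1, +1),   # three mutual friends
    "b": (+1, +1, -1),   # A is friends with B and C, who are enemies
    "c": (-1, -1, +1),   # two friends with a mutual enemy
    "d": (-1, -1, -1),   # all enemies
}
for name, signs in triads.items():
    print(name, "balanced" if is_balanced(signs) else "unstable")
# a balanced
# b unstable
# c balanced
# d unstable
```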

© Thomas Plotkowiak 2010 - Balance Theory

Community in a New England Monastery

Young Turks (1), Loyal Opposition (2), Outcasts (3) Interstitial Group (4)

© Thomas Plotkowiak 2010 - Balance Theory

International Relations

© Thomas Plotkowiak 2010 - 3.4 Strength of weak ties
- Strength of Weak Ties

Mark Granovetter

•

“One of the most influential sociology papers ever

written” (Barabasi)

– One of the most cited (Current Contents, 1986)

•

Accepted by the American Journal of Sociology after

4 years of unsuccessful attempts elsewhere.

•

Interviewed people and asked: “How did you find

your job?”

– Kept getting the same answer: “through an acquaintance,

not a friend”

© Thomas Plotkowiak 2010 - Basic Argument

•

Classify interpersonal relations as “strong”, “weak”, or “absent”

• Strength is (vaguely) defined as “a (probably linear) combination of…
– the amount of time,
– the emotional intensity,
– the intimacy (mutual confiding),
– and the reciprocal services which characterize the tie”

•

The stronger the tie between two individuals, the larger the proportion of people to whom they are both tied (weakly or strongly)
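The last claim can be checked on data with a simple overlap measure. The sketch below uses neighborhood overlap, one common proxy from the later literature rather than Granovetter's own definition; the example network and the function name are assumptions made for illustration.

```python
def neighborhood_overlap(graph, a, b):
    """Share of the other contacts of a and b that are contacts of both."""
    friends_a = graph[a] - {b}
    friends_b = graph[b] - {a}
    union = friends_a | friends_b
    if not union:
        return 0.0
    return len(friends_a & friends_b) / len(union)

# Hypothetical undirected network, stored as node -> set of contacts.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D", "E"},
    "C": {"A", "B"},
    "D": {"A", "B"},
    "E": {"B", "F"},
    "F": {"E"},
}
print(neighborhood_overlap(graph, "A", "B"))  # embedded tie: many shared contacts
print(neighborhood_overlap(graph, "B", "E"))  # bridging tie: no shared contacts
```

If the argument on the slide holds, strong ties should show high overlap and weak ties low overlap.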

- Strong Ties

•

If person A has a strong tie to both B and C, then it is unlikely

for B and C not to share a tie.

[Figure: triad with nodes A, B, and C]
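A minimal sketch of how one might look for exceptions to this rule in a data set, assuming ties have already been labelled strong or weak. The dictionaries `strong` and `all_ties` and the function name are hypothetical.

```python
from itertools import combinations

def closure_violations(strong, all_ties):
    """Return (a, b, c) where a has strong ties to b and c,
    but b and c share no tie at all (neither weak nor strong)."""
    violations = []
    for a, partners in strong.items():
        for b, c in combinations(sorted(partners), 2):
            if c not in all_ties.get(b, set()):
                violations.append((a, b, c))
    return violations

# Hypothetical data: strong ties only, and all ties (weak or strong).
strong = {"A": {"B", "C"}, "B": {"A"}, "C": {"A", "D"}, "D": {"C"}}
all_ties = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(closure_violations(strong, all_ties))
```

Here A's two strong ties are closed by a tie between B and C, while C's strong ties to A and D are not, so only the second triad is reported.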

- Weak Ties for Information Diffusion

“Intuitively speaking, this means that whatever is to be diffused can reach a larger number of people, and traverse greater social distance, when passed through weak ties rather than strong.”

- 3.4 Small World Phenomenon
- Connectivity and the Small World

1. Travers and Milgram’s work on the small world is responsible

for the standard belief that “everyone is connected by a chain

of about 6 steps.”

2. Two questions:

– Given what we know about networks, what is the longest path (defined

by handshakes) that separates any two people?

– Is 6 steps a long distance or a short distance?

- Example: Two Hermits on opposite sides of the country

[Figure: chain of acquaintances linking the two hermits: OH Hermit → Store Owner → Truck Driver → Manager → Corporate Manager → Corporate President → Congress Rep. → Congress Rep. → Corporate President → Corporate Manager → Manager → Truck Driver → Store Owner → Mt. Hermit]

- Milgram’s Test

Milgram’s test: Send a packet from sets of randomly selected

people to a stockbroker in Boston.

Experimental Setup: Arbitrarily select people from 3 pools:

– People in Boston

– Random people in Nebraska

– Stockholders in Nebraska

- Results

•

Most chains found their way through a small number of intermediaries.

•

What do these two findings tell us about the global structure of social relations?

- Results II

1. Social networks contain a lot of short paths

2. People acting without any sort of global ‘map’ are effective at collectively finding these short paths

- The Watts-Strogatz model

•

Two main principles explaining short paths: homophily and weak ties:

• Homophily: every node forms a link to all other nodes that lie within a radius of r grid steps

• Weak ties: each node forms a link to k other random nodes

•

Suppose everyone lives on a two-dimensional grid (as a model of geographic proximity)
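The grid model described above can be sketched as follows. Several details are assumptions made for the example and are not fixed by the slide: "radius" is read as the number of grid steps (Manhattan distance), links are treated as undirected, a random draw that hits the node itself is skipped, and the function names are illustrative.

```python
import random
from collections import deque

def grid_small_world(n, r=1, k=1, seed=0):
    """n x n grid: local links within r grid steps, plus up to k random links per node."""
    rng = random.Random(seed)
    nodes = [(x, y) for x in range(n) for y in range(n)]
    adj = {v: set() for v in nodes}
    for (x, y) in nodes:
        # Homophily: link to every node within r grid steps.
        for dx in range(-r, r + 1):
            for dy in range(-r, r + 1):
                w = (x + dx, y + dy)
                if 0 < abs(dx) + abs(dy) <= r and w in adj:
                    adj[(x, y)].add(w)
                    adj[w].add((x, y))
        # Weak ties: link to k nodes drawn uniformly at random.
        for _ in range(k):
            w = rng.choice(nodes)
            if w != (x, y):
                adj[(x, y)].add(w)
                adj[w].add((x, y))
    return adj

def mean_distance_from(adj, source):
    """Breadth-first search: mean number of steps from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return sum(dist.values()) / (len(dist) - 1)

lattice = grid_small_world(30, r=1, k=0)       # local links only
small_world = grid_small_world(30, r=1, k=1)   # plus one random link per node
print("local links only:", mean_distance_from(lattice, (0, 0)))
print("with random links:", mean_distance_from(small_world, (0, 0)))
```

Comparing the two printed values shows how much the random links shorten the average distance from a corner of the grid.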

- Watts-Strogatz

- The Watts-Strogatz model

•

Suppose we allow only one out of every k nodes to have a single random friend

•

A k × k square then has k random links; consider it as a single node

•

A surprisingly small amount of randomness is enough to make the world “small”, with short paths between every pair of nodes

- Decentralized Search

•

People are able to collectively find short paths to the designated target even though they don’t know the global ‘map’ of all connections

•

Breadth-first search vs. tunneling

•

Modeling:

– Can we construct a network where decentralized search succeeds?

– If yes, what are the qualitative properties of such a network?

- A model for decentralized search

•

A starting node s is given a message that it must forward to a

target node t

•

s knows only the location of t on the grid, but s doesn’t know

the edges out of any other node

•

Model must span all the intermediate ranges of scale as well
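One simple forwarding rule that fits these constraints is greedy routing: each holder passes the message to whichever of its own contacts lies closest to t on the grid. The sketch below assumes one long-range contact per node, chosen uniformly at random; the names `greedy_route` and `long_range` are illustrative and not from the slides.

```python
import random

def grid_distance(a, b):
    """Number of grid steps between two grid positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def greedy_route(n, long_range, s, t):
    """Forward a message from s to t on an n x n grid.
    Each holder sees only its own contacts and the position of t."""
    path = [s]
    current = s
    while current != t:
        x, y = current
        # Local contacts: the grid neighbours that exist.
        contacts = [(x + dx, y + dy)
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= x + dx < n and 0 <= y + dy < n]
        # Plus the holder's single long-range contact, if any.
        if current in long_range:
            contacts.append(long_range[current])
        # Some grid neighbour is always one step closer, so this terminates.
        current = min(contacts, key=lambda c: grid_distance(c, t))
        path.append(current)
    return path

rng = random.Random(1)
n = 20
nodes = [(x, y) for x in range(n) for y in range(n)]
long_range = {v: rng.choice(nodes) for v in nodes}  # assumed: uniform random contacts
s, t = (0, 0), (n - 1, n - 1)
path = greedy_route(n, long_range, s, t)
print(len(path) - 1, "steps; grid distance is", grid_distance(s, t))
```

The route never needs more steps than the grid distance, and a long-range contact is used only when it brings the message closer to the target than any neighbour would.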

- Modeling the process of decentralized search

•

We adapt the model by introducing a clustering exponent q

• For two nodes v and w, d(v,w) is the number of grid steps between them

•

Random edges are now generated with probability proportional to $d(v,w)^{-q}$

•

The model changes with different values of q:

– q=0: links are chosen uniformly at random

– when q is very small: long-range links are “too random”

– when q is large: long-range links are “not random enough”
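The effect of q on the random links can be seen by sampling them directly. A minimal sketch, assuming a 21 x 21 grid and a node at its centre; the function name and all parameter values are chosen only for illustration.

```python
import random

def sample_long_range_contact(v, n, q, rng):
    """Pick one contact w != v with probability proportional to d(v, w) ** (-q)."""
    candidates = [(x, y) for x in range(n) for y in range(n) if (x, y) != v]
    weights = [(abs(x - v[0]) + abs(y - v[1])) ** (-q) for x, y in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

rng = random.Random(0)
center, n = (10, 10), 21
mean_length = {}
for q in (0, 2, 4):
    lengths = []
    for _ in range(2000):
        w = sample_long_range_contact(center, n, q, rng)
        lengths.append(abs(w[0] - center[0]) + abs(w[1] - center[1]))
    mean_length[q] = sum(lengths) / len(lengths)
print(mean_length)
```

With q=0 the sampled links span large distances on average, while for large q they rarely reach beyond the immediate neighbourhood; q=2 lies between the two.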

- Varying clustering exponent

- Decentralized Search when q=2

Experiments show that decentralized search is most efficient when q=2 (random links follow an inverse-square distribution)

- What’s special about q=2

•

Since area in the plane grows like the square of the radius, the total number of nodes in a ring at distances between d and 2d from a given node is proportional to $d^2$

•

With q=2, each node in the ring is chosen with probability proportional to $d^{-2}$, so the probability that a random edge links into some node in this ring is approximately independent of the value of d.

•

Long-range weak ties are therefore formed in a way that’s spread roughly uniformly over all different scales of resolution

Think of the postal system: country, state, city, street, and finally the street number
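The argument can be written out as a short calculation. This is a sketch only: the ring is taken to be the set of nodes at distances between d and 2d from a node v, and $c_1$, $c_2$ stand for constants that do not depend on d.

```latex
\[
\Pr\bigl[\text{random edge from } v \text{ lands in the ring } d \le d(v,w) < 2d\bigr]
\;\approx\;
\underbrace{c_1\, d^{2}}_{\text{nodes in the ring}}
\cdot
\underbrace{c_2\, d^{-q}}_{\text{chance per node}}
\;=\; c_1 c_2\, d^{\,2-q}.
\]
```

Only for q=2 does the exponent vanish, so every distance scale receives about the same share of long-range links. For q<2 the far rings dominate, and for q>2 the near ones do.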

- Small-World Phenomenon

Conclusions I

1. Start from Milgram’s experiment: (1) there seem to be short paths and (2) people know how to find them effectively

2. Build mathematical models for (1) and (2)

3. Make a prediction based on the models: clustering exponent

q=2

4. Validate this prediction using real data from large social

networks (LiveJournal, Facebook)

Why do social networks arrange themselves in a pattern of

friendships across distance that is close to optimal for forwarding

messages to far-off targets?

- Small-World Phenomenon

Conclusions II

•

If there are dynamic forces or selective pressures driving the

network toward this shape, they must be more implicit, and it

remains a fascinating open problem to determine whether

such forces exist and how they might operate.

•

Robustness, Search, Spread of disease, opinion formation,

spread of computer viruses, gossip,…

•

For example: Diseases move more slowly in highly clustered

graphs

•

The dynamics are very non-linear, with no clear pattern based on local connectivity.

Implication: small local changes (shortcuts) can have dramatic

global outcomes (think of disease diffusion)

- Small World Construction

•

Network changes from structured to random

•

Given 6 billion nodes, the path length L starts at 3 million and decreases to 4 (!)

•

Clustering: starts at 0.75, decreases to zero

•

Most important is what happens ALONG the way.
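The orders of magnitude above can be reproduced with the standard approximations for the two extremes, a ring lattice and a random graph. The number of contacts per person, k = 1000, is an assumption chosen here because it yields the 3 million on the slide; the slide itself does not state it, and the function names are illustrative.

```python
import math

def ring_lattice_stats(n, k):
    """Approximate path length and clustering of a ring lattice
    in which every node is tied to its k nearest neighbours."""
    path_length = n / (2 * k)
    clustering = 3 * (k - 2) / (4 * (k - 1))
    return path_length, clustering

def random_graph_stats(n, k):
    """Approximate path length and clustering of a random graph with mean degree k."""
    path_length = math.log(n) / math.log(k)
    clustering = k / n
    return path_length, clustering

n, k = 6_000_000_000, 1000  # k = 1000 is an assumed value, not given on the slide
print("structured:", ring_lattice_stats(n, k))
print("random:    ", random_graph_stats(n, k))
```

Under these assumptions the structured end gives a path length of 3 million and a clustering of about 0.75, while the random end gives a path length between 3 and 4 and a clustering close to zero.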

- Interactive Summary

The biggest advantage I can gain by using SNA is…

The most important fact about SNA for me is…

The concept that made the most sense for me today was…

The biggest danger in using SNA is …

If I use SNA in the future, I will try to make sure that…

If I use SNA in my next project I will use it for …

I should change my perspective on networks by considering …

I have changed my opinion about SNA, finding out that…

I missed today that …

Before attending this seminar I didn't know that …

I wish we could have covered…

If I forget almost everything that I learned today, I will still remember …

The most important thing today for me was …

- Thanks for your attention!

Questions & Discussion