Social Network Analysis

class: center, middle, inverse, title-slide

# Social Network Analysis
## Network Data, Centrality, Transitivity, Clustering
### Paulo Serôdio
### 2019-05-20

---

#  Goals of this Seminar

+ Introduction to Core Social Network Concepts
	+ Overview of the field and the tools
	+ Mathematical foundations
	+ SNA Data & Survey Design
	+ Centrality
	+ Social Capital
	+ Cohesion
	+ Subgroups
	+ Equivalence (Role & Position)
	+ Hypothesis testing
+ Introduction to network analysis in R

---
#  Structure of the Course

+ **Monday** : Introduction, Global & Local Network properties
	+ Algebra
	+ Graph theory
	+ Network data
	+ Centrality & Centralization
	+ Transitivity & Clustering
+ **Tuesday** : Social Capital, Brokerage and Equivalences
+ **Wednesday** : Cohesion, hypothesis testing and inferential networks

---

# Objectives of the course

- Build intuition
- Expose key concepts
- Highlight big questions
- Provide abstract examples
- Some pointers to other studies
- *NOT* a substitute for technical work

---
#  Introduction

+ Name
+ Affiliation
+ Discipline
+ SNA Experience/Knowledge
+ Phenomena of interest

---
# shortest path Amersham to Woolwich Arsenal

---

And people understand them intuitively

---
#  Growth in Multiple Areas

+ Pop Culture
	+ Kevin Bacon
	+ Online “social network sites”
+ Business Practitioners
	+ New consulting tools
	+ Knowledge management
+ Academics
	+ In multiple fields from communication to epidemiology
![](assets/img/image8.jpeg){width=10px}

---
#  What Defines SNA?

+ Phenomenon studied
	+ distinctive type of data
+ Perspective taken
	+ Perhaps one perspective, but multiple theories
+ Methodological toolkit
	+ new concepts, new tools

---
#  Reasoning about Networks

+ What can we achieve from studying networks?
	+ Patterns and statistical properties of network data;
	+ Design principles and models;
	+ Understand the organisation of networks;
+ How can we reason about networks?
	+ **Empirical** : study data; measure and quantify;
	+ **Mathematical models** : graph theory & statistics; distinguish surprising from expected phenomena
	+ **Algorithms** : for hard computational challenges

---

# how mathematicians reason about networks

- Mathematicians are concerned with the abstract structure of a graph

- Mathematicians define operations to analyze and manipulate graphs. Moreover, they develop theorems based upon structural axioms.

---

# how physicists reason about networks

- Physicists are concerned with modeling real-world structures with networks.

- Physicists define algorithms that compress the information in a network into simpler summary values (e.g. through statistical analysis).

---

<img src="assets/img/image12.png" width="100%" style="display: block; margin: auto;" />
---

# Much of the World has a Graphical/Network Structure

- **Social networks**: define how persons interact (collaborators, friends, kin).
- **Biological networks**: define how biological components interact (proteins, food chains, genes).
- **Transportation networks**: define how cities are joined by air and road routes.
- **Dependency networks**: define how software modules use each other.
- **Communication networks** 
- **Language networks**: define the relationships between words.

---
#  History of SNA

+ 1736 Euler
+ 1930s Sociometry
+ 1940s Psychologists
+ 1950s & 60s Anthropologists
+ 1970s Rise of Sociologists
	+ Small Worlds, Strength of weak ties
+ 1980s IBM computation
	+ Computer programs developed
+ 1990s Ideas spread
	+ UCINET released, spread of network analysis to multiple fields, social capital, embedded ties
+ 2000s Physicists jump on the bandwagon

---
#  What is a Network?

+ A set of dyadic ties, all of the same type, among a set of actors (nodes)
<img src="assets/img/image14.png" width="100%" style="display: block; margin: auto;" />
---
#  Popular Social Network Theories

+ Small World Phenomenon (6 degrees of separation)
+ Strength of Weak Ties (information diffusion; job market)
+ Embeddedness  (“What would economic life be like if people didn’t have social relationships?”)
+ Social Capital (cooperation and social networks have value)

---
#  Relations Matter

Attributes vs. Relations (Discovery of HIV: sexual contact among gay men with unusual cancer, traced by Darrow at the CDC)

<img src="assets/img/image15.png" width="50%" style="display: block; margin: auto;" />
---
# Structure Matters

Medieval trade on Russian rivers

![](assets/img/image17.jpeg)

---

![](assets/img/image18.jpeg)

---
#  Why Study Networks?

+ Prevent the spread of disease
+ Make the world a better place
+ Improve organizational effectiveness

---
#  Prevent the spread of disease

<img src="assets/img/image19.png" width="70%" style="display: block; margin: auto;" />
---
#  Improve Organizational Outcomes

![](assets/img/image20.png)

---
#  Make the World A Better Place

![](assets/img/image21.png)

---
#  Tools & Software

+ DISCLAIMER:
	+ This course focuses on **R** and, if there is time, **Gephi**. They are not the only tools available for social network analysis and visualization, but they are very popular and offer a good balance of usability and capability

---
#  Other (beginner) Software tools

+ UCINET + NetDraw
	+ **The** social network analysis software
+ Domain specific
	+ SIENA (time series analysis of networks)
	+ Pajek (Better at computational analysis of really large networks)
	+ E-Net (analyzing ego networks)
	+ KeyPlayer (influencing or disrupting networks)

---
#  Defining & Describing a network

+ In social network analysis, we draw on two major areas of mathematics regularly:
	+ **Matrix Algebra**
		+ Tables of numbers
		+ Operations on matrices enable us to draw conclusions we couldn’t just intuit
	+ **Graph Theory**
		+ Branch of discrete math that deals with collections of ties among nodes and gives us concepts like paths

---
#  Network vs. Case Perspective

+ One of the biggest differences between the SNA perspective and more traditional social science perspectives is the nature of the data
	+ Traditional case data: we collect the same information (attributes) for each individual
	+ Network data: we collect information about the interactions between pairs of people

---
#  Mainstream Logical Data Structure

+ 2-mode rectangular matrix in which rows (cases) are entities or objects and columns (variables) are attributes of the cases
+ Analysis consists of correlating columns
  + Emphasis on explaining one variable

<table class="table" style="font-size: 20px; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> ID </th>
   <th style="text-align:left;"> Age </th>
   <th style="text-align:left;"> Education </th>
   <th style="text-align:left;"> Salary </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;">  </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 2 </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;">  </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 3 </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;">  </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 4 </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;">  </td>
  </tr>
</tbody>
</table>

---
#  Network Logical Data Structures
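
As a rough illustration of the contrast with the case-by-variable table, network data record ties between pairs of actors, usually as an edge list or as an actor-by-actor adjacency matrix. The sketch below uses plain Python; the actor names and ties are invented for illustration.

```python
# Hypothetical actors and undirected ties (illustrative data, not from the course)
actors = ["Ann", "Bob", "Cat", "Dan"]
edge_list = [("Ann", "Bob"), ("Ann", "Cat"), ("Cat", "Dan")]

# Build the actor-by-actor adjacency matrix: 1 if a tie exists, 0 otherwise
index = {name: i for i, name in enumerate(actors)}
adjacency = [[0] * len(actors) for _ in actors]
for sender, receiver in edge_list:
    i, j = index[sender], index[receiver]
    adjacency[i][j] = 1
    adjacency[j][i] = 1  # undirected tie: record it in both directions

for name, row in zip(actors, adjacency):
    print(f"{name:>3}", row)
```

Rows and columns index the same set of actors, so the unit of observation is the pair rather than the individual.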

---
#  representing networks – simple undirected

<img src="assets/img/image23.png" width="100%" style="display: block; margin: auto;" />
---
#  representing networks – complex

<img src="assets/img/image24.jpeg" width="100%" style="display: block; margin: auto;" />
---

# representing networks – bipartite networks

---

# describing networks

<img src="assets/img/image30.png" width="100%" style="display: block; margin: auto;" />
---
# representing networks – link types

<img src="assets/img/image31.png" width="100%" style="display: block; margin: auto;" />
---

# representing networks – network modes

<img src="assets/img/image32.png" width="100%" style="display: block; margin: auto;" />
---

# representing networks – directed networks

<img src="assets/img/image33.png" width="100%" style="display: block; margin: auto;" />
---

# representing networks – symmetric networks

<img src="assets/img/image34.png" width="100%" style="display: block; margin: auto;" />
---

# representing networks – affiliation networks

<img src="assets/img/image34.png" width="100%" style="display: block; margin: auto;" />
---

# Matrix Algebra

In this section, we will cover:

- Matrix Concepts, Notation & Terminologies
- Adjacency Matrices
- Transposes
- Matrix Operations

---
#  Matrices

+ Symbolized by a capital letter, like `$A$`
+ Each cell in the matrix is identified by row and column subscripts: `$a_{ij}$`
  + First subscript is row, second is column

<table class="table" style="font-size: 18px; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> ID </th>
   <th style="text-align:left;"> Age </th>
   <th style="text-align:left;"> Gender </th>
   <th style="text-align:left;"> Income </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Mary </td>
   <td style="text-align:left;"> a_11 </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;">  </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Bill </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;">  </td>
  </tr>
  <tr>
   <td style="text-align:left;"> John </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;"> a_32 </td>
   <td style="text-align:left;">  </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Larry </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;">  </td>
  </tr>
</tbody>
</table>
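
A small sketch of the subscript convention, in plain Python. The values in the matrix and the `cell` helper are hypothetical; note that Python lists are 0-based, so the helper shifts the 1-based subscripts used on the slide.

```python
# Hypothetical case-by-variable matrix A: columns are Age, Gender, Income
# (values are made up; gender is coded 0/1 purely for illustration)
A = [
    [34, 1, 52000],  # Mary
    [45, 0, 61000],  # Bill
    [29, 0, 48000],  # John
    [51, 0, 75000],  # Larry
]

def cell(matrix, i, j):
    """Return a_ij using 1-based subscripts: i is the row, j is the column."""
    return matrix[i - 1][j - 1]

a_11 = cell(A, 1, 1)  # row 1, column 1: Mary's age
a_32 = cell(A, 3, 2)  # row 3, column 2: John's gender code
print(a_11, a_32)
```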

---
#  Vectors

+ Each row and each column in a matrix is a vector
	+ Vertical vectors are column vectors, horizontal are row vectors
+ Denoted by a lowercase bold letter: **y**
+ Each cell in the vector is identified by a subscript: `$y_i$`
---
#  Ways and Modes

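+ Ways are the dimensions of a matrix (a table with rows and columns is 2-way).
+ Modes are the distinct sets of entities indexed by the ways of a matrix (actor-by-actor is 1-mode; actor-by-event is 2-mode)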
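
One way to see the difference, sketched in Python with invented data: both matrices below are 2-way, but only the second has two modes. The `shape` helper is a hypothetical convenience, not a library function.

```python
# 2-way, 1-mode: actor-by-actor matrix (rows and columns index the same set)
actors = ["Ann", "Bob", "Cat"]
friendship = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

# 2-way, 2-mode: actor-by-event matrix (rows and columns index different sets)
events = ["Party", "Seminar"]
attendance = [
    [1, 0],
    [1, 1],
    [0, 1],
]

def shape(matrix):
    """Return (number of rows, number of columns)."""
    return len(matrix), len(matrix[0])

print(shape(friendship))  # square: one set of entities indexes both ways
print(shape(attendance))  # rectangular: actors by events
```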

---
#  Proximity Matrices

+ Proximity matrices record the “degree of proximity” between pairs of actors.
+ Proximities are usually defined among a single set of actors (hence, they are 1-mode), but the values are not limited to 1s and 0s.
+ What constitutes the *proximity* is user-defined.
	+ Driving distance is one form of proximity; others include the number of friends in common, time spent together, number of emails exchanged, or a measure of similarity in cognitive structures.
	
---

#  Proximity Matrices

+ Proximity matrices can contain either *similarity* or *distance* (or *dissimilarity*) data.
	+ With similarity data, such as number of friends in common or correlations, a larger number means more similarity, i.e. greater proximity
	+ With distance (or dissimilarity) data, such as physical distance, a larger number means more dissimilarity, i.e. less proximity
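
The direction of the scale matters in practice: the "closest" actor is the minimum under distance data but the maximum under similarity data. A minimal Python sketch with made-up values (the `closest` helper is hypothetical):

```python
# Made-up 1-mode proximity matrices for three actors
names = ["A", "B", "C"]
distance_km = [        # larger value = less proximity
    [0, 5, 20],
    [5, 0, 12],
    [20, 12, 0],
]
friends_in_common = [  # larger value = more proximity
    [0, 1, 4],
    [1, 0, 2],
    [4, 2, 0],
]

def closest(matrix, i, larger_is_closer):
    """Index of the actor most proximate to actor i, skipping i itself."""
    others = [j for j in range(len(matrix)) if j != i]
    pick = max if larger_is_closer else min
    return pick(others, key=lambda j: matrix[i][j])

nearest_by_distance = names[closest(distance_km, 0, larger_is_closer=False)]
nearest_by_similarity = names[closest(friends_in_common, 0, larger_is_closer=True)]
print(nearest_by_distance, nearest_by_similarity)
```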

---
#  Transposes

+ The transpose `$M^T$` of a matrix `$M$` is the matrix flipped over its main diagonal.
	+ The rows become columns and the columns become rows: entry `$(i, j)$` of `$M$` becomes entry `$(j, i)$` of `$M^T$`
	+ So the transpose of an m by n matrix is an n by m matrix.
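
A short sketch of transposition in plain Python, using a made-up 2-by-3 matrix; the `transpose` function is an illustrative helper.

```python
# A 2-by-3 matrix with made-up values
M = [
    [1, 2, 3],
    [4, 5, 6],
]

def transpose(matrix):
    """Swap rows and columns: entry (i, j) of the input becomes entry (j, i)."""
    return [list(column) for column in zip(*matrix)]

M_t = transpose(M)  # 3-by-2
print(M_t)
```

Transposing twice returns the original matrix.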

---
#  Transpose Example

---
#  Dichotomizing

+ `$X$` is a valued matrix, say a 1-to-10 rating of tie strength
+ Construct a matrix `$Y$` of ones and zeros such that
   `$y_{ij} = 1$` if `$x_{ij} > 5$`, and `$y_{ij} = 0$` otherwise

<table class="table" style="font-size: 18px; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:left;"> EVE </th>
   <th style="text-align:left;"> LAU </th>
   <th style="text-align:left;"> THE </th>
   <th style="text-align:left;"> BRE </th>
   <th style="text-align:left;"> CHA </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> EVELYN </td>
   <td style="text-align:left;"> 8 </td>
   <td style="text-align:left;"> 6 </td>
   <td style="text-align:left;"> 7 </td>
   <td style="text-align:left;"> 6 </td>
   <td style="text-align:left;"> 3 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> LAURA </td>
   <td style="text-align:left;"> 6 </td>
   <td style="text-align:left;"> 7 </td>
   <td style="text-align:left;"> 6 </td>
   <td style="text-align:left;"> 6 </td>
   <td style="text-align:left;"> 3 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> THERESA </td>
   <td style="text-align:left;"> 7 </td>
   <td style="text-align:left;"> 6 </td>
   <td style="text-align:left;"> 8 </td>
   <td style="text-align:left;"> 6 </td>
   <td style="text-align:left;"> 4 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> BRENDA </td>
   <td style="text-align:left;"> 6 </td>
   <td style="text-align:left;"> 6 </td>
   <td style="text-align:left;"> 6 </td>
   <td style="text-align:left;"> 7 </td>
   <td style="text-align:left;"> 4 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> CHARLOTTE </td>
   <td style="text-align:left;"> 3 </td>
   <td style="text-align:left;"> 3 </td>
   <td style="text-align:left;"> 4 </td>
   <td style="text-align:left;"> 4 </td>
   <td style="text-align:left;"> 4 </td>
  </tr>
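</tbody>
</table>

<table class="table" style="font-size: 18px; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:left;"> EVE </th>
   <th style="text-align:left;"> LAU </th>
   <th style="text-align:left;"> THE </th>
   <th style="text-align:left;"> BRE </th>
   <th style="text-align:left;"> CHA </th>
  </tr>
 </thead>
<tbody>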
  <tr>
   <td style="text-align:left;"> EVELYN </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 0 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> LAURA </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 0 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> THERESA </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 0 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> BRENDA </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> 0 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> CHARLOTTE </td>
   <td style="text-align:left;"> 0 </td>
   <td style="text-align:left;"> 0 </td>
   <td style="text-align:left;"> 0 </td>
   <td style="text-align:left;"> 0 </td>
   <td style="text-align:left;"> 0 </td>
  </tr>
</tbody>
</table>
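
The dichotomization above can be sketched in a few lines of plain Python. The matrix values are copied from the table; the `dichotomize` helper and its `cutoff` parameter are illustrative names.

```python
# Valued matrix X from the table above (rows/columns: EVE, LAU, THE, BRE, CHA)
X = [
    [8, 6, 7, 6, 3],
    [6, 7, 6, 6, 3],
    [7, 6, 8, 6, 4],
    [6, 6, 6, 7, 4],
    [3, 3, 4, 4, 4],
]

def dichotomize(matrix, cutoff):
    """Return a 0/1 matrix with y_ij = 1 where x_ij > cutoff, else 0."""
    return [[1 if value > cutoff else 0 for value in row] for row in matrix]

Y = dichotomize(X, cutoff=5)
for row in Y:
    print(row)
```

The choice of cutoff is a substantive decision: a different threshold can change which ties survive.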

---
#  Symmetrizing

+ When a matrix is not symmetric, i.e., `$x_{ij} \neq x_{ji}$` for some pair `$i, j$`
+ Symmetrize in various ways. Set `$y_{ij}$` and `$y_{ji}$` to:
	+ Maximum, `$\max(x_{ij}, x_{ji})$`: union rule
	+ Minimum, `$\min(x_{ij}, x_{ji})$`: intersection rule
	+ Average: `$(x_{ij} + x_{ji})/2$`
	+ Lower half: choose `$x_{ij}$` when `$i > j$` and `$x_{ji}$` otherwise
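
The four rules can be compared side by side. The following is a minimal Python sketch on a made-up asymmetric matrix; the `symmetrize` function and its rule names are hypothetical, not taken from any package.

```python
# Made-up asymmetric valued matrix
X = [
    [0, 3, 4],
    [5, 0, 0],
    [1, 2, 0],
]

def symmetrize(matrix, rule):
    """Combine x_ij and x_ji into one value used for both y_ij and y_ji."""
    n = len(matrix)
    Y = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            a, b = matrix[i][j], matrix[j][i]
            if rule == "max":        # union rule
                Y[i][j] = max(a, b)
            elif rule == "min":      # intersection rule
                Y[i][j] = min(a, b)
            elif rule == "average":
                Y[i][j] = (a + b) / 2
            elif rule == "lower":    # copy the lower triangle onto the upper
                Y[i][j] = a if i > j else b
            else:
                raise ValueError(f"unknown rule: {rule}")
    return Y

for rule in ("max", "min", "average", "lower"):
    print(rule, symmetrize(X, rule))
```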

---
#  Symmetrizing Example

What rule are we using here?

<img src="assets/img/figure38b.png" width="100%" style="display: block; margin: auto;" />
---
#  Matrix Multiplication

- Matrix products are not generally commutative (i.e., `$AB$` does not usually equal `$BA$`)
- Notation: `$C = AB$`
- Only possible when the number of columns in `$A$` equals the number of rows in `$B$`; such matrices are said to be conformable. Each cell is calculated as:

`$$c_{ij} = \sum_{k} a_{ik} b_{kj}$$`
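
The cell-by-cell rule translates directly into code. Below is a minimal plain-Python sketch with made-up matrices; `matmul` is an illustrative helper, and in practice a numerical library would normally be used.

```python
def matmul(A, B):
    """Return C = AB, where c_ij is the sum over k of a_ik * b_kj."""
    if len(A[0]) != len(B):
        raise ValueError("not conformable: columns of A must equal rows of B")
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

# Made-up matrices: A is 2-by-3 and B is 3-by-2
A = [[1, 2, 0],
     [0, 1, 3]]
B = [[2, 1],
     [0, 1],
     [1, 0]]

AB = matmul(A, B)  # 2-by-2
BA = matmul(B, A)  # 3-by-3, so AB and BA differ even in shape
print(AB)
print(BA)
```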

---
# Matrix multiplication example i

---
# Matrix multiplication example ii

---
#  Products of matrices & their transposes

`$XX^T$` = product of matrix `$X$` by its transpose `$X^T$`

- Computes sums of products of each pair of rows (cross-products)
- Gives similarities among rows
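
As a hedged illustration, the row cross-products can be computed without forming the transpose explicitly. The actor-by-event matrix below is invented, and `row_cross_products` is a hypothetical helper name.

```python
# Made-up actor-by-event matrix X: 1 means the actor attended the event
X = [
    [1, 1, 0],  # Ann
    [1, 0, 1],  # Bob
    [1, 1, 1],  # Cat
]

def row_cross_products(matrix):
    """Return X times its transpose: entry (i, j) sums products of rows i and j."""
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) for row_j in matrix]
        for row_i in matrix
    ]

S = row_cross_products(X)
for row in S:
    print(row)
```

For a 0/1 matrix like this one, each off-diagonal entry counts the events two actors share, and each diagonal entry counts the events that actor attended.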

---

<img src="assets/img/figure39.png" width="100%" style="display: block; margin: auto;" />
---
#  squaring an adjacency matrix