ERGM Mathematical Derivation

Computational Social Science · Network Analysis
What Is This?

A step-by-step mathematical derivation of the Exponential Random Graph Model (ERGM). We start from a simple question — "what is the probability of observing this network?" — and arrive at a logistic regression form.

Prerequisites

Familiarity with basic network concepts (nodes, edges, adjacency matrix). Basic understanding of probability and logarithms. No prior ERGM knowledge needed.


Starting Point

What Do We Want?

We want to answer a fundamental question: what is the probability of observing a particular network \(y\)? That is, among all the networks that could exist with these nodes, why did we see this specific pattern of ties?

The Question $$P(Y = y) = \;?$$
Goal: Build a model that assigns higher probability to networks whose structural patterns (reciprocity, triangles, etc.) match what we observe in real social systems.
Step 1

The Simplest Idea

If every possible tie were an independent fair coin flip, every possible network would be equally likely:

Uniform Distribution $$P(Y = y) = C \quad \text{(some constant)}$$
Problem: This is too simple — it tells us nothing about network structure. Real networks are not random: friendships cluster, advice flows to experts, and ties tend to be reciprocated. We need a model that captures these patterns.
Step 2

Network Structure Matters

We want to say: certain patterns (reciprocity, triangles) make a network more likely to appear. So we define a "score" for any network:

Network Score $$\text{score}(y) = \theta_1 \times g_1(y) + \theta_2 \times g_2(y)$$
  • \(g_1(y)\) = number of reciprocal pairs in the network
  • \(g_2(y)\) = number of triangles in the network
  • \(\theta_1, \theta_2\) = weights (parameters) for each pattern
Intuition: Higher score → this network is more likely to be observed. A positive \(\theta_1\) means reciprocity makes a network more probable; a negative one means reciprocity is suppressed.
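As a concrete check on this definition, here is a small stdlib-only Python sketch that counts the two statistics and scores a toy directed network. The triangle convention (a triple counts as closed when every pair is tied in at least one direction) is one common choice and an assumption here, as are the example \(\theta\) values:

```python
from itertools import combinations

def network_score(edges, nodes, theta1, theta2):
    """score(y) = theta1 * g1(y) + theta2 * g2(y).

    g1: number of reciprocated dyads (ties in both directions).
    g2: number of triangles; a triple counts as closed when every
        pair is tied in at least one direction (one common convention).
    """
    edges = set(edges)
    g1 = sum(1 for (i, j) in edges if i < j and (j, i) in edges)
    pairs = {frozenset(e) for e in edges}  # undirected view for triangles
    g2 = sum(
        1 for a, b, c in combinations(nodes, 3)
        if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= pairs
    )
    return theta1 * g1 + theta2 * g2

# Toy network: A <-> B reciprocated, plus B -> C and C -> A closing a triangle.
edges = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "A")}
score = network_score(edges, ["A", "B", "C"], theta1=1.0, theta2=2.0)  # 1*1 + 2*1
```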
Step 3

Turn "Score" into "Probability"

Probabilities must be positive, but our score can be any real number. The exponential function solves this — it maps any number to a positive value:

Exponentiate $$P(Y = y) \;\propto\; \exp\!\Big(\theta_1 \cdot g_1(y) + \theta_2 \cdot g_2(y)\Big)$$
Note: The symbol \(\propto\) means "proportional to" — this is not a true probability yet because all probabilities must sum to 1. We fix that in the next step.
Step 4

Make Probabilities Sum to 1

We divide by a normalizing constant \(\kappa\) — the sum of the exponentiated scores over every possible network:

Normalizing Constant $$\kappa = \sum_{y' \in \mathcal{Y}} \exp\!\Big(\theta_1 \cdot g_1(y') + \theta_2 \cdot g_2(y')\Big)$$
The Full ERGM Formula $$\boxed{\;P(Y = y) = \frac{1}{\kappa}\,\exp\!\Big(\theta_1 \cdot g_1(y) + \theta_2 \cdot g_2(y)\Big)\;}$$
The catch: For a network with \(n\) nodes, there are \(2^{n(n-1)}\) possible directed networks. Even with just 20 nodes, that's over \(10^{114}\) networks. Computing \(\kappa\) by brute force is impossible. But — as we'll see — it cancels out!
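For very small networks, though, the sum over \(\mathcal{Y}\) is still tractable, which makes the normalization easy to verify directly. A stdlib-only Python sketch for \(n = 3\) (so \(2^6 = 64\) directed networks; the \(\theta\) values are illustrative assumptions):

```python
from itertools import product
from math import exp, isclose

nodes = (0, 1, 2)
dyads = [(i, j) for i in nodes for j in nodes if i != j]  # 6 directed dyads

def stats(edges):
    """(g1, g2): reciprocated dyads and triangles (ties in any direction)."""
    g1 = sum(1 for (i, j) in edges if i < j and (j, i) in edges)
    g2 = 1 if len({frozenset(e) for e in edges}) == 3 else 0  # only one triple exists
    return g1, g2

theta1, theta2 = 0.8, 0.5  # illustrative parameter values
weights = []
for bits in product((0, 1), repeat=len(dyads)):  # enumerate all 2^6 = 64 networks
    edges = {d for d, b in zip(dyads, bits) if b}
    g1, g2 = stats(edges)
    weights.append(exp(theta1 * g1 + theta2 * g2))

kappa = sum(weights)                  # the normalizing constant
probs = [w / kappa for w in weights]  # now a proper probability distribution
assert isclose(sum(probs), 1.0)
```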
Step 5

Zoom In on a Single Tie

Instead of asking about the entire network, ask a more practical question: what is the probability that the tie Alice → Bob exists?

Fix a dyad \((i, j)\), here \(i\) = Alice and \(j\) = Bob, and define two network states:

  • \(y^+_{ij}\) = the network with the tie Alice → Bob
  • \(y^-_{ij}\) = the network without the tie Alice → Bob

The odds of this tie existing (holding all other ties fixed):

Odds of a Single Tie $$\frac{P(Y = y^+_{ij})}{P(Y = y^-_{ij})} = \frac{\;\dfrac{1}{\kappa}\exp\!\Big(\theta_1 g_1(y^+) + \theta_2 g_2(y^+)\Big)\;}{\;\dfrac{1}{\kappa}\exp\!\Big(\theta_1 g_1(y^-) + \theta_2 g_2(y^-)\Big)\;}$$
Step 6

\(\kappa\) Cancels Out

\(\kappa\) appears in both the numerator and denominator, so it disappears:

κ Cancels $$\text{odds} = \frac{\exp\!\Big(\theta_1 g_1(y^+) + \theta_2 g_2(y^+)\Big)}{\exp\!\Big(\theta_1 g_1(y^-) + \theta_2 g_2(y^-)\Big)}$$

Using the rule \(\exp(A)/\exp(B) = \exp(A - B)\):

Simplified $$\text{odds} = \exp\!\Big(\theta_1\big[g_1(y^+) - g_1(y^-)\big] + \theta_2\big[g_2(y^+) - g_2(y^-)\big]\Big)$$
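A quick numeric sanity check in stdlib Python, using reciprocity only and an assumed \(\theta_1 = 0.8\): the ratio of exponentiated scores and the simplified \(\exp(\theta_1\,\Delta)\) form agree.

```python
from math import exp, isclose

theta1 = 0.8                      # assumed reciprocity parameter
rest = {("B", "A"), ("B", "C")}   # all other ties, held fixed
tie = ("A", "B")                  # the tie in question

def g1(edges):
    """Number of reciprocated dyads."""
    return sum(1 for (i, j) in edges if i < j and (j, i) in edges)

y_plus, y_minus = rest | {tie}, rest

# Ratio of exponentiated scores (kappa has already cancelled):
odds_ratio = exp(theta1 * g1(y_plus)) / exp(theta1 * g1(y_minus))
# Simplified form exp(theta1 * [g1(y+) - g1(y-)]):
odds_simplified = exp(theta1 * (g1(y_plus) - g1(y_minus)))
assert isclose(odds_ratio, odds_simplified)
```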
Step 7

Define the Change Statistics \(\Delta g\)

The differences in the brackets have a special name — change statistics:

Change Statistic $$\Delta g_k = g_k(y^+_{ij}) - g_k(y^-_{ij})$$
In words: \(\Delta g_1\) answers "how much does the reciprocity count change when we add this one tie?" For example, if Bob already has a tie to Alice, then adding Alice → Bob creates one new reciprocal pair, so \(\Delta g_1 = 1\).

Substituting \(\Delta g\) into our formula:

Odds with Change Statistics $$\text{odds} = \exp\!\Big(\theta_1 \,\Delta g_1 + \theta_2 \,\Delta g_2\Big)$$
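Change statistics can be computed exactly as defined: toggle the one tie and difference the counts. A stdlib Python sketch of the Alice/Bob example (triangles again use the any-direction convention, an assumption here):

```python
from itertools import combinations

def g1(edges):
    """Reciprocated dyads."""
    return sum(1 for (i, j) in edges if i < j and (j, i) in edges)

def g2(edges, nodes):
    """Triangles: triples where every pair is tied in some direction."""
    pairs = {frozenset(e) for e in edges}
    return sum(
        1 for triple in combinations(nodes, 3)
        if all(frozenset(p) in pairs for p in combinations(triple, 2))
    )

nodes = ["Alice", "Bob", "Carol"]
rest = {("Bob", "Alice"), ("Bob", "Carol"), ("Carol", "Alice")}
tie = ("Alice", "Bob")

y_plus, y_minus = rest | {tie}, rest
dg1 = g1(y_plus) - g1(y_minus)                # Bob -> Alice exists, so adding
                                              # Alice -> Bob reciprocates it: 1
dg2 = g2(y_plus, nodes) - g2(y_minus, nodes)  # every pair was already tied,
                                              # so no new triangle: 0
```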
Step 8

Take the Log of Both Sides

Taking the natural logarithm:

Logistic Regression Form $$\boxed{\;\log(\text{odds}) = \theta_1 \times \Delta g_1 + \theta_2 \times \Delta g_2\;}$$

This is exactly the form of logistic regression. The log-odds of a tie existing is a linear combination of the change statistics, weighted by the model parameters.

Interpretation: Each \(\theta_k\) is a log-odds ratio, just like in logistic regression. For instance, if \(\theta_1 = 0.8\), then a tie that creates one new reciprocal pair has \(e^{0.8} \approx 2.2\times\) higher odds of existing.
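Solving the log-odds form for the tie probability gives the logistic (sigmoid) function, exactly as in logistic regression. A stdlib Python sketch using the \(\theta_1 = 0.8\) example above:

```python
from math import exp

def tie_probability(theta, delta_g):
    """P(tie exists) = sigmoid of the linear predictor sum_k theta_k * delta_g_k."""
    log_odds = sum(t * d for t, d in zip(theta, delta_g))
    return 1.0 / (1.0 + exp(-log_odds))

# A tie that creates one new reciprocal pair (and nothing else), theta1 = 0.8:
p = tie_probability([0.8, 0.0], [1, 0])
odds = p / (1 - p)  # recovers e^{0.8}, about 2.2
```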

Summary

Full Derivation at a Glance

Probability of Entire Network
$$P(Y=y) = \frac{1}{\kappa}\exp\!\Big(\textstyle\sum_k \theta_k\, g_k(y)\Big)$$
divide two states → \(\kappa\) cancels
Odds of a Single Tie
$$\text{odds} = \exp\!\Big(\textstyle\sum_k \theta_k\, \Delta g_k\Big)$$
take log of both sides
Logistic Regression Form
$$\log(\text{odds}) = \textstyle\sum_k \theta_k\, \Delta g_k$$
The key insight: \(\kappa\) is computationally intractable, but it cancels in any ratio of network probabilities. This is why we can estimate \(\theta\) using MCMC (Markov Chain Monte Carlo): the sampler only ever needs such ratios, so \(\kappa\) is never computed directly.
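To make the MCMC point concrete, here is a minimal Metropolis sampler sketch in stdlib Python. It draws a network from an ERGM by toggling one tie at a time, and the acceptance ratio \(P(y')/P(y) = \exp(\theta \cdot \Delta g)\) never involves \(\kappa\). The statistics (edge count and reciprocity) and \(\theta\) values are illustrative assumptions, not a full estimation routine:

```python
import random
from math import exp

def n_edges(edges):
    return len(edges)

def n_reciprocal(edges):
    return sum(1 for (i, j) in edges if i < j and (j, i) in edges)

def metropolis_sample(nodes, theta, stat_fns, steps, seed=0):
    """Sample one network from an ERGM by single-tie toggles.

    Acceptance uses P(proposal)/P(current) = exp(theta . delta_g),
    so the intractable kappa is never computed.
    """
    rng = random.Random(seed)
    dyads = [(i, j) for i in nodes for j in nodes if i != j]
    edges = set()
    for _ in range(steps):
        proposal = edges ^ {rng.choice(dyads)}  # toggle one directed tie
        delta = sum(t * (f(proposal) - f(edges))
                    for t, f in zip(theta, stat_fns))
        if delta >= 0 or rng.random() < exp(delta):
            edges = proposal
    return edges

# Sparse networks (negative edge parameter) with a reciprocity bonus:
net = metropolis_sample(range(5), theta=[-1.0, 2.0],
                        stat_fns=[n_edges, n_reciprocal], steps=2000)
```

In MCMC maximum likelihood estimation, samples like this one are compared against the observed network's statistics to update \(\theta\), again using only ratios in which \(\kappa\) cancels.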