The exponential random graph model is a system for describing probability distributions on graphs, used to model social networks. One fixes a set of vertices, and determines a collection of "features" among the edges of this fixed set (such as whether or not a particular edge or combination of a small number of edges), each with an associated real-valued weights. Then to determine the probability of seeing a particular graph, one simply looks at which features it has; the probability is exp(sum of feature weights), divided by a normalizing constant (the "partition function").

This is a good model for several reasons: it is powerful enough that (with complicated enough features) it can describe any distribution. With simple enough features (e.g. just individual edges) it degenerates to the standard Erdős–Rényi–Gilbert model of random graphs. It's easy to express features that model sociological theories of network formation, such as assortative mixing (more similar people are more likely to be friends) or triadic closure (friends-of-friends are more likely to be friends). And by fitting the weights to actual social networks, one can learn something about the strengths of these effects.

But on the other hand, there are some theoretical and practical obstacles to its use. It seems to be difficult to set up features and weights such that, when one generates graphs using the distribution they describe, the results actually look like social networks. If we go even a little bit beyond the Erdős–Rényi–Gilbert model we don't have closed form solutions to anything and have to use MCMC simulations to compute the partition function, fit weights, or generate graphs, and we don't know much about how quickly or slowly these simulations converge.

And now, with my latest preprint "ERGMs are Hard" (arXiv:1412.1787, with Michael Bannister and Will Devanny), the picture gets darker. We prove complexity-theoretic hardness results showing that with completely realistic features (in fact the ones for assortative mixing and triadic closure, but with unrealistic weights) we can't compute the partition function, we can't get anywhere close to approximating the partition function, we can't generate graphs with the right probabilities, and we can't even get anywhere close to the right probability distribution. And trying to escape the hardness by tweaking the features to something a little more complicated doesn't help: the same hardness results continue to be true when the features include induced subgraphs of any fixed type with more than one edge.

The short explanation for why is that lurking inside these models are computationally hard combinatorial problems such as (the one we mainly use) finding or counting the largest induced triangle-free subgraphs. It was known that the maximization version of this problem was hard, but the reduction wasn't parsimonious (less technically, this means that the reduction can't be used to prove that the counting version of the problem is hard). So for that part we had to find our own reduction from another hard counting problem, counting perfect matchings in cubic bipartite graphs. Here it is in a picture.

Parsimonious reduction from perfect matchings in cubic bipartite graphs to maximum induced triangle-free subgraphs

Each vertex of the original graph turns into four triangles after the reduction. In this example, counting matchings in a cube turns into counting maximum triangle-free subgraphs of a snub cube. These subgraphs are formed by deleting all the snub cube edges that correspond to unmatched cube edges, and then deleting one more edge inside each four-triangle gadget. When I posted a drawing of a stereographic projection of a snub cube a bit over a year ago, this is what it was for. Since that time, we've also been using the image of this reduction in the logo of our local research center.


None: Andy D

"And now, with my latest preprint ... the picture gets darker."

Love it!

None: More darkness

Hi David,

Very nice results. It's very likely you know the references below, but I thought I'd raise them as they're squarely on the topic of the darkness of ERGMs.

Regarding convergence rate of sampling schemes, the papers by Bhamidi-Bresler-Sly ( and Chatterjee-Diaconis ( shed quite a bit of light on those questions, though perhaps you're using a strong sense of "much".

Also, your "any subgraph with more than one edge" result reminded me of a talk I saw by Cosma Shalizi on consistency of ERGMs, which I believe hinged on a similar distinction:

Johan Ugander