# Efficiency of Rado graph representations

The Rado graph has interesting symmetry properties and plays an important role in the logic of graphs. But it's an infinite graph, so how can we say anything about the complexity of algorithms on it?

There are algorithmic problems that involve this graph and are independent of any representation of it, such as checking whether a first-order logic sentence is true of it (PSPACE-complete). But I'm interested here in problems involving the Rado graph where different ways of constructing and representing the graph lead to different algorithmic behavior. For instance: the Rado graph contains all finite graphs as induced subgraphs. How hard is it to find a given finite graph? Answer: it depends on how the Rado graph is represented.

Historically the first way of constructing this graph involved the binary representations of the natural numbers: each vertex corresponds to a number, and vertices \( x \) and \( y \) are adjacent when the smaller of the two numbers is the index of a nonzero bit in the binary representation of the larger of the two numbers. For this representation it's very easy to find a copy of any given graph \( G \) as an induced subgraph: just find copies of the vertices of \( G \) one at a time. Each of the vertices you've already found, and its adjacency or nonadjacency to the next vertex, tells you one bit of the binary representation of the next number, and you just have to pad those bits with zeros in the remaining places to find the next vertex. The trouble is, these numbers grow very quickly, roughly as a tower of powers of two whose number of levels is the number of vertices in the graph. For instance, if you want to find a 5-vertex complete subgraph of the Rado graph, you can do it with the numbers \( 0, 1, 3, 11, 2059, \) but (according to OEIS A034797) the smallest number you can use to extend this to a 6-vertex clique already has \( 620 \) decimal digits. And the one after that has more like \( 2^{2059} \) bits in its binary representation, too many to write down even in the biggest computers. So the algorithm for finding a given graph is easy to describe but not very efficient.
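This greedy search is short enough to sketch in code. The function names are my own, and the fallback padding bit is one way of guaranteeing that each new number exceeds the earlier ones so that adjacencies are read from the new number's bits; it's a sketch of the idea, not a canonical implementation:

```python
def rado_adjacent(x, y):
    """Vertices x != y of the binary-number Rado graph are adjacent when
    the smaller number indexes a nonzero bit of the larger number."""
    a, b = sorted((x, y))
    return (b >> a) & 1 == 1

def embed(adj):
    """Greedily embed the graph given by 0/1 adjacency matrix adj as an
    induced subgraph, returning the list of chosen numbers."""
    placed = []
    for i in range(len(adj)):
        # One bit per earlier vertex adjacent to vertex i, zeros elsewhere.
        v = sum(1 << placed[j] for j in range(i) if adj[i][j])
        # If that number fails to exceed the earlier vertices, pad with one
        # extra high bit so adjacency is still read from v's own bits.
        if placed and v <= max(placed):
            v += 1 << (max(placed) + 1)
        placed.append(v)
    return placed
```

For the complete graph \( K_5 \) this produces exactly the numbers \( 0, 1, 3, 11, 2059 \) mentioned above, and extending to \( K_6 \) already yields a number with hundreds of decimal digits.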

An alternative construction just chooses randomly, for each pair of vertices, whether they form the endpoints of an edge. With infinitely many vertices, the result of these random choices is almost certainly the Rado graph. That's not a representation that can be used in a computer, but we could imagine an algorithm that had access to it as some sort of oracle. With this representation, an \( n \)-vertex graph should occur much more quickly: if \( G \) is such a graph, then the expected number of copies of \( G \) among the first \( N \) vertices of the Rado graph starts getting large when \( N \) is roughly \( 2^{n/2} \). And that's the best you could hope for in any representation, because with fewer vertices there aren't enough \( n \)-tuples of vertices to cover all the different induced subgraphs that could exist. But finding a copy of \( G \) among these vertices would be difficult. Even for finding a clique, we don't know anything much better than trying all \( n \)-tuples of vertices and seeing which ones work. (Finding a clique of size approximately \( 2 \log_2 N \) in an \( N \)-vertex random graph in polynomial time is a well-known open problem, even though we know that a clique of that size should usually exist.) Yet another construction of the Rado graph, based on the idea of Paley graphs, probably behaves similarly to the random construction but is difficult to prove much about.
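The \( 2^{n/2} \) threshold comes from a standard first-moment calculation (my sketch, not part of the original argument): each \( n \)-subset of the first \( N \) vertices has \( n! \) candidate labelings, labelings related by an automorphism of \( G \) give the same copy, and each labeling induces \( G \) with probability \( 2^{-\binom{n}{2}} \), so

\[
\mathbb{E}[\text{number of copies of } G] \;=\; \binom{N}{n}\,\frac{n!}{|\operatorname{Aut}(G)|}\,2^{-\binom{n}{2}} \;\approx\; \frac{N^n}{|\operatorname{Aut}(G)|}\,2^{-n(n-1)/2},
\]

which first becomes large when \( N \) is around \( 2^{(n-1)/2} \), i.e. roughly \( 2^{n/2} \).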

Here's a construction of an infinite graph in which induced subgraphs of any type are easy to find: instead of using binary numbers, use binary strings, of all possible lengths including zero. For any two strings \( s \) and \( t \), connect them by an edge if \( s \) is shorter than \( t \) and the bit of \( t \) at the position given by the length of \( s \) is nonzero (or vice versa). Then you can build an \( n \)-vertex graph one vertex at a time, by using one bitstring of each length less than \( n \), with the bits in each string given by the adjacencies to the earlier vertices with no padding. The copy of an \( n \)-vertex graph \( G \) will be somewhere in the first \( 2^n \) vertices (not \( 2^{n/2} \)), and the names of these vertices can be calculated and written down by an algorithm in time \( O(n^2) \) (matching the description complexity of \( G \) in terms of its adjacency matrix). But this is not the Rado graph. For instance, for the two binary strings “\( 0 \)” and “\( 1 \)”, there is only one vertex in the graph adjacent to one and not the other (the empty string) whereas the Rado graph has infinitely many such vertices. One could construct a copy of the Rado graph by interspersing this construction with a very small number of random vertices, small enough that they don't affect the complexity of this subgraph-finding algorithm, but that seems a bit of a cheat.
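The string construction can be sketched as follows (again with function names of my own choosing); here vertex \( i \) of the embedded graph becomes the length-\( i \) string recording its adjacencies to the earlier vertices:

```python
def string_adjacent(s, t):
    """Binary strings s and t are adjacent when one is strictly shorter
    and the bit of the longer string at position len(shorter) is nonzero."""
    if len(s) == len(t):
        return False  # equal lengths (including s == t): never adjacent
    a, b = (s, t) if len(s) < len(t) else (t, s)
    return b[len(a)] == '1'

def embed_strings(adj):
    """Embed the graph with 0/1 adjacency matrix adj: vertex i becomes
    the length-i string whose bit j records adjacency to vertex j."""
    return [''.join('1' if adj[i][j] else '0' for j in range(i))
            for i in range(len(adj))]
```

For example, the three-vertex path with edges \( 01 \) and \( 12 \) embeds as the strings “”, “\( 1 \)”, “\( 01 \)”, and the total length of all the strings written down is \( O(n^2) \), matching the adjacency-matrix description complexity noted above.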

One way that it's a cheat is that it doesn't use the full power of the Rado graph. The actual defining property of the Rado graph is that if you start building a given induced subgraph, vertex by vertex, you can never make a mistake: it's always possible to add one more vertex. Or, more abstractly, if you have any two sets \( A \) and \( B \) of vertices in the Rado graph, there's always another vertex \( v \) that's adjacent to everything in \( A \) and nothing in \( B \). By choosing \( A \) to be the set of already-placed vertices that are adjacent to the next vertex, and \( B \) to be the set of already-placed vertices that are not adjacent to the next vertex, you can use this property to find each successive vertex in an arbitrary induced subgraph. The graph of binary strings described above does not have this property, because when \( A=\{ \)“\( 0 \)”\( \} \) and \( B=\{ \)“\( 1 \)”\( \} \) there's no vertex \( v \) that matches.
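In the binary-number representation, the extension property has an explicit witness, though (as the tower-of-powers growth above shows) not a small one. A sketch, assuming disjoint finite sets of numbers and a helper function name of my own:

```python
def rado_adjacent(x, y):
    """Adjacency in the binary-number Rado graph: the smaller number
    indexes a nonzero bit of the larger number."""
    a, b = sorted((x, y))
    return (b >> a) & 1 == 1

def extension_witness(A, B):
    """Return a number adjacent to every member of A and to no member
    of B, for disjoint finite sets A and B of natural numbers."""
    top = max(A | B, default=-1)
    # Bits indexed by A give the required adjacencies; one extra high
    # bit makes the witness larger than everything in A and B, so all
    # adjacencies are read from the witness's own binary digits.
    return (1 << (top + 1)) + sum(1 << a for a in A)
```

The witness has roughly \( \max(A \cup B) \) bits, which is what makes repeated extension blow up so quickly.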

Is it possible to construct the Rado graph in such a way that the extension property becomes as easy as the subgraph property was for the graph of binary strings? The short answer is that I don't know. One attempt at an answer would be to build it in levels, much like the binary string graph can be divided into levels by the length of its strings. In the \( k \)th level, we include a collection of vertices that extends all subsets of \( k \) vertices from the previous level. But what is this collection? If there are \( N \) vertices in the previous level, then the vertices of the \( k \)th level can be described by \( N \)-bit bitstrings specifying their adjacencies. We want to choose as small a set of bitstrings as possible with the property that all \( k \)-tuples of previous vertices can be extended; a more geometric way to describe this is that we want to find a small set \( D \) of points in the \( N \)-dimensional hypercube that hits every \( (N-k) \)-dimensional subcube. Exactly this problem was one of the ones I studied in my recent paper "grid minors in damaged grids". But the proof in that paper that \( D \) can be relatively small (Theorem 14) uses the probabilistic method, meaning essentially that it chooses a random set of the right number of hypercube points. So as a way of constructing Rado graphs in which the extension property is efficient, it is not an improvement over the method of choosing edges randomly. But maybe this nonconstructive proof that a good set exists can lead the way to a deterministic and efficient construction?
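For small cases, the subcube-hitting condition can at least be checked by brute force. A sketch, with a function name of my own; a point hits an \( (N-k) \)-dimensional subcube exactly when it matches the \( k \) fixed coordinates that define that subcube:

```python
import itertools

def hits_all_subcubes(D, N, k):
    """Test whether the point set D (tuples of 0/1 of length N) hits
    every (N-k)-dimensional subcube of the N-dimensional hypercube,
    i.e. whether every assignment of bits to every k-subset of the
    coordinates is matched by some point of D."""
    for coords in itertools.combinations(range(N), k):
        for pattern in itertools.product((0, 1), repeat=k):
            if not any(all(p[c] == b for c, b in zip(coords, pattern))
                       for p in D):
                return False
    return True
```

One could feed random candidate sets \( D \) of decreasing size into this check, which is essentially the probabilistic-method construction in miniature; the open question is whether some structured, deterministically describable \( D \) of comparable size passes it.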

### Comments:

**None**:

**2014-09-10T13:32:27Z**

I am not quite sure whether this is relevant (and I do not have the time to make up my mind on this right now), but are you aware of this deterministic construction of graphs satisfying extension axioms, which is presumably less well known than the Paley graph construction?

**11011110**:

**2014-09-10T16:10:45Z**

I hadn't seen that, but it appears to be on finite graphs (satisfying all sufficiently small extension axioms) rather than infinite graphs satisfying all of the extension axioms.