When he visited UCI last May, Noga Alon gave two talks: a technical seminar on universal graphs and a more general talk on some of his work in the theory of machine learning. The seminar talk was based in part on his paper “Asymptotically optimal induced universal graphs” (Geom. Funct. Anal. 2017) in which he proves that the smallest graphs containing all \(k\)-vertex graphs as induced subgraphs have size

\[2^{(k-1)/2}\bigl(1+o(1)\bigr),\]

tight except possibly for the \(o(1)\) term. At the talk I asked him whether the same bound also worked for graphs that contain all \(k\)-vertex graphs except for the \(k\)-clique, but do not contain the \(k\)-clique. He came back a day later with the answer: Yes.

The reason for my question was that I needed these all-but-clique-universal graphs for a paper, now on arXiv: “Quasipolynomiality of the smallest missing induced subgraph”, with Andrea Lincoln and Virginia Vassilevska Williams, arXiv:2306.11185, to appear in J. Graph Alg. Appl. This is a followup to an earlier blog post on quasipolynomiality in which I observed that the smallest graph that is not an induced subgraph of some given graph could be found in quasipolynomial time. It necessarily has at most logarithmic size, so you can just do a brute force search of all subgraphs of that size to see which ones are missing. The new paper proves that, under the exponential time hypothesis, this sort of quasipolynomial time bound is necessary: you cannot solve the problem faster than the input size to a logarithmic power. The proof is an easy reduction from clique-finding in which one adds an all-but-clique-universal graph to the input to force the smallest missing graph to be a clique.

Another way of describing our paper is that it finds tight time bounds for determining the largest subgraph size for which a given graph is induced-universal. This adds to the small but growing repertoire of problems for which a quasipolynomial time bound is the correct time bound, under standard complexity theory assumptions. Others cited in the paper include finding approximate Nash equilibria, and finding maximum independent sets in hyperbolic unit disk graphs.

In our paper we use a recursive construction for all-but-clique-universal graphs that produces bigger graphs than Alon’s, of size roughly \(2^k\), for two reasons: first, because we need a deterministic construction (Alon’s is randomized), and second, because our paper was almost completely through the reviewing process when I found out about Alon’s construction. For proving quasipolynomiality, the bigger exponent in our construction isn’t important. But I thought this post would be a good place to describe Alon’s construction instead. Keep in mind that this is based on my faulty recollection of a conversation several weeks ago, so this will read more like half-baked research notes than like a completed result. I’ve probably introduced multiple mistakes into the construction: these are my fault, not Alon’s.

The main idea of Alon’s paper is to cover most induced subgraphs of size at most \(k\) by a big random graph, chosen with the specified number of vertices and with each edge present or absent independently with probability \(\tfrac12\). As he proves, this is (with high probability) a universal graph for all of the subgraphs that have few symmetries. But it may miss some highly symmetric subgraphs. To cover those, he shows that (regardless of its actual symmetries) every symmetric subgraph has three moderately-large ordered sets of vertices \(A\), \(B\), and \(C\) such that the \(A\)–\(B\) adjacencies have exactly the same pattern as the \(A\)–\(C\) adjacencies. Storing only one of these two sets of adjacencies saves enough information that he can use a less-efficient construction for a universal graph for the symmetric subgraphs. The result has size \(O(1/k)\) times the size of the big random part. Because it’s so much smaller, adding it to the big random graph only increases the \(o(1)\) term in their combined size bound.

Modifying this to produce an all-but-clique-universal graph requires only a few more ideas.

  • First, the big random part can contain a \(k\)-clique. But the expected number of copies of any graph in this part is inversely proportional to its number of symmetries. The size of this part is specifically chosen by a formula (equation 1 of Alon’s paper) that makes the expected number of copies equal to one when the number of symmetries is moderate, \(k^{16\sqrt{k\log k}}\). In this way, there are many expected copies for graphs with few symmetries (\(\le k^{8\sqrt{k\log k}}\)), which with some care leads to a proof that all such graphs are covered with high probability. But because the number of symmetries of a clique is \(k!\), much larger than the moderate size bound above, the expected number of cliques is much smaller than one. So choosing a random graph conditioned to have no clique doesn’t really change the high probability of covering all the few-symmetry graphs.

  • Second, the part \(S\) of Alon’s construction that covers the high-symmetry subgraphs will also contain a \(k\)-clique, but one can fix that by using the graph product \(S\times (K_{k}-e)\) with a clique minus an edge. Any \(k\)-vertex subgraph of this product either uses one copy of each clique vertex, and is missing the same edge, or it has two vertices coming from the same clique vertex, which cannot be adjacent in the product. Therefore, there are no \(k\)-cliques in the product, but all other \(k\)-vertex induced subgraphs remain present.

  • Third, the product blows up the symmetric part by a factor of \(k\), big enough to increase the multiplier in the size of the overall graph from \(1+o(1)\) to \(O(1)\). To avoid this increase, we need to compensate by being more careful in the construction of \(S\) to make it even smaller. I didn’t hear about the details of this part.

Putting these ideas together should give an all-but-clique-universal graph of size

\[2^{(k-1)/2}\bigl(1+o(1)\bigr),\]

the same bound as for the induced-universal graphs. If that’s all too sketchy to be relied on, you could always just use the product of any induced-universal graph with \(K_k-e\). It’s bigger by a factor of \(k\), but it has the right exponent and doesn’t have any gaps in my recollection of its construction.

(Discuss on Mastodon)