Some highlights from FOCS

Lance Fortnow has already posted a high-level overview of this year’s FOCS conference (the annual IEEE Symposium on Foundations of Computer Science). So instead of posting something similar I thought I’d share my impressions of a few of the contributed talks that particularly caught my attention. In many cases, 20-minute videos for these talks are linked from the conference schedule, and are more complete than the far-too-short 12-minute talks that were presented in person.

I started the first day of contributed talks by going to the graph structure theory session, missing three talks on data structures that I would otherwise have wanted to see:

“\(O(1)\) insertion for random walk \(d\)-ary cuckoo hashing up to the load threshold” by Bell & Frieze [arXiv, video],
“On approximate fully-dynamic matching and online matrix-vector multiplication” by Liu [arXiv], and
“Fully dynamic matching and ordered Ruzsa–Szemerédi graphs” by Behnezhad and Ghafari [arXiv, video].

Of the talks I saw in the session,

“First-order model checking on monadically stable graph classes” by Dreier et al. [arXiv, video] concerns checking whether a formula in the first-order logic of graphs describes a given graph. This is slow for general graphs (polynomial but with a high exponent, the depth of quantification in the formula) but near linear time for sparse graphs and some others, and this paper exactly characterizes which other hereditary classes of graphs it is fast for.

Another standout from this session was

“Minor containment and disjoint paths in almost-linear time” by Korhonen, Pilipczuk, and Stamoulis [arXiv, video], which speeds up problems of testing whether a graph belongs to a minor-closed family of graphs from \(O(n^3)\) (the original work of Robertson and Seymour) to \(O(n^2)\) (long known from previous work) to near-linear. The basic framework has been the same all along, a win-win where if the graph has low treewidth you can find minors quickly by dynamic programming and if it has high treewidth you can find and remove an irrelevant vertex. The new speedup comes from combining recent breakthroughs in max flow and mimicking networks with dynamic graph algorithms for treewidth to speed up the removal of irrelevant vertices. It’s a shame it was scheduled opposite another flow-related dynamic graph algorithm paper (the one by Behnezhad and Ghafari).

I made up for missing the morning data structures talks by attending an afternoon session with a cluster of three papers on hash tables using open addressing, all by William Kuszmaul (a new assistant professor at CMU) with varying coauthors:

“Optimal bounds for open addressing without reordering” [video]
“Tight analyses of ordered and unordered linear probing” [video]
“Tight bounds for classical open addressing” [arXiv, video]

They all concern the behavior of hash tables that approach being full, with only some fraction \(\delta\) of unused space remaining. The first one disproves a conjecture of Yao on the optimality of uniform hashing (in which the probe sequence is uniformly random), for hash tables that don’t move already placed keys. This gives expected search time \(O(1/\delta^2)\) but one can instead achieve \(O(\log^2 1/\delta)\) by partitioning the table into levels with progressively fewer cells and lower expected density, and making logarithmically many probes into each level before giving up and moving on to the next. The second one concerns lazy versus non-lazy deletion in linear-probing hash tables; we often teach our students the non-lazy version, but in this paper both experimental and theoretical results show that lazy may be better when combined with a version of linear probing that keeps runs of occupied cells sorted by their hash values. The third of these papers shows that, if moving existing keys is allowed, the optimal expected update time is \(O(\log\log 1/\delta)\), with \(O(1)\)-time queries. Its structure, a “rainbow hash table”, is structured as an exponential tree that recursively partitions the table into sub-hash-tables that are almost entirely full, and in which updates are fast but queries are very slow. The trick is that almost all the work happens at the bottom level of the recursion where these subtables have constant size.

The next day featured the best student paper (Machtey Prize) talks:

“Capacity threshold for the Ising perceptron” by Brice Huang [arXiv].
“Optimal quantile estimation: beyond the comparison model” by Meghal Gupta, Mihir Singhal, and Hongxun Wu [arXiv].

Of the two, the second one was closer to my own interests. It concerns streaming algorithms, algorithms that are given access to a stream of data values but do not have the memory to store the data before processing it. The goal here is, after seeing the stream, to estimate the rank of a query value (its position in the sorted order of stream values) or, almost equivalently, to find a value of given approximate rank. It shows that, for a stream of \(n\) elements, you can estimate the rank to within additive error \(\varepsilon n\), using only \(O(1/\varepsilon)\) words of memory; previous methods used logarithmically more memory. To achieve this, they have to assume that the values are integers from some universe. The basic idea is to recursively partition this universe into subintervals, to assign capacity \(\varepsilon n/\log U\) to each subinterval, and to store counts of elements in each subinterval, assigning each incoming element to the highest-level subinterval that has not yet reached its capacity. This would give a valid streaming algorithm but with too much memory (as with previous solutions); to improve this they store only approximate counts and batch the updates using a smaller recursive sketch to store each batch.

Later that afternoon, I saw:

“Hardness of packing, covering and partitioning simple polygons with unit squares” by Abrahamsen and Stade [arXiv]. This is just a standard \(\mathsf{NP}\)-completeness proof, of the following problem: the input is an axis-parallel polygon with half-integer coordinates, without holes, and you have to pack as many axis-aligned unit squares as you can into it. It got into FOCS because it has been over 40 years since the version with holes was proven to be hard and over 20 years since this version was conjectured (incorrectly) to be polynomial. The difficulty is finding a way to set up signals that can propagate horizontally and vertically through the empty space of a simple polygon without interfering with each other.

The same session also featured

“The orthogonal vectors conjecture and non-uniform circuit lower bounds” by Ryan Williams [ECCC] which I already made a brief post about.

The final day of the conference saw me switching between sessions several times. In the morning session, I saw

“Computing approximate centerpoints in polynomial time” by Y. Cherapanamjeri [video]. A centerpoint, for a set of sample points, is a point with high Tukey depth: every halfplane containing the centerpoint also contains a large fraction of the sample points. The optimal fraction (in the worst case) is \(\tfrac{1}{d+1}\), where \(d\) is the dimension of the points, but finding a point with that depth takes time depending badly on the dimension. One of my old papers, “Approximating center points with iterated Radon points” achieves time polynomial in the dimension and the number of points, but with depth only \(\Omega(\tfrac{1}{d^2})\). This paper shows that if you’re willing to accept a point that is within a small distance of a centerpoint, but might not be a centerpoint itself, then in polynomial time you can get depth \(\Omega(\tfrac1d)\). There is also a way of interpreting it as an actual centerpoint of smoothed points.

Then, switching to a different session, I saw

“Towards instance-optimal Euclidean spanners” by Le et al. [arXiv, video]. This is about spanners in moderate-dimensional Euclidean spaces, sparse graphs weighted by Euclidean distance in which the shortest path distance is approximately the same as in the complete graph. Typically we wish these graphs to be both sparse (few edges) and light (small total weight), and the greedy spanner does well on this by traditional measures: its number of edges is within a small factor of the worst-case number of edges needed for some inputs, and the ratio of its weight to the minimum spanning tree is within a small factor of the worst-case ratio needed for some inputs. But when we go beyond worst case by looking at the approximation ratio (comparing the output quality to the quality achievable on the same input) the greedy spanner is not so good. This paper shows that adding a simple local improvement stage that repeatedly replaces multiple edges by a single shortcut can get within a constant approximation ratio, in a bicriterion sense where the resulting spanner has a somewhat higher stretch factor than the optimal one it is being compared to. As far as I can tell, that’s all “instance-optimal” means in this case: a trendy new way of saying it has a good (bicriterion) approximation ratio.

This was the day for the best paper talks. These were:

“Universal optimality of Dijkstra via beyond-worst-case heaps”, by Haeupler et al. [arXiv] is on a very old topic, Dijkstra’s algorithm for single-source shortest paths. Dijkstra’s algorithm doesn’t just find distances and paths; it also sorts vertices by their distance from the source. If you really want that sorted order, it can be made to take within a constant factor of optimal time (for comparison algorithms running on the same digraph with varying edge weights) by using a special priority queue, one with the “working set property”. This means that finding and removing the minimum priority element takes time logarithmic in the number of operations since its insertion, rather than logarithmic in the heap size. It’s the same property also used by the same coauthors for sorting partially ordered data.
“Near-optimal deterministic network decomposition and ruling set, and improved MIS”, by Ghaffari and Grunau [arXiv]. The important context that you might not get from the title: this is on the derandomization of distributed algorithms, in the “LOCAL” model.

For the final contributed session, I switched around again, starting with:

“Nearly optimal list labeling” by Bender et al. [arXiv]. The talk was presented by Hanna Komlós, who is by the way on the postdoc market and (I’m told) very strong. In the list-labeling problem, one must assign natural numbers to a dynamic set of ordered items, so that the ordering of the numbers is the same as the ordering of the items. Items can be renumbered but you want to minimize how often this happens. The version studied here restricts the numbers to be only a small constant times the number of items, which makes the problem much more difficult. The same authors previously proved \(O(\log^{3/2} n)\) renumberings per update (in expectation for a randomized algorithm, and here they improve it to \(\tilde O(\log n)\), nearly optimal because there is an \(\Omega(\log n)\) lower bound. But the best deterministic method is still \(O(\log^2 n)\) so there’s still more to be done.

After this I switched to another session, where I saw:

“Faster \((\Delta+1)\)-edge coloring: breaking the time barrier”, by Bhattacharya et al. [arXiv, video]. This is on algorithmic versions of Vizing’s theorem, according to which a simple undirected graph can have its edges colored using at most one more color than the maximum vertex degree. This degree is an obvious lower bound for the number of colors, so you can always get within one of optimal. This paper improves time \(\tilde O(m\sqrt{n})\) to \(\tilde O(mn^{1/3})\), but as we heard in the talk, this is not the end of the story. There are two more papers on the subject coming at SODA 2025. An even more recent preprint, “Vizing’s theorem in near-linear time” [arXiv], by a combination of the authors from these various papers, gives a randomized algorithm with a high-probability time bound of \(O(m\log\Delta)\).
“Naively sorting evolving data is optimal and robust” by Giakkoupis, Kiwi, and Los [arXiv, video]. The model here is that you’re trying to keep track of a sorted sequence by making comparisons and using them to update your model of the sorted order, but while you’re doing so the sequence is being reordered by an adversary who randomly selects an item and moves it one position up or down. I was involved in an ICALP 2018 paper on this subject where we showed that, if the adversary makes one move for each comparison you do, then repeatedly performing insertion sort can keep your model within linear inversion distance of the true sorted order, the best you can do. But we were unable to handle more general adversaries with a higher rate of change. This paper closes that gap and more, with a much simpler sorting algorithm: repeatedly compare-and-swap two items that are adjacent in your current order.

I’m very happy to have seen so much in the way of concrete algorithms at FOCS!

(Discuss on Mastodon)