SODA: the action-packed finale

The drawback to being scheduled for the third day of SODA is that everyone is tired and much of the audience has departed, either physically or mentally. Despite that, there were still several talks that were good enough to catch and keep my attention.

Sadly, I slept through fellow UCI faculty Mike Goodrich's 50th-anniversary-of-shellsort talk. Mike provides a not-too-complicated variant of shellsort that sorts with high probability in \( O(n\log n) \) time. Unlike most other \( O(n\log n) \) sorts, but like the (too complicated to be practical) AKS sorting network, the pattern in which items are compared does not depend on their values, making this method suitable as a building block in certain cryptographic protocols.

Emmanuel Candès gave what I thought was the best of three good invited talks. The main theme was an analogy between compressed sensing (solving underdetermined linear systems using the assumption that the solution has sparse support), matrix completion (filling out the missing entries in a sparsely sampled matrix using the assumption that the matrix has low rank), and robust dimensionality reduction (representing a matrix by a low-rank substitute under the assumption that a small number of matrix entries have been corrupted).

In compressed sensing what one wants to find is the solution to the linear system with minimum \( L_0 \) norm (the number of nonzeros) but what one actually does is to solve a linear program minimizing the \( L_1 \) norm. \( L_1 \) is the closest convex function to the \( L_0 \) norm, the \( L_1 \) unit ball is the convex hull of the 1-sparse vectors, and for sufficiently well-behaved linear systems the \( L_0 \) and \( L_1 \) solutions coincide.
In matrix completion what one wants to find is the completed matrix with minimum rank but what one actually does is to solve a semidefinite program minimizing the nuclear norm, the sum of singular values of the matrix. Two examples that he described for this problem are the Netflix prize problem (completing a matrix indexed by customers and movies, containing the ratings for each movie) and finding Euclidean embeddings of sensor networks given information only on the distances between nearby pairs of sensors. The nuclear norm is the closest convex function to the rank and its unit ball is the convex hull of the rank-1 matrices.
To reduce a high-dimensional data set to a low-dimensional set that is somehow still representative of the data, what one usually does is to take the principle components of the singular value decomposition, but this is highly sensitive to outliers. Instead, Candes was proposing to perform an algorithm called "principal component pursuit" (that he didn't have time to describe) that minimizes a weighted sum of the nuclear norm of the low-rank part and the \( L_1 \) norm of the sparse part.

Seth Pettie gave a nice talk, appropriately heavy on pictures and light on text for the late position in the schedule. The point of his paper was to push a method for analyzing data structures, in which one makes a 0-1 matrix indexed by the sequence of operations and by the data structure objects with a 1 for each object touched by each operation. These matrices often obey certain simple patterns: some (non-contiguous) submatrices are disallowed, and that allows one to bound the total number of 1's in the matrix, which is the total running time. If an \( a\times b \) all-ones matrix is forbidden then the Kövári–Sós–Turán theorem implies that there are \( O(nm^{1 − 1/b} + m) \) ones in an \( m\times n \) matrix. If a permutation matrix is forbidden, the number of 1's can be only linear, and so on. Although there were no new results, Pettie described many existing ones where this point of view could be used to simplify or unify the analysis, including all the standard applications of Davenport-Schinzel sequences in computational geometry and several data-structural examples such as splay trees and union-find. Apparently he also had another forbidden matrix paper the previous day that I missed due to seeing something on \( k \) shortest paths at the same time.

The talk by Panagiotis Voulgaris on his work with Daniele Micciancio on finding shortest lattice vectors by repeatedly replacing members of a large enough sample of small vectors by their differences somehow reminded me of the Gröbner basis algorithm for monomial ideals, but by then I was too tired to think clearly so this could be a very superficial impression.

Some other SODA links: my own previous posts from this year's ALENEX and SODA are here, here, and here. Noam Nisan and Suresh Venkatasubramanian also have conference reports. Cora posted on talk formats and future location. And there was a lot of activity on twitter, especially concerning the business meeting and the decision to host SODA 2012 in Kyoto; most of it was under the #soda10 hashtag but I'm probably failing to link to additional tweets that weren't.