SODA Day 3
I realize it's not exactly day 3 anymore, and it's all a bit of a blur by now, but I thought I'd finish my reporting of the conference regardless. Besides, my flight is delayed so I have plenty of time to write, I'm procrastinating on making slides for my next talk, and it's a good excuse to plug my SODA talk slides and my book, especially since Springer mysteriously failed to do the latter at their exhibit table.
The day began with a geometric data structure session. Guy Blelloch spoke about space savings in orthogonal range search; the key idea — used previously in compressed suffix tree problems but, apparently, not for geometry — was that, when storing pointers to items within short pieces of the data structure (such as the lists of items arising in fractionally cascaded structures), the pointers can be more efficiently stored as offsets from the start of the piece, using space proportional to the log of the piece size rather than log n. So most of the pointers used by his structures require only \( O(\log\log n) \) bits and can be packed \( O(\log n/\log\log n) \) per word.
Eric Chen described work with Timothy Chan on compressing Voronoi diagrams so that they are encoded only by the permutation of their generating sites, but still allowing nearest neighbor searches to be performed efficiently. It was tempting to ask whether he'd implemented his structure (it's extremely complicated) or how big \( n \) has to be before it beats sequential search... The main idea is closely related to a known technique that can be used in one dimension instead of two, to support nearest neighbor searches in the same time as binary search with a cache-optimized layout that might be useful in actual applications. Again, it involves forming sequences of \( n \) items with no data structure beyond the sequence itself. Store the \( \sqrt{n} \)th quantiles first, then the blocks into which they partition the rest of the sequence, ordering the quantiles and the elements within each block using a recursive version of the same algorithm. To search a sequence with this order, recursively search within the quantiles, then use the search result to find the block within which to continue the search. Thus, the search involves two recursive searches on \( \sqrt{n} \) items each, and the total number of search steps ends up being \( \log_2 n \), the same as for binary search.
Ken Clarkson spoke about a simple convex optimization technique that's been known since at least 1956 and rediscovered many times in many contexts since then, in which one approximately minimizes a convex function within a simplex (or maximizes a concave function) by repeatedly walking towards one of the simplex vertices. There are several variations, depending on how one chooses the vertex to walk towards (e.g. the one in the direction of the steepest gradient at the current solution) and how far to walk (line search for the optimum or a preset schedule of step lengths), but if the number of steps is significantly smaller than the number of simplex vertices one can still bound the approximation quality and get a sparsely supported solution, important in a lot of problems. There was also something about duals and something about using this idea to describe core-sets, but I'm unsure about the details. One of the papers that will repay reading carefully, I think.
Michael Mitzenmacher has covered Andrei Broder's invited lecture, on computational advertising (for the purposes of the talk, meaning how to select which text ad to place in a given web-page viewing), so I'll just direct you to his post for that.
I finally got a chance to see Jeff Erickson and Kim Whittlesey's cute one-and-a-half year-old daughter, Katherine, as they joined a group of us for lunch. Elena Mumford led us on a long walk to a sushi restaurant she'd seen earlier that looked quite good, but, sadly, turned out to be closed for lunch. So we walked farther, past some gay porn theaters, to another sushi restaurant that was also closed. But around the corner from that one we found another sushi restaurant that was open and good, and then on the way back we stopped for coffee. Which is a roundabout way of saying that I missed the early afternoon sessions. Had I more time and more of an attention span, my choice of parallel sessions would have been the one on data structures, and I heard later from Erik Demaine that I should check out Seth Pettie's paper from that session on splay trees. He attacks a special case of the “dynamic optimality conjecture,” that splay trees are within a constant factor of any rotation-based binary search tree on any input sequence. I think this case, the “deque conjecture,” means that items are inserted at one end of a sorted sequence and removed or queried from the other, and that the splay tree is expected to match the linear time performance of a linked-list-based deque implementation. If I remember correctly, it was known to within an \( \alpha(n) \) factor, and Seth improves it to the amazingly-close-to-constant \( \alpha^*(n) \), the number of times you need to iterate the inverse Ackermann function in order to reduce the argument to a constant, or maybe even to the logarithm of that. I don't have my proceedings available to check the details right now, but it sounds like another paper worth looking up.
The final session, and the one my talk was in, was another computational geometry one (though I think my own talk could have fit as well in the metric embedding sessions). Adrian Dumitrescu spoke on the ratio of the quality of solutions of 1-median problems when the center is free to be anywhere vs when it is restricted to one of the input points. A \( \sqrt{2} \) upper bound is known and not hard to prove; he and Csaba Tóth prove a slight improvement on this and several related problems. Their proof technique is a little unsatisfactory, though, as it consisted of reducing the problem to some large linear programs and then solving them numerically without even doin the interval arithmetic needed to be certain that the improvement is real and not roundoff. I'd be happy with something a little more conceptual and less computational, but I guess we've seen from the Kepler conjecture that one can't always expect that. Anyway, it made me wonder what the ratio would be for a different optimal-star problem, the minimum dilation star problem my student Kevin Wortman looked at with me a couple years back.
Chris Gray, Mark de Berg's student at Eindhoven, then spoke about “cutting cycles of rods in space”, work he'd done with Mark, Elena, and Boris Aronov. He started with what sounded like it was going to be a hokey analogy, but actually worked quite well, I thought, for describing the problem: a stack of logs is to be removed by helicopter, one at a time, but the helicopter can only remove a log if no other log lies atop it. The above-below relation can have cycles, in which case no log may be removed. How few cuts must be made, splitting logs into shorter logs, so that all pieces may be removed? They show both hardness and approximation results by relating the problem to vertex cover and feedback vertex set problems. There's a fixed-parameter-tractibility version of the question, in which one would like to find the optimal set of cuts in a time bound where the exponential dependence is only on the number of cuts and not on the number of input logs; Chris stated it as open in his talk but I'm a little unclear on whether it really still is. One interesting feature of his presentation was the use of some nice freely licensed forest photography from flickr for the title slides.
Jeff Erickson spoke on the geometric graphs formed by connecting any two points that lie on the boundary of an empty axis-aligned ellipse, though his main result was that random points on a circular cylinder in 3d have a Delaunay triangulation with complexity \( O(n\log n) \), closely related to the empty rectangle graphs I mentioned in my earlier report from Gábor Tardos' talk. I think it went well, but I have to admit that having seen Jeff speak at much greater length on the same subject two weeks earlier helped me follow it much better.
And finally, in the penultimate slot of the conference, my own talk. I spoke about a quadratic-time algorithm for recognizing partial cubes, citing as motivation for studying these graphs their applications in modeling voter preferences in elections, evaluation of what concepts a student knows and is ready to learn, and flip distances in triangulations. The algorithm has two phases, using two very different types of characterization of these graphs: the first is graph-theoretic, using a distance-based partition of edges into equivalence classes, while the second takes an automata-theoretic point of view that is the main subject of my new book with Falmagne and Ovchinnikov, Media Theory. I'm interested in using the recognition algorithm in searching for cubic partial cubes (in that context, it is not a theoretical improvement over the previous \( O(mn) \) time bound, but it allows some early exits that may make it faster in practice), but it's also relevant in the context of media theory as a way to convert between different possible input representations. My talk slides are here.
For more on ALENEX/ANALCO and SODA, see my previous entries (I, II, III), Suresh's (I, II, III), Michael Mitzenmacher (I, II), and Hal Daume.