Report from SODA, ALENEX, and ANALCO, day 2
Eric Price started Monday's talks (and his first day of work at IBM Almaden) by presenting his paper with Greg Minton, which won SODA's best student paper award. The main topic was a tighter analysis of count-min sketches, but he also presented two lemmas that both look much more broadly useful. The first lemma was on center location in robust statistics: if you have a collection of samples from a distribution of the form \( c+E \) where \( c \) is an unknown but fixed real number and \( E \) is a (symmetric) noise distribution, what is the right way to estimate \( c \) from the samples? The mean will converge as quickly as possible to \( c \), but is very sensitive to outliers (data that does not come from the assumed distribution and may be arbitrarily far away). The median is completely insensitive to outliers, but will not converge to \( c \) (e.g. suppose that \( E \) is equally likely to be \( +1 \) or \( -1 \); then the median will almost always be one unit away from \( c \) no matter how many samples you take). What he proved is that pairing up the samples (in a way that does not depend on their values), taking averages of the pairs, and taking the median of the averages is both robust and quickly converging. The proof uses Fourier analysis and applies more generally to medians of distributions whose Fourier transform is positive (which is the case for count-min). The second lemma concerned voting systems: given a collection of voters whose votes are drawn from \( \{0,1\} \) with some probability, what is the right way of estimating which of \( 0 \) or \( 1 \) has the higher probability? The obvious way is to choose the majority. The less obvious way (essentially what is used in American presidential politics) is to group the voters into sets, choose a majority in each set, and choose the majority of the majorities. What Price wanted to prove is that simple majority is always best; what he actually proved is that it is within a small constant factor of being the best. He then applied this to boost certain high-probability bounds on count-min.
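Here's a quick illustration in Python of the first lemma's estimator (my own sketch based on the talk, not code from the paper; in particular the consecutive-pairs pairing and the test data are my choices), showing it doing well on exactly the bad cases described above: the plain mean gets dragged off by a few outliers, the plain median stays a full unit away from \( c \), but the median of pairwise averages lands close to \( c \):

```python
import random
import statistics

def median_of_pairwise_averages(samples):
    # Pair up the samples in an order that does not depend on their values
    # (here, just consecutive positions), average each pair, and return the
    # median of those averages.
    averages = [(a + b) / 2 for a, b in zip(samples[::2], samples[1::2])]
    return statistics.median(averages)

random.seed(2014)                                  # reproducible demo
c = 3.7                                            # the unknown center we want to recover
n = 100001
# noise that is +1 or -1 with equal probability: the bad case for the median
samples = [c + random.choice((-1, 1)) for _ in range(n)]
# a handful of arbitrarily-far-away outliers: the bad case for the mean
samples[:10] = [1e9] * 10

print("mean:                        ", statistics.mean(samples))
print("median:                      ", statistics.median(samples))
print("median of pairwise averages: ", median_of_pairwise_averages(samples))
```

The mean here is driven almost entirely by the ten outliers, the median comes out one unit away from \( c \), and the median of pairwise averages should come out at (or extremely close to) \( c = 3.7 \).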
Piotr Indyk, Monday's scheduled invited speaker, turned out to be stuck in Chicago because of the recent weather-related air transportation snafus, so instead we got to hear Daniel Spielman a day early. Dan spoke on his recent solution (with Adam Marcus and Nikhil Srivastava) of the Kadison–Singer conjecture (or rather of the Weaver \( KS_2 \) conjecture, which implies it): very roughly, this says that a collection of vectors that looks spherical, in that the sum of its squared dot products with any unit vector is a constant unrelated to the unit vector, can be partitioned into two spherical subsets of approximately equal magnitude, as long as none of the individual vectors is too large. One (previously known) consequence is that any \( d \)-regular expander graph can be partitioned into two approximately \( d/2 \)-regular expander graphs, by applying this to a set of \( n \)-dimensional vectors defined from the edges of the graph. Rather than attempting any additional uninformed explanation, why don't I just direct you to the two resources Spielman directed us to: Nick Harvey's survey and Terry Tao's blog.
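That said, here is the rough statement in symbols, as best I understood it (my paraphrase of the form in which Marcus, Spielman, and Srivastava prove it, so treat the constant in the bound as approximate): if vectors \( u_1,\dots,u_m \) satisfy
\[ \sum_{i=1}^m \langle u_i,x\rangle^2 = 1 \text{ for every unit vector } x \qquad\text{and}\qquad \|u_i\|^2\le\delta \text{ for every } i, \]
then they can be partitioned into two subsets \( S_1 \) and \( S_2 \) with
\[ \sum_{i\in S_j} \langle u_i,x\rangle^2 \le \frac12 + O(\sqrt{\delta}) \qquad\text{for every unit vector } x \text{ and each } j, \]
so each half is again approximately spherical, with roughly half the magnitude, provided no single vector carries too much of the total weight.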
I stuck with ANALCO for the afternoon talks. The first of them, by Gwendal Collet, discussed his work with Bernardi and Fusy on combinatorial enumeration problems related to planar graphs. He discussed several bijections relating plane graphs with triangular outer faces, Eulerian plane triangulated multigraphs, and orientations of plane ternary trees, which can be applied to generate uniformly random plane graphs with a given number of edges in linear expected time. I believe this is closely related to the topic of Don Knuth's 2013 Christmas tree lecture (and associated software).
Today was also the SODA business meeting. (I skipped the ALENEX/ANALCO meeting yesterday.) We started with a brief report from Cliff Stein (David Johnson's new replacement as steering committee chair) noting among other things that this year's SODA has the second-highest attendance ever. Chandra Chekuri (program committee chair) thanked a lot of people, and then reported on acceptances by topic area. There used to be a perception that fixed-parameter tractability was the European reaction to American approximation algorithms – SODA would take lots of approximation and ICALP and ESA would take lots of FPT – but this year FPT was one of the hot topics at SODA, with roughly a 44% acceptance rate compared to 27% overall. Data structures, however, did not fare so well, with only a 15% acceptance rate, something I find a bit worrisome since I think it's still an important subject. Chandra also made some suggestions for improvements to future SODAs, including a more complex hierarchy in the program committee to help it scale better, recording videos of talks (for an estimated $20 more per registration), and adding cookies to the afternoon coffee breaks.
This is the 25th anniversary of the founding of SODA, and David Johnson gave a brief talk on SODA's history. It was founded in 1989 with two purposes: to handle the overflow in algorithms papers from FOCS and STOC (hence its timing, interleaved between those two conferences) and to bring discrete mathematicians into the mix with the computer scientists. Now, it's regularly larger than both FOCS and STOC. Various measures have been taken over the years to keep the discrete mathematicians happy (and to try to forestall their papers being rejected as off-topic by overeager CS reviewers), including requiring a fraction of the program committee to be mathematicians, as well as some experiments with shorter paper formats (for many years a topic of long discussion at these meetings). However, David expressed some disappointment that the experimental component of algorithms research seems to have been unsuccessful at SODA (being mostly relegated to ALENEX these days).
The meeting finished with the usual discussion and vote (by rules made up as we went along based on what had happened in earlier stages) on where SODA 2016 will be (2015 having already been set for San Diego). Given SODA's long history of ignoring these votes and going somewhere else, nothing was announced as a final decision, but the four finalists were: Miami Beach (explicitly announced as not to include Miami, but probably too expensive to actually happen), 59 votes; Vancouver or Victoria, 47 votes; Washington DC, 36 votes; San Francisco, 31 votes. There's also a possibility that 2017 may be in Europe.
And finally for today, an intriguing suggestion from dinner conversation. You know how HTML links from one web page to another can be given a nofollow attribute, so that search engines such as Google don't count the link towards the popularity of the target web page? We should have a protocol for doing the same for citations from one paper to another. That way, we can feel more free to cite incorrect papers (saying why they're incorrect), without fear that by citing them we'll drive up their citation counts, cause them to rank higher in Google Scholar results, and encourage others to find them in searches and treat them as valid.
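Just to make the suggestion concrete, here's a toy sketch in Python of how such a flag might behave (the data format and the nofollow field are entirely made up; no such protocol exists as far as I know): a citation marked as nofollow still appears in the reference list, but doesn't add to the target's citation count.

```python
# Toy sketch of the "nofollow for citations" idea; everything here is hypothetical.
# Each paper lists its references as (target, nofollow) pairs; a reference marked
# nofollow=True is still cited (e.g. to explain why it is wrong) but should not
# add to the target's count.
papers = {
    "A": [("B", False), ("C", True)],   # A cites B normally, and C with nofollow
    "B": [("C", False)],
    "C": [],
}

def citation_counts(papers):
    counts = {name: 0 for name in papers}
    for refs in papers.values():
        for target, nofollow in refs:
            if not nofollow:
                counts[target] += 1
    return counts

print(citation_counts(papers))   # prints {'A': 0, 'B': 1, 'C': 1}: A's nofollow citation of C is not counted
```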
Comments:
2014-01-07T09:33:06Z
Thank you! Robust statistics sounds very interesting. Do you know if this is a practical approach or is it more of a nice theoretical result? For instance, how does it compare to other "dumb" methods such as smoothing?
2014-01-07T16:13:31Z
That's a complicated question. First of all, robust statistics is not new, and continues to have important applications (see e.g. the Applications section of the Wikipedia article on the Theil–Sen estimator). But my impression is that it has been driven out of the limelight of academic research for a couple of reasons: first, that its methods have tended to be somewhat slower and more complicated than non-robust methods such as least squares, and second, that it has been eclipsed by the success of Bayesian and machine learning techniques in statistics.
I'm not sure what you mean by smoothing; how is that a method for estimating a central tendency? There are other robust methods for this problem that also converge well and have long been known, such as the shorth (find the 50% of the data that's closest together, and take the mean of that subset). But the simplicity and obvious implementability of this median-of-pairwise-averages method gives it some value, and I hope it restores some interest in the algorithms community in robust statistics more broadly.
2014-01-07T23:47:20Z
Thanks, will take this into account. Regarding smoothing: I mean a simple average estimator: https://en.wikipedia.org/wiki/Additive_smoothing