Manhattan embedding, paired approximation, and stragglers
I just uploaded two new papers to the arXiv, and my co-author updated an old one:
- Optimally fast incremental Manhattan plane embedding and planar tight span construction,
arXiv:0909.1866. The concrete problem solved here is testing whether a finite metric space can be represented as the distances among a set of points in the Manhattan plane. The corresponding Euclidean problem is easy (find a non-collinear triple of points, place them in a triangle, and use them to determine uniquely the Euclidean location of the remaining points) but the best previous problem for the Manhattan-metric variant took \( O(n^2\log^2 n) \) time; this paper removes the logs, getting the time bound down to the size of the input distance matrix, using the tight span as a key tool. The tight span is an analogue for finite metric spaces of the convex hull for finite Euclidean point sets. In general it can be computed as the set of bounded faces of a high-dimensional polytope, but that's not very efficient. The main ideas of the paper are to provide a different method of representing and constructing tight spans that are homeomorphic to a subset of the plane, using rectangular complexes and Manhattan orbifolds.
- Paired approximation problems and incompatible inapproximabilities,
arXiv:0909.1870. To appear at SODA. If two different approximation problems are defined from the same input and you'd be satisfied with a solution to either, it turns out that the quality of approximation you get can be significantly better than for either problem alone. For instance, suppose you want either a coloring or a long path in a graph. Both problems are hard to approximate to within a significantly sublinear approximation ratio (although we don't know how to prove it for the path problem) but approximating one or the other to within \( \sqrt{n} \) is easy using depth-first search: if the DFS tree has many levels it contains a long path, and otherwise the partition into levels is a good coloring. Along with several similar examples the paper contains some other pairs of problems (such as set cover and hitting set) where it's not possible to get a better approximation than for either problem alone; that part ws less easy.
- Straggler identification in round-trip data streams via Newton's identities and invertible Bloom filters,
arXiv:0704.3313. Now updated with some experimental results for the Bloom filter data structure as requested by the referees, I think primarily as a proof-of-concept to show that the algorithms are implementable. When we first did these experiments we thought we saw some interesting phenomena in them, showing that some complications that we'd included for theoretical reasons (we couldn't figure out how to prove that it worked without them) were also necessary to achieve good practical performance. Sadly, that turned out to be a bug in our implementation. But at least the debugged experiments still show that the extra space used for the added part of the data structure is not wasted: it gives us a proportionate increase in the size of the sets we can identify.