Fast farthest-first traversal

There are a lot of ways of ordering points so that nearby points are likely to be nearby in the ordering; space-filling curves are one good example. But what if you want nearby points to be far apart in the ordering? In this case, a good choice is the farthest-first traversal (also known as greedy permutation), the subject of my latest preprint (with Sariel Har-Peled and Tasos Sidiropoulos, arXiv:1507.01555). This is an ordering of the given points determined by a greedy algorithm, in which each successive point is chosen to be as far as possible from all previously-chosen points.

As far as I can tell, this permutation was first introduced as the ordering used for the "farthest insertion" heuristic for the traveling salesman problem, by Rosenkrantz, Stearns, and Lewis (SICOMP 1977), but it is better known from the work of Gonzalez on clustering (TCS 1985). One nice property of this permutation is that every prefix of it gives a well-spaced sample of the whole point set. So, as Gonzales showed, taking the first \( k \) points of this permutation as cluster centers, and assigning all remaining points to the nearest center, gives a good approximation to the \( k \)-center problem of partitioning the points into \( k \) clusters while minimizing the maximum cluster radius. You can compute the permutation only once, and then use the same permutation for clusterings with different numbers of clusterings, rather than having to recompute the whole clustering when \( k \) changes.

But the same properties are useful for many other problems, and have caused this same ordering to have been re-used and re-discovered for many other applications. The introduction of the preprint surveys a few of these, including choosing a small number of representative colors from the colors used by a given bitmap image in order to quantize the image (say when compressing down from png-16 to png-8); choosing a small number of representative points from the free space of a robot in order to build a compact road map for the free space; and selecting a well-spaced subset of points in the plane at a variable range of densities to use for dithering different grayscale levels into black-and-white dot patterns.

Our paper follows up a previous work by Sariel and M. Mendel (SICCOMP 2006) which defines an approximate version of the farthest-point traversal. Instead of choosing each point to be as far as possible from the set of previous points, we relax this constraint and allow points that are within a \( (1+\epsilon) \) factor of the farthest possible distance from the set of previous points. The farthest-point traversal can be computed in quadratic time, but Har-Peled and Mendel showed that for low-dimensional Euclidean spaces (or other spaces with similar properties) a near-linear time approximation is possible. The new contribution of our preprint is to show a similar near-linear time bound for metric spaces defined by the distances in sparse graphs, based on a randomized-incremental version of Dijkstra's algorithm. We also use locality-sensitive hashing techniques to find approximate greedy permutations in high-diensional Euclidean spaces in subquadratic time, and range-searching data structures to find exact greedy permutations in low-treewidth graphs in subquadratic time.

The tongue-twister title of this post was one that we had used for an earlier draft of the paper, but my co-authors made me take it out. Boo!