The metric space of star metric spaces
Suppose that \( f \) is a function that maps a set \( X \) to the positive real numbers. Then we can define a metric space from \( f \) by defining the distance from \( x \) to \( y \) to be \( f(x)+f(y) \) (with \( d(x,x)=0 \) in exception to this formula); this automatically satisfies the required symmetry, positivity, and triangle inequality properties. The resulting metric space can be represented as the shortest-path distances in a star (tree with one non-leaf node h, its "hub"), by placing \( X \) at the leaves of the star and setting the length of the edge from leaf \( x \) to hub \( h \) to be \( f(x) \). We can include \( h \) itself in the metric space, setting \( f(h)=0 \). If we do include the hub, the spaces coming from this construction can then be characterized by the existence of a point \( h \) that belongs to a shortest path between any other two distinct points: for all \( x \) and \( y \) with \( x\ne y \), \( d(x,y)=d(x,h)+d(h,y) \). The function \( f \) can be recovered as \( f(x) = d(x,h) \).
My latest arXiv preprint, Optimal Embedding Into Star Metrics (with Kevin Wortman, arXiv:0905.0283, to appear at WADS) looks at the problem of embedding an arbitrary metric into a star metric of this type in order to minimize the amount by which distances are distorted in the embedding. We may as well scale the target star metric so that all of its distances are greater than the distances in the input metric, so the problem becomes one of, given a metric space \( (X,d) \), finding a function \( f \) on \( X \) that defines a star metric with distances at least equal to \( d \), minimizing the dilation \[ \max_{x,y} \frac{f(x)+f(y)}{d(x,y)}. \] As we show, the problem can be solved polynomially by transforming it into a parametric shortest path problem in an auxiliary graph defined from \( X \).
But what is this space of star metrics into which \( (X,d) \) can be embedded? The requirement that distances be non-decreasing translates into a property that, for every \( x \) and \( y \), \( f(x)+f(y)\ge d(x,y) \), but we don't want the distances in \( f \) to be unnecessarily large, so we should restrict our attention to minimal functions, with the property that each \( x \) has a \( y \) for which the inequality above turns into an equality. If no such \( y \) existed, we could reduce \( f(x) \) leaving all the other values unchanged and get a better overall star. That is, for each \( x \) there should exist \( y \) such that \( f(x)+f(y)=d(x,y) \). (This formula works directly only when \( X \) is finite; a similar formula involving suprema is needed when \( X \) may be infinite.)
As it turns out, if \( T_X \) is the family of minimal star metrics into which \( (X,d) \) can be embedded, then \( T_X \) itself can be given the structure of a metric space, and \( (X,d) \) can be embedded into \( T_X \) with zero distortion. Given two functions \( f \) and \( g \) in \( T_X \), define the distance between them to be the sup-norm or \( L_\infty \) norm \[ d(f,g) = \sup_x |f(x)-g(x)|. \] Each point \( x \) of \( X \) defines a star with \( x \) as the hub, defined by the function \( f_x(y)=d(x,y) \), and the sup-norm distance \( d(f_x,f_y) \) equals the original distance \( d(x,y) \). This space \( T_X \) of minimal star metrics itself has other nice properties (it contains a line segment connecting any two points, and any pairwise-intersecting collection of metric balls has a common intersection) that can be summarized by saying that it is injective or hyperconvex. Every metric space \( (X,d) \) can be embedded without distortion into an injective space, and the space of minimal star metrics is the smallest injective space into which \( (X,d) \) can be embedded, its tight span.
Therefore, mapping \( (X,d) \) in a distance-nondecreasing way into a star metric can be viewed as picking a single point that represents the point cloud \( X \) within a larger metric space, the tight span \( T_X \). Kevin and I previously studied the corresponding problem of picking a single star center to minimize dilation in bounded-dimension Euclidean spaces in an earlier paper; the new paper solves the same problem for injective spaces of unbounded dimension. But dilation is not the only way to measure the quality of a point representing a point cloud. Finding the star metric for \( X \) that minimizes the maximum distance between any two points (the 1-center or circumradius problem in a tight span) turns out not to be very interesting: the solution is half the diameter of \( X \) (the largest number in its distance matrix). But finding a star metric for \( X \) that minimizes the average distance (the 1-median or Fermat–Weber problem in a tight span) is less trivial. It appears to be the case that:
The optimal solution is determined by the maximum-weight 2-factor for \( X \) (that is, the maximum weight collection of disjoint cycles through the points in \( X \), allowing 2-cycles whose weight is twice the distance between the endpoints).
The maximum 2-factor has the form of a collection of odd cycles and 2-cycles.
Within each odd cycle, the distance from a vertex x to the hub is half of an alternating sum of the edge lengths in the cycle, in which the two edges incident to x are both added and otherwise edges of the cycle alternate between being added and subtracted.
There may be some freedom to choose the distances between endpoints of a 2-factor and the hub, as long as the sum of the two distances equals the distance between the endpoints.
If all this is true, the star embedding for \( X \) that minimizes average distance can also be found in polynomial time. (If it's not true, the optimal embedding can still be found in polynomial time by linear programming, but it's good to have a more combinatorial algorithm.)
One may also be wondering, at this point, are star metrics injective, and if not, what is the tight span of a star metric? The first answer is no: nontrivial injective spaces must have infinitely many points and star metrics may be finite. But even an infinite star metric can never be injective, because in a star metric the only point that lies on a shortest path between any two other points is the hub whereas in an injective metric space any two points have a whole shortest path of points between them. The tight span of a star metric is a hedgehog space, the space formed by connecting each point of the star to the hub by a line segment of the appropriate length.