Non-crossing Hamiltonian paths and cycles in output-polynomial time

My paper “Non-crossing Hamiltonian paths and cycles in output-polynomial time”, to appear at SoCG, is now online as a preprint at arXiv:2303.00147. This is the full version; the SoCG version will need to be cut down by omitting proofs to reach the 500-line proceedings limit. It’s about polygonalization, the problem of finding all ways of connecting dots in the plane into a simple polygon (allowing connections that pass straight through a dot, but not allowing missing a dot altogether). The main results are that we can list all of these in time polynomial in the output size, and in polynomial time get an approximate count of them that is bounded above and below the true count by a polynomial of its value. Previously, the best we knew were that there were at most exponentially many polygonalizations and that we could list them in exponential time.

I think of this as being in the vein of recent conferences like the Symposium on Simplicity in Algorithms or the new “simplicity track” of the European Symposium on Algorithms: simple algorithms whose analysis isn’t. In fact, the algorithm in my paper isn’t even new. It’s the same one that was already used to achieve exponential time, in a paper “Algorithmic enumeration of surrounding polygons” by Katsuhisa Yamanaka, David Avis, Takashi Horiyama, Yoshio Okamoto, Ryuhei Uehara, and Tanami Yamauchi, published in 2021 in Discrete Applied Mathematics (doi:10.1016/j.dam.2020.03.03).

If we want to list all structures, from an exponentally large family of structures, in time polynomial per structure, then I think there’s really only one idea and a lot of elaboration on that idea. The idea is: describe your structures as the vertices of a large state space, with some sort of local operation for moving from state to state; prove that this local operation suffices to connect all the states together; and then apply a graph exploration algorithm like depth-first search to find all of the states from some starting state. The trouble is, for polygonalizations, we don’t know a good local operation. The obvious candidates, local moves that replace two or three edges of a polygon by a different set of edges, were proven not to work in a 2002 paper by Carmen Fernando, Michael Houle, and Ferran Hurtado (doi:10.1016/S0304-3975(01)00409-1). Instead, Yamanaka et al. propose to list all of the members of a larger family of structures, and then filter out the ones that are really polygonalizations. These more general structures are the “surrounding polygons” of their paper’s title.

A surrounding polygon is just a simple polygon that uses some of the given dots as vertices and contains the rest. The example below is taken from the last section of my paper. There I show that point sets like the one in the illustration, with one concave chain of dots inside a triangle, have \((n-1)2^{n-4}\) polygonalizations but a polynomially-larger number of surrounding polygons proportional to \(n(1+\varphi)^n\). Here \(\varphi\) is the golden ratio; this is not the first occurrence of the golden ratio in counting polygonalizations. A reviewer told me that these point sets are called “party-hat sets” or “ice-cream cone sets” but I’m not sure I believe it; I couldn’t find those names in a Google Scholar search.

A set of points in the form of a triangle with a concave chain of points replacing one of its edges, and a surrounding polygon of the points. The points that are vertices of the polygon are colored blue, and the other points surrounded by the polygon are colored red.

The simplest surrounding polygon of any input is just its convex hull. You can get from any surrounding polygon that is not the convex hull to a simpler one by “ear-cutting”: find two consecutive edges of the polygon that form two sides of an empty triangle outside the polygon, and replace them by a single shortcut edge. The shortcutted vertex becomes surrounded, and the area of the polygon grows, so repeated ear-cutting can only stop at the convex hull, implying that all surrounding polygons are connected through the convex hull. If you choose carefully which ear to cut, you give all surrounding polygons the structure of a tree, and the algorithm of Yamanaka et al. amounts to depth-first search of this tree. You can then find the polygonalizations just by running this algorithm and outputting only the surrounding polygons that use all the dots, at some tree leaves.

The idea of my new paper is to analyze these structures in the style of my book, Forbidden Configurations in Discrete Geometry, in terms of simple parameters of point sets that are monotonic (they don’t go down when you add more points) and that depend only on the order-type of the point set and not its exact coordinates. The question I set out to answer is: which point sets have only a very small number of polygonalizations, and which have many? I quickly identified two ways in which a point set could only have a small number:

Most of its points could belong to a single line. If a set of \(n\) points has \(n-k\) points on a line, and only a much smaller number \(k\) of points elsewhere, then most of the edges would have to connect paths of consecutive points along the line, and there aren’t very many ways of doing that. This number \(k\) is one of the parameters studied in my book. Working out the details of this argument showed more specifically that the number of polygonalizations is \(n^{O(k)}\): there are only \(O(k)\) points of any polygonalization where something interesting happens, and only \(O(n)\) choices for what happens there.
Most of its points could belong to the convex hull. If all points belong to the convex hull, then that is the only polygonalization. And if there are \(n-k\) points on the hull, and only a much smaller number \(k\) of points elsewhere, then the only points where something interesting happens are the \(O(k)\) points that are either not on the hull, or adjacent to a non-hull point. All the rest of their points have to be connected to their two hull neighbors. So again the number of polygonalizations is \(n^{O(k)}\). The parameter used here, the number of points interior to the hull, was not from my book, but maybe it should have been.

More strongly, upper bounds of the same form also apply to surrounding polygons. Allowing an interesting point to be skipped by the polygon doesn’t increase its number of choices much. Consecutive blocks of uninteresting points along a long line of points must either all be skipped or all be part of a surrounding polygon, again not increasing the number of choices by much. And a surrounding polygon cannot skip any point of the convex hull, because then it would not be surrounded. The part of the analysis that I found more difficult was proving that these are the only cases. If you have points that are mostly not on a line and mostly not on a hull, then there are exponentially many polygonalizations. And if you have one of the two situations with few polygonalizations described above, then the number of polygonalizations is accurately described by the upper bounds above. For details of these lower bounds, see the paper. The number of surrounding polygons can only be at least as large as the number of polygonalizations, because every polygonalization is a surrounding polygon.

Once that analysis was done, the algorithms for listing polygonalizations and for approximately counting them came for free. The lower bound and the upper bound on the number of polygonalizations have the same form as each other, so they give an accurate approximation without any more effort. And the bounds on the number of polygonalizations and on the number of surrounding polygons have the same form as each other, so the analysis of the algorithm for surrounding polygons (that it takes input-polynomial time per polygon) also shows that it generates all polygonalizations in output-polynomial time.

The “non-crossing Hamiltonian paths” of the new paper’s title are the same thing, but easier. The easier-to-generate structures are non-crossing paths, which you can form into a forest (rooted at the one-vertex paths) by a parent operation that removes the final edge of a path. And points in convex position still have many paths; the only point sets that have a small number of non-crossing Hamiltonian paths (or non-crossing paths) are the ones with most of the points on a single line.

(Discuss on Mastodon)