I recently asked my data structures class the following question: suppose you fill in an \( n \)-cell array by a random permutation. What is the probability that the result is a valid binary min-heap?

One way to solve this would be to count heaps, then divide by \( n! \). For instance, for \( n=5 \), there are eight heaps: there are four choices for the right child of the root, and two choices for how to order the two grandchildren, so the probability is \( 8/120 = 1/15 \).

The eight possible binary min-heaps on five elements

Alternatively, you can compute the probability a different way. Each node in the tree must have the minimum value among all its descendants. For a random permutation, the probability that it does is one over the number of descendants (counting the node as a descendant of itself). And the events that two nodes are minimum among their descendants turn out to be independent, so we can just multiply these probabilities together. A five-element heap has nodes whose numbers of descendants are 5, 3, 1, 1, 1, so the probability of getting a heap is one over the product of these numbers, \( 1/15 \). This calculation explains why the probability is a unit fraction, something that seems a bit mysterious when you calculate it the other way.

But now we have a nice sequence of integers, the inverses of these probabilities for different choices of \( n \): 1, 2, 3, 8, 15, 36, 63, etc. And when we have a nice sequence of integers, we'd like them to count things. What do these numbers count? One possible answer is that they count a different set of labelings on the same complete binary trees: labelings for which, at each node, all left descendants are smaller than all right descendants. There are 15 such labelings, determined by the choice of any label at the root and any of the three remaining smallest values at its left child:

The fifteen labeled binary trees on five elements in which, at each node, all left descendants have labels smaller than all right descendants

More generally, for any \( n \)-node tree \( T \), say that a labeling of the nodes of \( T \) by the numbers from 1 to \( n \) is \( T \)-heap-ordered if, for any ancestor-descendant pair \( (x,y) \), we have that \( x\lt y \). And say that a labeling is \( T \)-almost-sorted if, for any pair \( (x,y) \) that are not in an ancestor-descendant relation, with \( x \) to the left of \( y \), we have that \( x\lt y \). Then the product of the number of \( T \)-heap-ordered labelings and the number of \( T \)-almost-sorted labelings always equals \( n! \).

We can take this one step farther, from trees to permutations. For any permutation \( \pi \) of the numbers from \( 1 \) to \( n \), consider the two-dimensional set of points of the form \( (i,\pi(i)) \), and consider the labelings of these points by the numbers from \( 1 \) to \( n \). Define a labeling to be \( \pi \)-upward if, for every two points oriented from southwest to northeast, the southwest point has a smaller label than the northeast point. And define a labeling to be \( \pi \)-downward if, for every two points oriented from northwest to southeast, the northwest point has a smaller label than the northeast point. Do the numbers of \( \pi \)-upward and \( \pi \)-downward labelings have a nice product formula? Not always. For instance, the permutaton 2413 has five \( \pi \)-upward labelings (below) and symmetrically five \( \pi \)-downward labelings. The product of these counts, 25, differs from \( n! = 24 \).

The five upward labelings of the permutation 2413

However in some sense this is the only possible counterexample. For, when \( \pi \) is a separable permutation (a permutation that avoids both 2413 and its mirror image 3142 as patterns) then the numbers of \( \pi \)-upward and \( \pi \)-downward labelings always multiply to \( n! \).

Proof sketch: in this case the dot pattern for \( \pi \) can be broken into the dot patterns for two smaller separable permutations, either southwest-northeast of each other (the direct sum of the two smaller permutations) or northwest-southeast (the skew sum). Suppose the sizes of the smaller permutations are \( k \) and \( n-k \), and suppose they're combined in a direct sum. Then by induction the product of the numbers of labelings in the two smaller dot patterns is \( k! \) and \( (n-k)! \). Two \( \pi \)-upward labelings on the two smaller dot patterns can only be combined in one way to make a \( \pi \)-upward labeling on the whole pattern: every label in the southwest pattern has to be less than every label in the northeast pattern. However, two \( \pi \)-downward labelings on the two smaller dot patterns can be combined in many ways to make a \( \pi \)-downward labeling on the whole dot pattern: we can partition the \( \pi \) labels among the two patterns arbitrarily, and there are \( \tbinom{n}{k} \) partitions to choose among. So the number of pairs of an upward and downward label on the whole pattern is the product of the numbers of pairs labelings on the two smaller patterns with the number of ways we can make this combination:

\[ k! \times (n-k)! \times \binom{n}{k} = n!. \]

The case of a skew sum follows by a symmetric argument.

More strongly, this argument shows that any labeling of \( \pi \) can be generated uniquely from the trivial labeling (the one that labels each point by its \( x \)-coordinate) by a sequence of riffle shuffles, working bottom up in the recursive structure of \( \pi \) as a skew sum or direct sum of smaller permutations. At each skew sum or direct sum in this decomposition, combining smaller permutations of sizes \( a \) and \( b \), we choose one of the \( \tbinom{a+b}{a} \) possible riffles and apply it to the labels on the dot patterns of these two smaller permutations. If we do all of the riffles, we get an arbitrary labeling. But if we do only the riffles at skew sums, we get a \( \pi \)-upward labeling, and if we do only the riffles at direct sums, we get a \( \pi \)-downward labeling. So this shows that an arbitrary labeling of \( \pi \) can be decomposed into a unique pair of an upward and downward labeling.

For every tree \( T \), there is a drawing of \( T \) as the dot pattern of a separable permutation \( \pi \) such that the \( \pi \)-upward labelings are the same as the \( T \)-heap-ordered labelings and the \( \pi \)-downward labelings are the same as the \( T \)-almost-sorted labelings. One can construct \( \pi \) as the direct sum of the root of \( T \) (represented by a one-element permutation) with the skew sum of the subtrees of \( T \) (represented recursively). For instance, the complete binary tree used for 5-element binary heaps is represented in this way by the permutation 13542:

A binary tree and its representation as the dot pattern of a separable permutation

So, the product formula for labelings of trees is a special case of the product formula for separable permutations.