# Permutation supersequences and shortest paths

You may have heard of permutation superpatterns and superpermutations, but have you heard of permutation supersequences? They turn out to be closely related to how quickly we can find shortest paths in graphs with negative edge weights using the Bellman–Ford algorithm.

Superpatterns, superpermutations, and supersequences are all sequences of symbols that somehow contain each permutation of their symbols. But they use different definitions of “contain”, causing them to have different lengths. For supersequences, the goal is to have every permutation as a subsequence: its symbols must appear in the same order, but can be widely spaced. One way to create a supersequence of \(n\) symbols, but not the best way, is to repeat all of them \(n\) times: 1234123412341234. You can do a little better by alternating ascending and descending order, saving one symbol per repetition: 1234321234321. But this is still not optimal. According to Oliver Tan in “Skip letters for short supersequence of all permutations” (*Disc. Math.* 2022, doi:10.1016/j.disc.2022.113070), calculating the exact length of the shortest supersequence of \(n\) symbols, as a function of \(n\), is still an open problem. Tan finds sequences of length \(n^2-\tfrac52n+O(n^{2/3})\), and cites a bound of Kleitman and Kwiatkowski showing that they must have length \(n^2-O(n^{7/4})\). So we have pretty accurate bounds on how short they can be: near \(n^2\).

Now let’s continue to look at sequences and subsequences, but change up what they have to contain. Any permutation has \(n-1\) consecutive pairs of symbols. I want to find a sequence of pairs that for each permutation contains its \(n-1\) pairs, in their correct order. For instance, a four-symbol pair sequence should contain the three pairs 12, 23, and 34, in that order, coming from the permutation 1234, and it should also contain the other 23 triples of pairs coming from the other 23 permutations of the four symbols.

Why? Because that’s what some versions of the Bellman–Ford algorithm do! Suppose we want to find shortest paths from some starting vertex \(v_0\) to all other vertices in a complete directed graph. We can initialize tentative distances \(D[i]\) for all other vertices \(v_i\), \(i>0\) to be the lengths of the single edges from \(v_0\) to those vertices, and then repeatedly *relax* a pair \((i,j)\) by setting \(D[j]=\min\{D[j],D[i]+\operatorname{len}(v_i,v_j)\}\). If we relax all of the consecutive pairs of vertices along a shortest path, in order, this will cause the tentative distance for the endpoint of the path to equal its correct shortest path distance.

So one way to get a correct shortest path algorithm is simply to choose a supersequence of the consecutive pairs of all permutations, and update the tentative distances in this order. You might notice that such an algorithm doesn’t pay any attention to the results of its calculation; it just performs its pairwise calculations for the entire supersequence, in a set order. I’ll call an algorithm like this, that chooses its update steps ahead of time instead of taking account of what has happened so far, *non-adaptive*. As a sequential algorithm, it would be better to be a little smarter, and use past information to guide future steps, but non-adaptive versions of Bellman–Ford turn out to be widely used in contexts where there is no centralized control, such as distance vector routing on the internet.

So to figure out how efficient Bellman–Ford can be, we need to figure out how short these pair supersequences can be. Frustratingly, I don’t know very tight bounds for this problem. I can at least find out the exponent of the leading term in the length of these sequences (it’s cubic), but its constant factor eludes me.

As an upper bound, we can look at what various versions of the Bellman–Ford algorithm actually do. The simplest choice is just to repeat all pairs of indexes, \(n-1\) times: for four symbols (graphs with five vertices, but not counting the start), repeat 12–13–14–21–23–24–31–32–34–41–42–43 three times. For \(n\) symbols, this would require a total of \(\bigl(1-o(1)\bigr)n^3\) pairs. A technique of Yen improves this by listing the ascending pairs in ascending order and then the descending pairs in descending order: 12–13–14–23–24–34–43–42–32–41–31–21. Such a sequence can cover the pairs from any run of ascending symbols followed by any run of descending symbols: for instance, it would by itself cover the pairs from the permutations 1234, 1243, 1342, 1432, 2341, 2431, 3421, and 4321, which each consist of one ascending and one descending run. In general, you only need to repeat this ascending-descending pair sequence \(\lceil n/2\rceil\) times to cover all ascending and descending runs, giving pair supersequences of length \(\bigl(\tfrac12-o(1)\bigr)n^3\). And if you’re willing to tolerate a randomized algorithm that is correct only with high probability instead of with certainty, a paper of mine with Michael Bannister from 2012 randomly permutes the symbols before using Yen’s ascending-descending method to achieve length \(\bigl(\tfrac13+o(1)\bigr)n^3\).

My newest preprint, “Lower bounds for non-adaptive shortest path relaxation” (arXiv:2305.09230, to appear at WADS), as the title suggests, looks at the lower bound side of the same problem. It shows that deterministically chosen sequences of pairs must have length at least \(\bigl(\tfrac16-o(1)\bigr)n^3\) and that random sequences (correct with high probability) must have length at least \(\bigl(\tfrac1{12}-o(1)\bigr)n^3\). It also contains analogous bounds for graphs that are not complete. However, these lower bounds are still far from the upper bounds. They are not even close enough to tell me whether randomized Bellman–Ford is really better than deterministic Bellman–Ford (in this non-adaptive setting) or whether a better deterministic pair sequence would beat the random one from my 2012 paper.