Recursive bijective numbering

Ordinary radix systems for representing numbers, like binary or decimal, use fixed finite sets of digits. The same is true for bijective numeration, used for columns of spreadsheets. In bijective numbering, every string of digits represents a unique number, shorter strings represent smaller numbers, and strings of equal length have the usual order. In spreadsheets, the digits are letters A–Z, one-letter strings represent numbers \(1\) through \(26\), two-letter strings represent \(27\) through \(26+26^2=702\), etc. Instead, mixed radix systems like the factorial number system have position-varying digit sets. In factorial numbering, the \(i\)th digit can range from \(0\) to \(i-1\), with place value \((i-1)!\). The value of a digit string is the sum of digits times place values. One could combine these ideas and get a bijective factorial number system, in which strings of at most \(i\) digits can represent numbers up to the sum of the first \(i\) factorials. But in all of these mixed radix systems, the digits are not drawn from a finite set. Maybe we need another numbering system just to describe each digit of a mixed-radix number? This leads to the idea of a recursive numbering system, where the digits of a mixed-radix system are themselves numbers in the same system.

An example might clarify this idea. Let’s start with a recursive numbering system based on binary. Like other mixed-radix systems, we’ll number the digit positions starting with \(i=0\) at the right. Each position has a place value, the product of the numbers of choices for earlier digits, and the value of a sequence of digits is the sum of digits times place value. We’ll start with two choices of digit in position \(0\), like binary, and in later positions we’ll use the values we’ve already constructed as allowable digits.

In position \(i=0\), the two available digits given in the base case are \(0\) and \(1\), allowing us to represent only the numbers \(0\) and \(1\) as one-digit strings.
In position \(i=1\), there are still only two digits available, the two numbers \(0\) and \(1\) that have shorter representations. In this position, they have place value \(2\), allowing any number from \(0\) to \(3\) to be represented as a two-digit string.
In position \(i=2\), we now have four digits available, the numbers from \(0\) to \(3\). These have place value \(4\), allowing any number from \(0\) to \(15\) to be represented with three digits.

At each successive level of this structure, the number of available digits and place values will be equal. Each new digit squares the range of representable numbers. In any position \(i>0\), we will have \(2^{2^{i-1}}\) digits available, with place value \(2^{2^{i-1}}\), allowing any number from \(0\) to \(2^{2^{i}}-1\) to be represented. In some sense, though, we haven’t done anything very interesting yet. Like binary, we just get powers of two. The recursive-binary representation of a number can be obtained in a trivial way from its binary representation, by partitioning the bits into groups of 1, 1, 2, 4, 8, … bits.

Instead, a bijective numbering system leads to more interesting results when handled recursively. For recursive bijective numbering, we can start with a much simpler base case, the empty string, which can be interpreted in any bijective numbering system as representing the number \(0\). With this base case, we can define a bijective numbering system in which the available digits in position \(i\) (starting from \(i=0\) in the rightmost digit) are the numbers that can be represented by strings of length at most \(i\):

In position \(i=0\), there is only one available digit, the number \(0\) (already represented by the empty string), producing only one single-digit value, representing the number \(1\).
In position \(i=1\), there are two available digits, the numbers \(0\) and \(1\) that we have already represented, so there are two two-digit sequences possible, representing two more numbers \(2\) and \(3\).
In position \(i=2\), there are four available digits, providing eight three-digit sequences that can represent the numbers \(4\) through \(11\).

If we let \(n_i\) denote the number of values that can be obtained by strings of exactly \(i\) digits, then we obtain a recurrence \(n_0=1\) (the zero-length strings represent only one value, zero) and \(n_i=n_{i-1}\sum_{j<i}n_j\) (there are \(n_{i-1}\) choices for the string up to its last digit, and \(\sum_{j<i}n_j\) choices for the last digit). This recurrence has the sequence of values 1, 1, 2, 8, 96, 10368, 108615168, … and its partial sums 1, 2, 4, 12, 108, 10476, 108625644, … give the total number of values that can be represented by strings of up to \(i\) digits.

The same idea can be represented graphically. Given a number represented by this recursive bijective numbering system, draw an ordered tree, where an \(i\)-digit number is represented by a tree whose root has \(i\) children, one for each digit, and where each digit is represented in the same way by a subtree. Conversely, if we have an ordered tree, we can think of it as representing a recursive bijective number, whose digits are computed recursively from the children of the root. In our recursive bijective numbers, we required that each digit be representable as a string of length at most its position; translating this requirement into tree terminology, each node that is the \(i\)th child of its parent (again starting with \(i=0\) on the right) must have degree at most \(i\).

Equivalence between recursive bijective numbering and tree enumeration

It follows that the sequences of numbers of representable values for \(i\)-digit strings and for \(\le i\)-digit strings, above, also give the numbers of ordered trees that obey this degree constraint, with root degree \(i\) or \(\le i\), respectively.

(Discuss on Mastodon)