The length of a 2048 game

Presumably many readers have seen and played the 2048 game, in which one slides tiles in different directions across a grid, causing certain pairs of tiles to combine with each other when their sum is correct (in this case, when the sum is a power of two):

Screenshot of the 2048 game

Or if you get bored with it there are many other variations, some of which have different combining rules, for instance there's one where tiles only combine when their sum is a Fibonacci number. I have the impression this all started with another of these games, threes, which you would think allows tiles to combine when their sum is a power of three or twice a power of three (the ternary digits), but actually uses a different set, the numbers 1 or 2, and the numbers that are three times a power of two.

At some point I switched from 2048 to 987 (the Fibonacci one) and noticed that the games were lasting longer. Or alternatively, I could reach higher tile values in 987 more easily despite the fact that it takes twice as long (the 987 game gives you 1's for each new tile, instead of 2's and 4's for 2048). Why is that?

It's certainly not my skill or strategy; I think I'm a worse player at 987 than at 2048. I think there are actually two reasons why the games last longer. One is that the Fibonacci combining rule makes it less critical that the tiles be ordered a certain way, because if they're a little bit out of order they can still combine with each other. But the more important reason has to do with the mathematical limits on the length of a game.

In 2048, generalized to arbitrary board sizes, the maximum tile you could possibly reach on a board with \( n \) cells is approximately \( 2^n. \) That's because, to reach a tile with value \( 2^{n+1}, \) you'd first have to get to a board whose total tile value was \( 2^{n+1}-2 \) (or maybe if you get lucky \( 2^{n+1}-4 \)) and those numbers can only be expressed as a sum of powers of two by using many tiles (the number of nonzeros in their binary representation). In 987, there are also total tile values that require many tiles to express as a sum of Fibonacci numbers, but they're more widely spaced: they are formed by taking every other Fibonacci number (\( 2, 5, 13, 34, 89, \) etc) and subtracting one to get \( 1, 4, 12, 33, 88, \) etc. These numbers grow roughly as \( 2.618^n, \) so that's how big a tile value you could hope to get on a generalized 987 board. Another way of thinking about this is that we are counting the number of nonzeros in the Zeckendorf representation of the total tile value. The Zeckendorf representation is a number system that is based on Fibonacci numbers, which grow more slowly than powers of two, but only half of the bits of a number can be nonzero, so it takes higher numbers to get a lot of nonzero bits.

This made me wonder: what about variants of 2048/987/etc that allow pairs of tiles to combine when the sum belongs to some other sequence of numbers, rather than powers of two of Fibonacci numbers? The sequence that Threes should have used, the ternary digits, would allow you to reach even larger tile values: the numbers that are hardest to express as a sum of ternary digits are of the form \( (3^n-1)/2, \) so again they grow exponentially but with a bigger base. Or what about if you allow tiles to combine whenever they are equal or a factor of two apart, so that the allowed tile values are the 3-smooth numbers? Then, the sequence of hard-to-express totals grows much more quickly, as \( 1, 5, 23, 431, 18431, 3448733, 1441896119 \) (OEIS A018899). These numbers look to me to be growing exponentially in \( n^2 \) but I don't know why that should be.

The primes don't make a good sequence of allowed tile values for games like this, because of parity issues (two odd primes can't add up to another prime). But what about the semiprimes? According to Chen's theorem, every sufficiently large number can be decomposed into a sum of two primes or semiprimes, so there aren't any totals that could only be expressed using many tiles. Does this mean the game can go on forever?

No! The semiprimes still have density zero in the set of all integers: the fraction of integers up to some number \( N \) that are semiprime goes to zero in the limit as \( N \) goes to infinity. And it turns out that every set of allowed tile values that allows the game to reach arbitrarily large tiles on a board of finitely many cells must have positive density. This is true even if you allow a more relaxed set of moves that ignores the board geometry and lets you combine arbitrary pairs of tiles no matter where they are on the board. The proof is by induction on the number of board cells. If a set \( S \) of allowable tile values lets the game reach arbitrarily large tile values on \( n-1 \) cells, then the induction hypothesis shows that \( S \) must have positive density. Otherwise, let \( M \) be the maximum tile value reachable on \( n-1 \) cells. Suppose that an \( n \)-cell board eventually reaches a position that includes a tile of value greater than \( M \), and mark all the nonzero tiles at that point in time. Then the remaining game can be partitioned into moves on unmarked tiles, forming a "small game" on a board of at most \( n-1 \) cells, together with moves that either merge two of the marked tiles or add a value from the small game to one of the marked tiles. (This removes a tile from the small game but every position reachable in this way could also be reachable in a small game without tile removal.) Only a finite number of merges are possible, so to reach arbitrarily large values it must be possible to add small-game values to marked tiles arbitrarily many times. But this means that the set S must have density at least \( 1/M, \) for otherwise there would be infinitely many gaps of length larger than \( M, \) too wide to cross by adding a small game value and too many to cross by merges.

Therefore, the sequence of largest reachable tile values for the semiprime game on an \( n \)-cell board is a bona fide infinite sequence: there is no board that lets you play forever. What is that sequence and how quickly does it grow? I have no idea, but presumably much more quickly than exponential in \( n^2 \).

Comments:

None:
2014-11-09T23:23:06Z

Regarding the sequence OEIS A018899, I think it grows at most like \( n^{O(n)} \), which is smaller than exponential in \( n^2 \). To see why, first it's easy to see that the number of 3-smooth integers less than or equal to \( M \) is about \( (\log M)^2. \) So the sum of \( n \) such integers has at most around \( (\log M)^{2n} \) possibilities. But that number of possibilities is less than \( M \) provided that say \( M = n^{4n} \) (and \( n \) is sufficiently large). So some positive integer less than or equal to \( M \) can't be written as the sum of \( n \) 3-smooth numbers.

-- Ravi Boppana

11011110:
2014-11-10T00:07:37Z

That makes more sense, thanks. Of course, it's only an upper bound; we still don't have a lower bound showing that all sufficiently smaller numbers can be written as a sum in this way.

None:
2014-11-10T02:10:03Z

Right, my argument above was only an upper bound. Reading the reference in the OEIS entry, it looks like \( n^{\Omega(n)} \) is also a lower bound. Equivalently, every integer from \( 1 \) to \( M \) can be written as the sum of \( O(\log M/ \log \log M) \) 3-smooth numbers. The idea is that there is a 3-smooth integer just a little less than \( M \). Namely, a result of Tijdeman says that there is a 3-smooth number in between \( M - M/(\log M)^c \) and \( M \) (for some positive constant \( c \)). We can then iterate on the difference between \( M \) and this 3-smooth number. It's basically a greedy algorithm.
--Ravi