Needles in haystacks: shared factors among many large composite numbers

The recent discovery that bad key-generation procedures may have caused thousands of RSA keys to share prime factors has led me to wonder: how would one go about finding these common factors? If one assumes there are \( n \) \( b \)-bit keys, and that computing the product or gcd of two \( b \)-bit numbers takes time within low-order factors of \( O(b), \) then testing all pairs of keys individually takes time \( O(n^2b), \) but that seems like a lot if \( n \) is big (and in this test it was big, around 6 million). Instead, multiplying groups of keys together and then taking the gcds of the product should allow you to test many pairs at once. What is the optimal strategy?

At first I thought that this would be like some sort of group testing problem, in which one wants to choose the groups of keys to multiply together carefully in order to ensure that any common factor is isolated into a single test. Because if you multiply some keys together into groups, take their gcd, and it has more than one prime factor, you haven't found the faulty keys or their factors, right?

Wrong. The problem is actually much simpler than group testing, because if you find a nontrivial gcd \( G, \) it doesn't matter that it's a product of several shared primes: you can still go back and test it against the keys you started with to see which ones of them have a factor in common with G.

In more detail: perform \( log_2 n \) tests; in test \( i, \) multiply together on one side the keys whose index has a zero in the \( i \)th bit, and multiply together on the other side the keys whose index has a one in that bit. Take the gcd of the two sides. If there are \( k \) keys with shared primes, this will be a number \( G_i \) with \( O(kb) \) bits. Every prime factor of \( G_i \) is shared by at least two keys, and every shared prime factor will show up in at least one of these numbers \( G_i. \)

Now place the keys at the leaves of a balanced binary tree, compute products of descendants at all intermediate nodes of the tree, and trace from the root of the tree downwards finding all the nodes that have a nontrivial gcd with \( G_i. \) At each step downwards through the tree, replace \( G_i \) with its gcd with the product stored at the tree node, and use that replacement value in place of \( G_i \) at the two child nodes, in order to ensure that the computation at each node is only proportional to the number of bits in the product stored at that node. Ignoring low order factors like \( \log n, \) the total time is \( O(nb), \) because that's how long it takes to compute \( G_i, \) and that's how much time it takes per level of the binary tree to compute the gcd's of \( G_i \) with the products stored within the tree.

So, it's a bit more complicated than the naive all-pairs test, but it saves a factor of \( n. \) Of course, in practice you'd have to be using fast integer multiplication algorithms (e.g. based on FFT) to make this worthwhile. And even then the fact that the naive test uses very little memory and the faster algorithm uses a lot (and therefore has to go outside the cache) might kill you...