Bit tricks with cuckoo filters

I was pleased to discover that the proceedings of SIGMOD and PODS have been made open-access through their conference websites and the ACM “authorize” feature. It’s not as good as making the official publisher page for the papers open access — you can’t link to individual papers directly this way — but still pretty good.

One of the PODS papers is mine, my first at PODS. Go to the PODS link above and look for the link to “2-3 Cuckoo Filters for Faster Triangle Listing and Set Intersection” (with Mike Goodrich, Michael Mitzenmacher, and Manny Torres). We intend to make an arXiv version eventually, but as a merger with some other stuff including a related brief announcement at SPAA with Goodrich, “Using Multi-Level Parallelism and 2-3 Cuckoo Filters for Faster Set Intersection Queries and Sparse Boolean Matrix Multiplication”, so for now only the PODS version is available.

As the title(s) imply, this is another application of cuckoo filters, the subject of my paper from SWAT last year. As you may already know, a cuckoo hash table has a constant number of locations for each of its key-value pairs, and moves keys from one location to another if necessary to make room for new keys. In one particularly space-efficient version of this, each key has exactly two locations it might be found at, but each location can hold more than one key (but a constant number of keys). The more locations per key there are, the closer to 100% full the data structure can be made. A cuckoo filter is the same, but stores short fingerprints of each key instead of the whole key or a key-value pair. It provides an approximate set membership data structure (like a Bloom filter), where you can test membership by checking whether the fingerprint for a key is in one of the locations where it should be. No-answers are always correct, but a yes-answer might be incorrect (some other key may have left the same fingerprint in the same place) with a small and controllable false positive rate.

The main insight in the new paper is that, when the fingerprints are small enough, you can pack several of them into a binary word, and this packing allows you to compare two cuckoo filters and find the positions where they have the same fingerprint in the same place. With the two-location cuckoo filters, this wouldn’t give any useful information (the same key might be stored in both filters, but in different locations, so you wouldn’t notice the overlap). But a variant of cuckoo hashing called 2-3 cuckoo hashing, where each key has three possible locations and is stored in two of them, can be used to ensure that the two cuckoo filters we’re comparing do always have overlapping locations for each shared key. (This variant for cuckoo hashing, not filtering, was introduced as the “batmap” by Amossen and Pagh, “A new data layout for set intersection on GPUs”, IPDPS 2011.)

Based on this idea, we can find the intersection of two sets of cardinality \(d,\) represented as cuckoo hash tables with corresponding cuckoo hash filters, on a machine of word size \(w\) and with intersection size \(k,\) in time \(O(k+(d\log w)/w),\) giving a bit-parallel speedup of \((\log w)/w.\) The same speedup applies to other related problems such as finding or listing triangles in \(d\)-degenerate graphs. The idea is to use bit-parallel programming techniques to find matching fingerprints in the cuckoo filter, and then to check those positions for whether they are an actual match using the cuckoo hash table. The \(\log w\) factor in the runtime comes from choosing the fingerprint size appropriately, to make the false positive rate low enough so that the contribution to the runtime from checking false positives is dominated by the other parts of the algorithm. This result shaves a \(\log w\) time factor over a previous bit-parallel set intersection algorithm of Kopelowitz, Petie, and Porat, “Dynamic set intersection”, WADS 2015, which used other ideas (bit-parallel sorting of lists of fingerprints rather than cuckoo filters). Since \(w\) is itself likely to be logarithmic in the input size, really this is shaving a \(\log\log\) factor from the running time. Nevertheless, we also ran some experiments showing that this technique can lead to practical improvements on triangle-finding for some (but not all) graphs.

After this paper was published, I heard from Mike Rosulek at Oregon State (where I was visiting last week) of a paper extending the 2-3 cuckoo hashing idea in a different but related direction, for private set intersection. I think the reference is “Linear size circuit-based PSI via two-dimensional cuckoo hashing” by Benny Pinkas, Thomas Schneider, Christian Weinert, and Udi Wieder, a not-yet-published manuscript that Pinkas described in a talk last April at TPMPC 2017. The idea is to store each fingerprint or key in two out of four locations, instead of two out of three. Suppose set \(X\) stores its fingerprints either in north and east or in south and west, while set \(Y\) stores its fingerprints either in north and west or in south and east. Then, when \(X\) and \(Y\) have a key in common, it is guaranteed to appear exactly once in a shared location, whereas 2-3 cuckoo hashing would allow a common key to appear either once or twice in shared locations. The fact that each key in the intersection of the two sets is found only once in the intersection algorithm helps simplify the cryptographic parts of the private set intersection.

(Google+ posting and discussion thread)