Did you ever wonder why different states of the US have the numbers of representatives in congress that they do? It's supposed to be proportional to population but that's not actually true: for instance the ratio of representatives to population is about 40% higher in Montana than California. What formula or algorithm do they use to pick the numbers?

This has varied over the years but, Wikipedia tells me, currently it's the Huntington–Hill method. One way of describing this is by a simple but inefficient algorithm, with the following steps:

  1. Give each state a representative, since they all have to have at least one.

  2. Repeatedly, until there are the desired number of total representatives, prioritize the states by \( \frac{\mathrm{population}}{\sqrt{\mathrm{reps}(\mathrm{reps}+1)}} \) and give one more representative to the state with the biggest priority.

The problem of assigning seats to parties after a parliamentary election is very similar, using votes instead of population, but in that case it's ok for some parties to get zero seats. This causes the formulas that are used for prioritizing the parties to use different divisors, typically linear functions of the number of seats already assigned rather than the square root thing used here. This general type of apportionment method is called a highest averages method.

The question asked by my most recent preprint ("Linear-time Algorithms for Proportional Apportionment", arXiv:1409.2603, with Jack Cheng, to appear at ISAAC 2014) is: how quickly can you assign seats using these methods? Using the procedure described above, with a priority queue to do the prioritization, would take an amount of time slightly superlinear in the number of seats. But it turns out we can do quite a bit better: linear in the number of parties or states getting the seats. Probably this doesn't matter for actual elections, the slow part of which is collecting all the votes. But it might be useful if you want to run a lot of simulated elections, or to use apportionment algorithms for problems where the number of things being apportioned is much larger than the number of congressional representatives.

There's also a nice way of viewing these problems more abstractly: suppose we have \( n \) different infinite arithmetic progressions, and an input parameter \( k. \) How quickly can we find the \( k \)th smallest value in the disjoint union of the progressions? Answer: \( O(n) \) arithmetic operations, independently of \( k. \) For the parliamentary apportionment problem, you get these sequences by turning the priorities upside down, with the linear function of the number of representatives as the numerator and the number of votes as the denominator. For the congressional problem, this gives something that is not exactly an arithmetic progression, but it's close enough to one that the same algorithms work with only minor modification.

Incidentally, there's a footnote on p. 3 of the preprint about two seemingly very relevant references, in Japanese, whose titles claim that they give linear time algorithms for related problems. Unfortunately despite attempts to contact both the authors of these references and the reviewer who used them as a reason to downvote our paper, we have been unable to obtain them nor even to verify that they actually exist, let alone to determine which variable their time is linear in and which apportionment methods they apply to. So we don't actually know whether our algorithms or results are really new. If anyone reading this has better access to these sources, we'd appreciate any help you could give us. ETA: I now have a copy of the IEICE Trans. D one, but haven't yet examined it. Apparently part of the difficulty is that there are two different IEICE Trans. D.'s.





Comments:

chaoxu:
2014-09-18T02:24:49Z

For theorem 2, if you go for a slightly weaker version, where you can find the rank \( k \) element in \( A \) in constant time for each increasing sequence \( A, \) it seems you can reduce it to Fredrickson and Johnson's selection in matrices with sorted columns. This would save a page of appendix.

A coarse solution \( x \) smaller than the \( k \)th ranked element allows you to remove all elements smaller than \( x. \) Also, there is no point of keep more than \( O(n) \) element after the coarse solution \( x \) in each arithmetic sequence. So the problem reduces to search for \( (k-\mathrm{rank}(x) \))th element in \( n\times O(n) \) implicit matrix, with the columns sorted. Because \( (k-\mathrm{rank}(a))=O(n), \) the algorithm takes \( O(n) \) time using Fredrickson and Johnson's algorithm.