The Nth Prime Algorithm
A prime page by Andrew Booker
The Largest Known Primes Icon
  In order to find a prime quickly, the nth prime program uses a large stored data table to get close to the right answer first, then finishes with a relatively short computation.  To see how this works, imagine the number line broken into bins, each of size N, i.e. the first is from 0 to N-1, the second from N to 2N-1, etc.  (In my implementation, N equals 19219200.)  In the table we store the number of primes in each bin.  Then, to retrieve the nth prime, the program adds up the numbers until the sum exceeds n.  We then know exactly which bin the nth prime falls into.  The prime can then be found using a sieve algorithm on that bin.

The sieve algorithm works as follows.  First, we compute a set of base primes to be used in sieving.  The base primes consist of all the primes up to the square root of the last number in the bin.  Second, we keep an array (the sieve) which holds a flag byte for each number in the bin.  (Actually, in practice the array is much smaller than the size of a bin, so the bin is first broken into many sub-bins that are sieved one at a time.)  We then "sieve" the bin by crossing out (setting the flag byte for) the multiples of each base prime.  At the end, the numbers that were not crossed out (did not have their flag bytes set) are all the primes in the bin.  These primes are counted until reaching our original number, n.  (For a better description of sieve algorithms, see this entry in Chris Caldwell's Prime Glossary.)

Thus, if the bin size is small enough, the sieving can be done very quickly.  In this way, most of the work in finding a prime was done in computing the data table.  The table, in turn, was computed using a modified sieve algorithm that is well suited to sieving many bins.  The modified algorithm actually sieves many times, once for each residue relatively prime to some number with many small prime factors.  (I used 30030 = 2*3*5*7*11*13.)  So, for example, in the first sieve, all of the numbers have remainder 1 when divided by 30030, so instead of having the flag bytes represent 0,1,2,... as in the standard sieve algorithm, they represent 1,1+30030,1+2*30030,...  This may sound like it creates more work than before, but it turns out to be faster since we only need to consider remainders which are relatively prime to 30030.  In this way, the multiples of 2,3,5,7,11, and 13 are automatically eliminated, making the algorithm roughly 6 times faster.  What's more, the modified algorithm can be easily parallelized, as many computers can each work on separate residues.  The table used by the nth prime program was calculated over 9 hours by a lab of UltraSparcs and SGIs. 

Return to the Nth Prime Page