Tuesday, May 1, 2012

Generating Prime Numbers Using Sieve of Eratosthenes

Generating a prime is occasionally needed in solving coding puzzles. It's a pretty straightforward task, whether you stick with a naive brute force solution or with "Sieve of Eratosthenes". I am going to use this one later to show a solution for a more involved problem.

The "Sieve of Eratosthenes" is a very old method of generating the primes, named after an ancient Greek Mathematician who invented it. It's intuitive to understand, and efficient.

You start with number 2, and mark all numbers up to N that are divisible by 2 as non-prime, by just hopping forward 2 places, till you reach N. Then you start with 3, and mark every third number as non-prime. Next number 4 is already marked as non-prime, so ignore it. Then start with 5, and mark every 5th number as non-prime. And so on.

A simple observation helps most algorithms related to checking/generating prime numbers.
For finding a factor of any number N, you only have to check using numbers that are less than the square root of N. Because if N = x * y, and x > sqrt(N), then it must be true that y < sqrt(N).
You can use this observations is 2 ways.
1. When you are considering the factors to cross out the composites, you need to consider factors only till sqrt(N).
2. When you start crossing out, you can start crossing out numbers from factor^2. Anything composite below that would be a smaller multiple of factor, and would have been crossed out.
Here is the Java code :

// consider all numbers as prime to begin with
// init array of N flags to false
boolean is_non_prime[] = new boolean [N+1] ;
// NOTE : first index is 0

is_non_prime [0] = is_non_prime [1] = true ;
// 0 and 1 are not considered primes

for (int factor = 2 ; factor*factor <= N ; factor ++) {
    // no need to find composites of a non-prime number
    if ( is_non_prime[factor] )
        continue ;

    for ( int composite = factor * factor ;
              composite <= N ; composite += factor ) {
        is_non_prime [composite] = true ;
    } //
} //

After this, the array is_non_prime[k] = false for any number K that is prime.

Of course, this needs memory of N bits for storing the flags, which is not an issue in most applications.

Finding the computational complexity is a bit tricky. There are 2 loops, but the complexity is not O(N^2). The outer loop is executed K times, where K=sqrt(N). The inner loop executed only for primes, which is significantly less than K for large values of K.

Note that for factor = 2, there are N/2 operations because N/2 numbers will be crossed out. Then for factor = 3, there will be N/3 operations. Then N/5 operations and so on. So
Total operations = N/2 + N/3 + N/5 + N/7 + N/11 + N/13 + ...
  = N * ( 1/2 + 1/3 + 1/5 + 1/7 + 1/11 + 1/13 + ... )
The term in the parenthesis is "prime harmonic series" . It has been shown to have the upper bound of ln ln (N). Where ln is the natural logarithm, and yes, it's a double logarithm : log of log of N.

This means the complexity of our algorithm is O (N * ln ln N ).

The double logarithm make the second term quite small than N. That makes the computation complexity of "Sieve of Eratosthenes" almost linear, which is quite a feat.

2 comments:

  1. Shouldn't this:
    for (int factor = 2 ; factor < Math.sqrt(N) ; factor ++)
    be
    for (int factor = 2 ; factor*factor <= N ; factor ++)

    otherwise for N=25, you will check only upto 4, thus wrongly thinking 25 is a prime number.

    ReplyDelete
    Replies
    1. You are correct. Thanks for catching the mistake, I have updated the code.

      Delete

Related Posts Plugin for WordPress, Blogger...