Monday, October 28, 2013

Calculating Integer Partitions

This post can serve as a nice interview question. Because some problems can be deceptively simple. Knowing the formula MAY NOT be enough to write an algorithm to compute it. Allow me to explain using the example of Integer Partitions. If you want the explanation on how this formula is derived, you can read it here.

To recap, Integer Partitions of a number, are ways the number can be expressed as sum of smaller numbers. And, P(N), the number of integer partitions of a positive integer N can be expressed as
P(N) = PP(N,1) = PP(N-1, 1) + PP(N, 2)
Where PP(N,k) is the number of partitions in which the smallest addend is at least k. And
PP(N,k) = 0 if N < k
PP(N,k) = 1 if N = k
otherwise
PP(N,k) = PP(N-k,k) + PP(N,k+1)

Here is how it can serve as an interview question to test the programming skills. The “natural” recursive solution can be easily implemented as follows.

public static int recursivePP (int N, int k) {
    if ( N <= 0 ) return 0 ;
    if ( N < k ) return 0 ;
    if ( N == k ) return 1 ;
    return recursivePP(N-k, k) + recursivePP(N, k+1) ;
}

This is grossly inefficient. Because intermediate results are unnecessarily computed multiple times. Every time the recursion reaches a particular pair (a,b) it will issue another recursion and calculate it all over again. So the code is natural, but not smart. It would take forever to compute say P(100) this way. What’s needed is a solution based on dynamic programming that builds P(i) as i goes from 1 to N. To do that we need to store the table of values for P(N,k).

Consider the following code.
public static long[][] PPvalues = new long[100][100] ;

public static long iterativePP (int N) {

    if ( N <= 0 ) return 0 ;

    // init P(N,k) for N=1, k=1
    PPvalues[1][1] = 1 ;

    for (int i = 1 ; i <= N ; i++) {
       
    // mark PP(N,k) where N == k
    PPvalues[i][i] = 1 ;

    for (int j = i-1 ; j > 0 ; j--) {

        long pp1 = ((i-j) < j) ? 0 : PPvalues[i-j][j] ;
        long pp2 = (i < (j+1)) ? 0 : PPvalues[i][j+1] ;
        PPvalues[i][j] = pp1 + pp2 ;

    } //
    } //

    return PPvalues[N][1] ;
}

Now this performs incredibly faster than the recursive solution. O(N^2) is not ideal, but it’s not bad either. This answer should usually be enough for an interview question.

Unfortunately, it’s not enough for finding out the value of P(N) for large values of N. The reason is the memory complexity of the algorithm which is also O(N^2), and our computers have a very finite memory.  For example, even with 8G of RAM I could not calculate the value of P(10000) because there is simply not enough memory to allocate the table.

That was a real bummer. There is no quick and easy way to optimize this to take less memory. Using a better data structure to store only half the table, as P(N,k) = 0 for N < k, of course won’t help much, only by a factor of 2.

One possibility is to compute only a section of the table at any given time. Note that the formula requires only the current column and the previous column, to calculate the value of any cell. You can start from last 2 (rightmost) columns, and keep computing the table leftwards. So you have to store only 2 columns (current and previous), thus significantly reducing memory. That’s a good optimization. The resulting code will be too complex for an interview question though.

Fortunately, there is a better formula for coming up with an algorithm to calculate integer partitions. It is based on pentagonal numbers. But that’s a math question, not a programming question. I will explain that in another blog post.

No comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...