Project 3: Knapsack

Due: Monday, November 2, 1998

Objective:

The objectives of this project are:

1. to practice thinking about recursion,

2. practice using two-dimensional arrays, and

3. more practice with C++.

Background:

Consider the following scenario for the knapsack problem. A thief breaks into a jewelry store. He brought a knapsack with him to carry his loot, but the knapsack can only hold so much weight. Fortunately, the jeweler has carefully labeled each item in the store with its weight and value. The thief's problem is to select which items to put into his knapsack. He wants to maximize the value of the items he will carry away, but he cannot exceed the capacity of the knapsack.

The knapsack problem can be solved by recursion. Suppose that our knapsack has capacity K and that there are n items numbered 1 through n. The ith item has weight wi and value vi. Now, consider item n. Suppose that wn is less than or equal to K. (If wn is greater than K, then item n does not fit in the knapsack and we know that we must leave it behind.) We can either put item n into the knapsack or we can leave item n behind. How can we tell which choice is better? We should follow the choice that maximizes the value of the items we can pack in the knapsack. This can be determined by answering the following two problems:

1. Suppose we have a knapsack with capacity K-wn. What is the maximum value of the items that we can put into such a knapsack (without exceeding its capacity), if we could only choose from items numbered 1 through n - 1?

2. Suppose we have a knapsack with capacity K. What is the maximum value of the items that we can put into such a knapsack (without exceeding its capacity), if we could only choose from items numbered 1 through n-1?
Ignore, for the moment how we intend to answer qestions 1 and 2. Perhaps, we do it by brute force, it doesn't matter. Just suppose that the answer to the first question is a and the answer to the second question is b. Then, if we put item n into the knapsack, the biggest value we can hope to carry away with us is vn + a. On the other hand, if we leave item n behind, then the most we can have in our knapsack is b. Now, if we simply compared vn + a with b, we would know whether to take item n or leave it behind. Of course, when we solved questions 1 and 2, we have also determined how to pack the knapsack to get values vn + a and b, so we have really solved the entire problem.

So, how do we solve questions 1 and 2? The answer is, of course, "recursively". Questions 1 and 2 are instances of the knapsack problem, so we can use the same process to solve these questions. There are two base cases for this recursion. First, we might have a knapsack with zero capacity. In that case, we know that we cannot put any items in the knapsack. Second, we might have a list of zero items to consider. In this base case, we also know that we cannot put any items in the knapsack.This recursive algorithm can be slow because each call to a recursive function implementing this algorithm could generate two additional recursive calls. However, using a memoization table does speed up the algorithm.

This recursive strategy for the knapsack problem described above is also known as "dynamic programming". This terminology comes from the field of Operations Research. The word "programming" in "dynamic programming" is not used in the computer science sense and has nothing to do with writing code.

Example:

Consider the example where we have a knapsack with capacity 6 and the following 5 items:
```	Item 1: weight =  5 value =  1
Item 2: weight =  3 value =  3
Item 3: weight =  1 value =  2
Item 4: weight =  4 value =  1
Item 5: weight =  3 value =  5
```
Suppose that we take item 5. Since item 5 has weight 3, we have a remaining capacity of 3 in the knapsack. So, what is the best way to pack items 1 - 4 in the remaining capacity? Well, we can't take item 1 or item 4, since they have weight greater than 3. So, the best we can do is to take item 2 which has weight 3 and value 3. Thus, if we take item 5, the best we can do is to pack a knapsack with items 2 and 5 with a total value of 8.

Conversely, suppose that we don't take item 5. Since we did not use any space in the knapsack, it still has capacity 6. The best way to pack items 1-4 in the knapsack is to take items 2 and 3 (try all possibilities). This gives us a total value of 5, if we do not take item 5.

Comparing the two alternatives, we conclude that we should take item 5 since that gives us a more valuable knapsack.

How would a memoization table help? Consider an example where the capacity of the knapsack is 27 and the last 3 items have weights 2, 3 and 1. In one sequence of choices, we might decide to take the item with weight 3 and leave the other two behind. In another sequence of choices we might decide to take the items with weight 1 and 2. In either case, we would want to know the best way to pack the remaining n - 3 items in a knapsack with capacity 24. Using a memoization table guarantees that we would solve this subproblem only once. The next time we could simply look up the answer. Using a memoization table greatly reduces the running time of our recursive function.

Assignment:

Your assignment is to implement the recursive strategy for the knapsack problem described above. Your implementation must include a recursive function. Furthermore, you must implement a memoization table to speed up the running time of your recursive function. For this problem the memoization table is a two dimensional array. This array is indexed by the capacity of the knapsack and the index of the last item we are allowed to consider.

Use the UNIX time command to obtain the running time of your program with and without a memoization table. You should notice a big difference in the running time when the number of items is between 15 and 20.

Your program should take a command line argument n which specifies the number of items in the knapsack problem. For testing purposes, you should generate your test data randomly using the drand48() and srand48() functions. You should set the random seed either using command line input or using the system clock, so each run of your program would work with different data. For this project, use a knapsack with capacity of 0.25 n2. Each item should have a random integer weight between 1 and n. Each item should have a random integer value between 1 and n. At the end of each run, print out the list of items placed in the knapsack and the total value of the items.

Implementation Issues:

1. C/C++ indexes an array with n elements from 0 to n-1. If this is too confusing for you, just declare an array with n+1 elements and don't use the element with index 0.

2. The recursive function that solves the knapsack problem should return an answer that tells us how to pack the knapsack --- i.e., which items to bring and which items to leave behind. The easiest way to do this is to return an integer array with n elements. The ith element of the array is 1 if we should take item i and 0 otherwise. For the recursive calls that only consider items 1 through some index m, it is simpler to have these functions also return an array with n elements. You can find suggested class declaration for this project in the file:

/afs/umbc.edu/users/c/h/chang/pub/cs202/project3/knapsack.h

You may change these declarations if you wish.

3. When you initialize the memoization table, you need to use a loop and allocate memory for each row of the two-dimensional array.

What to turn in:

Turn in all the files that are needed to compile your program, including knapsack.h if you use it. As usual include a README file that tells the grader how to compile your program. A typescript file should be included with a record of your sample runs with timing information.

Extra Credit:

It very tempting to solve the knapsack problem using a "greedy" approach. For example, the thief in the jewelry store may very well decide to take as many diamonds as possible, because diamonds are light and very valuable --- worth more than their weight in gold. One could calculate the value to weight ratio of each item and sort the items according to this ratio. Then, one would take the items in order of decreasing ratio until the knapsack is full.

For 10 extra points, implement this greedy strategy and convince yourself that it does not provide the optimum solution to the knapsack problem. Your submission should include several sample runs comparing the greedy algorithm and the dynamic programming algorithm.

As usual, extra credit is "all or nothing" --- you either get all 10 points or none of it.