CS 337: Lab 13 -- the last lab

Greedy Bin Packing

In this lab you will experiment with several algorithms for "bin packing". Informally, the bin packing problem is a follows: you are given a set of packages each with some weight Wi. You are also given a set of bins each of which have a maximum capacity Cj. The task is to assign the packages to bins so that the number of bins used is minimized. In the usual formulation, and the one you will be using in this lab, the capacity of all of the bins is identical (ie. Cj=Cj-1 for all j>=1).

Finding the optimal solution to this problem is NP complete. That is the optimal solution can only be found by exhaustively trying every possible combination of assignments of packages to bins. Fortunately there are a lot of heuristics for solving the problem which do a pretty good job. In this lab you will will compare several different "greedy" heuristics.

For a given set of packages, and a given bin capacity, it is easy to compute the theoretical lower bound on the number of required bins. The following algorithm will do it:

Given: P a set of packages. Each package, pi has a weight of Pi.weight 
       C the capacity of the bins 
Let:   TW = sum of all package weights
       B  = TW / C 
if B==floor(B) 
   minimum number of containers is B
else
   minimum number of containers is floor(B)+1

Bin Packing Algorithms

"On-line" algorithms

The on-line algorithms work under the assumption that packages to be put into bins arrive sequentially and have to be dealt with one at a time. In practice, you may put the package (weights) into an integer array, but
  • you cannot change the order of the array
  • you have to place item i into a bin before placing item i+1 into a bin
  • you cannot move items once that are in a bin

Next Fit Algorithm

When processing next item, check if it fits in the same bin as the last item. Use a new bin only if it does not. For example, suppose you have a bin capacity of 10, and packages of weight: 8,4,9,2. Then the algorithm does the following:
  1. Get first package, it weighs 8. No bins are available so get one place the package into the bin. The number of bins used is 1
  2. Get next package, it weighs 4. the current bin has capacity 2 which is less than 4, so get a new bin and place the package into the bin. The number of bins used is 2
  3. Get next package, it weighs 9. the current bin has capacity 6 which is less than 9, so get a new bin and place the package into the bin. The number of bins used is 3
  4. Get next package, it weighs 2. the current bin has capacity 1 which is greater than 2, so place the package a new bin. The number of bins used is 4
 
Algorithm: NextFit
Given: W: an array of weights of packages.
       C: the capacity of a bin

Let wc=0  // weight of the current package 
    tw=0  // the amount of weight in the current bin
    bc=1  // the number of bins used

Repeat
    if there are no more packages
        break 
    wc = weight of next package
    if tw+wc > C
       increment bc
       set tw = wc 
    else
       set tw = tw + wc

return bc

Best Fit Algorithm

When processing a package, put it into the bin such that the bin gets closest to full (from among those bins already in use). If the package does not fit into any bin in use, put it into a new bin. For example, suppose you have a bin capacity of 10, and packages of weight: 8,4,9,2. Then the algorithm does the following:
  1. Get first package, it weighs 8. No bins are available so get one place the package into the bin. The number of bins used is 1
  2. Get next package, it weighs 4. There are no bins which have a spare capacity of 4 or more, so get a new bin and place the package into the bin. The number of bins used is 2
  3. Get next package, it weighs 9. There are no bins which have a spare capacity of 9 or more, so get a new bin and place the package into the bin, so get a new bin and place the package into the bin. The number of bins used is 3
  4. Get next package, it weighs 2. Bin 0 has a capacity of exactly 2, placing this package into this bin gets it closer to full than any other place, so put it there. Bin 0 now has a spare capacity of 0. The number of bins used is 3
Algorithm: BestFit
    Given: W: an array of weights of packages.
           C: the capacity of a bin
           B: an array of bins, all initially empty 
    
    Let wc=0  // weight of the current package 
        bc=0  // the number of bins with packages
    
    Repeat
        if there are no more packages
            break 
        wc = weight of next package
        bb = -1  // the best bin to put the package in 
        bSpace = C // the amount of space remaining in the best bin
        bi = 0 // the index of the bin being considered
        while bi < bc
           Let t = B[bi]+wc 
           if t < C 
              if (C-t) < bSpace
                 set bb = bi
                 set bSpace = C-t 
           increment bi 
        if bb>=0
           B[bb] = B[bb]+wc 
        else
           B[bc] = wc 
           increment bc
    return bc

Worst Fit Algorithm

This algorithm is identical to Best Fit except that rather than trying to find the bin which you can come closest to filling, find the bin that keeps the most space open. For example, suppose you have a bin capacity of 10, and packages of weight: 8,4,9,2. Then the algorithm does the following:
  1. Get first package, it weighs 8. No bins are available so get one place the package into the bin. The number of bins used is 1
  2. Get next package, it weighs 4. There are no bins which have a spare capacity of 4 or more, so get a new bin and place the package into the bin. The number of bins used is 2
  3. Get next package, it weighs 9. There are no bins which have a spare capacity of 9 or more, so get a new bin and place the package into the bin, so get a new bin and place the package into the bin. The number of bins used is 3
  4. Get next package, it weighs 1. Bin 1 has a capacity of 6, placing this package into this bin leaves it further from full than putting it anywhere else, so put it there. Bin 1 now has a spare capacity of 5. The number of bins used is 3
Note that this algorithm has only two minor changes from the best fit algorithm.
Algorithm: WorstFit
    Given: W: an array of weights of packages.
           C: the capacity of a bin
           B: an array of bins, all initially empty 
    
    Let wc=0  // weight of the current package 
        bc=0  // the number of bins with packages
    
    Repeat
        if there are no more packages
            break 
        wc = weight of next package
        bb = -1  // the best bin to put the package in 
        bSpace = 0 // the amount of space remaining in the best bin
        bi = 0 // the index of the bin being considered
        while bi < bc
           Let t = B[bi]+wc 
           if t < C 
              if (C-t) > bSpace
                 set bb = bi
                 set bSpace = C-t 
           increment bi 
        if bb>=0
           B[bb] = B[bb]+wc 
        else
           B[bc] = wc 
           increment bc
    return bc

"Off-line" Algorithms

These algorithms work under the assumption that the packages to be binned all arrive before bin placement begins. The core / only difference between the on-line and off-line algorithms is that the off-line algorithms reorder the packages before beginning bin placement. For this lab, you will reorder by soring in both ascending and descending directions.

Once sorted, the next fit, best fit and worst fit algorithms can be applied without further modification.

Testing

There are a total of 9 algorithms to test: (on-line, off-line ascending, off-line descending) x (next fit, best fit, worst fit). During testing, you should compare all 9.

Once implemented you will test each on your algorithms on two criteria:

  1. Speed
  2. Closeness to lower bound
For the timing study, especially for the largest file you do not need to repeat and report an average time. (O the largest file, my implementation took a goodly amount of time -- several minutes -- for some algorithms.)

Data

The data files for this lab all of the following form:
  • Integer only, one integer per line
  • The first line is the size of the bins. You may assume you have an infinite number of bins, all of the same size.
  • All subsequent lines are package weights, again one weight per line
For example, here is a data file:
143
53
57
32
56
43
40
45
58
58
55
For this data, the output of the 9 algorithms is:
Algorithm          Bins   Speed(microseconds)
NextFit              4      1
BestFit              4      4
WorstFit             4      4
desc SortedNextFit   5      3
desc SortedBestFit   4      6
desc SortedWorstFit  4      7
asc SortedNextFit    5      3
asc SortedBestFit    5      7
asc SortedWorstFit   5      6
As the number of items to be put into bins grows, you can expect that the differences between the algorithms will be larger as well. In the data above, the numbers of bins is correct for the given data. The times numbers are completely made up. (for 10 items the time would be too short to measure accurately, so I did not even try.)

The data files are available on UNIX as

        /home/gtowell/Public/337/Lab13/sample.txt
        /home/gtowell/Public/337/Lab13/d100.txt
        /home/gtowell/Public/337/Lab13/d1000.txt
        /home/gtowell/Public/337/Lab13/d10000.txt
        /home/gtowell/Public/337/Lab13/d100000.txt
        /home/gtowell/Public/337/Lab13/d1000000.txt
        /home/gtowell/Public/337/Lab13/d10000000.txt
    
The biggest file has 10,000,000 packages (think FedEx on a busy night). sample.txt is identical to the data just above.

Data Table

A table or possibly 2 (one for time and one for bins) might be useful.
Data Set Number of packages Bin Capacity Theoretical Min Bins On line Off line, ASC Off line, DESC
Next Fit Best Fit Worst Fit Next Fit Best Fit Worst Fit Next Fit Best Fit Worst Fit

Some graphs might be interesting as well, or even as an alternative to these tables. (See requirements below.)

Data Sets

What to Hand In

A report (4 pages MAX -- I will not read text past page 4) covering the following:
  • Define "greedy algorithm" and why these algorithms are greedy.
  • High level algorithm descriptions
  • A theoretical and empirical study of the time requirements of each algorithm. Do the theoretical and empirical results agree? Explain.
  • Bin packing analysis. For example, are some of the algorithms consistently better than others (or worse). Explain why your results are consistent with expectations based upon your analysis of the algorithms. Why is sorting in ascending order so bad? Why does sorting make best fit and worst fit behave identically?
  • At least one table and one figure
  • An appendix with all code you wrote, well organized for readability. The appendix does not count against your 4 page limit.

Approximate Rubric

Topic Points
Intro / Contributions10
Algorithms10
Theoretical Analysis (Big-O)10
Data and analysis50
Appendix20

For you secret identity: 2 TV shows