This post continues my series on intractable problems. In this installment, I will talk about problems relating to Data Storage. As a refresher, remember that an intractable problem is one that is very computationally complex and very difficult to solve using a computer without some sort of novel thinking. I will discuss two famous problems related to Data Storage below, as well as provide a few examples and references.
Part Two — Data Storage
Knapsack — given a set of items with weights and values , and a knapsack with capacity , maximize the value of the items in the knapsack without going over capacity.
To start with, here is a small example that I refer to in this previous post. If you’re trying to place ornaments on a tree, you want to get the max amount of coverage possible. However, the tree can only hold so much weight before it falls over. What is the best way to pick ornaments and decorations such that you can cover as much of the tree as possible without it falling over?
I actually saw a really interesting video describing an example of this problem in this video (watch the first few minutes). The professor here sets up a situation of Indiana Jones trying to grab treasures from a temple before it collapses. He wants to get the most valuable treasures he can, but he can only carry so much.
Class: There are several different types of knapsack problems, but the most common one (the one discussed above) is one-dimensional knapsack. The decision problem (can we get to a value without exceeding weight ?) is NP-Complete. However, the optimization problem (what is the most value we can get for the least weight?) is NP-Hard.
Wikipedia Page – General discussion of the Knapsack problem, different types, complexity, and a high level view of several algorithms for solving
Coursera Course on Discrete Optimization – The source for the above video and a great discussion of not only Knapsack but quite a few of these problems
Knapsack Problem at Rosetta Code – a good example data set and a variety of implementations in different languages
Bin Packing — given a set of items with weights and a set of bins with capacity each, place the items into the bins such that the minimum amount of bins are used.
This problem is very similar to the knapsack problem found above, but this time we don’t care how much the items are worth. We just want to pack them in the smallest area possible. Solving this problem is invaluable for things like shipping and logistics. Obviously, companies want to be able to ship more with less space.
A more commonplace example could be thinking of this problem as Tetris in real life. Consider that you’re moving to a new place, and you have you and your friends car in which to move things. How can you place all the items in your cars such that you take up space the most efficiently?
Some progress has been made on reasonably large data sets by using what is called the “first fit decreasing algorithm”. This means that you pick up an item, and place it into the first bin that it will fit in. If it can’t fit in any of the current bins, make a new bin for it. Decreasing means that before you start placing items, you sort them all from biggest to smallest. You probably do this in your everyday life. If you want to pack a box, you start with the big items, right? No need to put lots of small items on the bottom. By getting the big items out of the way first, you can be more flexible with the remaining space because you will then have smaller items.
Class: This problem is NP-Hard.
Wikipedia Page – a high level description of the problem
First Fit Decreasing Paper (pdf) – this is a technical paper describing computational bounds for using the first fit decreasing algorithm. Not for beginners.
3D Bin Packing Simulation – looks like a resource for companies to use to pack boxes and such
I hope this provided a little taste of why these problems are so important. If you know any other good resources please let me know.