Introduction to P vs. NP

It’s a Millennium Problem (reward $1,000,000) and the subject of a movie. But what is this mysterious problem, and what makes it so hard?

Creation vs. Verification

Pop quiz: Choose which one of these tasks takes longer.

  • A) someone hands you a jigsaw puzzle and asks you if it is finished or not
  • B) someone hands you a jigsaw puzzle and asks you to put it together

Answer: B. While you can determine whether a puzzle is finished in a split second (it is easy to tell if pieces are missing), it is not so easy to put all those pieces together yourself. We have problems like this in computer science and mathematics as well. Think about this: what if there were a method that could solve a jigsaw puzzle as fast as you could check that it’s right? Wouldn’t that be amazing? That’s what this problem seeks to find out: whether such a method exists.

The Class P

There is a class of problems that can be solved in “polynomial time”, and we call this class P. Polynomial time means that as the size of the input grows, the time it takes to solve the problem grows no faster than some polynomial in that size (for example n, n^2 or n^3, where n is the input size). Informally, we can say that problems in P can be solved “quickly”.
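For readers who like a formula, the usual way to write "no faster than a polynomial" is sketched below. The symbols are notation introduced here for illustration, not taken from the post: T(n) is the worst-case running time on an input of size n, and c, k and n_0 are fixed constants.

```latex
T(n) \le c \cdot n^{k} \quad \text{for all } n \ge n_0
```

This is the same statement as T(n) = O(n^k). A problem is in P if some algorithm for it satisfies this bound for some fixed k.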

Here’s an example: say we have a list of numbers in front of us, and we want to pick out the greatest one. At the very worst, we could start at the beginning of the list and compare every number until we got to the end. Yes, the list may be very, very long. But the time it takes to check every number increases in a linear fashion: doubling the length of the list only doubles the work. More numbers do not make it exponentially more difficult. A linear function is just a polynomial of degree one, so linear time O(n) (graph A) is already polynomial time, and it grows even more slowly than quadratic time O(n^2) (graph B). Either way, this problem is in P.
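The scan described above can be sketched in a few lines of Python. This is only an illustration: the function name `find_max` and the comparison counter are made up for this example, and the counter is there just to show that the work grows in step with the length of the list.

```python
def find_max(numbers):
    """Return the largest value and the number of comparisons made."""
    if not numbers:
        raise ValueError("the list must contain at least one number")
    largest = numbers[0]
    comparisons = 0
    # Look at every remaining number exactly once.
    for value in numbers[1:]:
        comparisons += 1
        if value > largest:
            largest = value
    return largest, comparisons


largest, comparisons = find_max([7, 42, 3, 19, 8])
print(largest, comparisons)  # 42 4
```

A list of n numbers always needs n - 1 comparisons here, which is linear growth.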

[Figure: Graph A shows y = x (linear growth); Graph B shows y = x^2 (quadratic growth).]

The Class NP

[Image: cookies, captioned “Mom, she got more than me!”]

The class NP, put simply, consists of problems whose proposed solutions can be verified in polynomial time. NP stands for “nondeterministic polynomial time” and harkens back to a computational construct known as a nondeterministic Turing machine. Consider this example: a woman is dividing up cookies of different sizes into two groups for her two young children. Naturally, they are very picky about things being fair, so she needs to make sure that each child gets exactly the same amount of cookies (by weight). While this problem can get quite hard to sit down and solve, if someone showed us two piles of cookies we could quickly add up the weights of the two piles and tell whether they were the same.
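Checking a proposed split really is that simple. Here is a minimal sketch, assuming cookie weights are given as whole numbers (say, grams) so that equality is exact; the function name `verify_split` and the sample weights are made up for this example.

```python
from collections import Counter


def verify_split(cookies, pile_a, pile_b):
    """Check that the two piles use every cookie once and weigh the same."""
    # The piles together must contain exactly the original cookies.
    uses_every_cookie = Counter(pile_a) + Counter(pile_b) == Counter(cookies)
    # Adding up each pile takes one pass over the cookies.
    return uses_every_cookie and sum(pile_a) == sum(pile_b)


cookies = [30, 25, 20, 15, 10]
print(verify_split(cookies, [30, 20], [25, 15, 10]))  # True
print(verify_split(cookies, [30, 25], [20, 15, 10]))  # False
```

The check touches each cookie a fixed number of times, so the time to verify grows linearly with the number of cookies.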

If you have only 2 cookies to separate out, sure, it’s easy. Put one cookie in each pile and that’s the best you can do. But what if you had 5 cookies? 10? Keep in mind that each cookie weighs a slightly different amount and we want the two groups of cookies to be equal in weight. Adding more cookies increases the number of ways to separate them exponentially, because each cookie can go into either pile. This is not good. If she has 5 cookies, the number of different ways to assign them is 2^5 = 32, but if she doubles the number of cookies to 10, the possibilities skyrocket to 2^{10} = 1024.
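To see where the 2^n comes from, here is a sketch of the most naive approach: try every way of assigning each cookie to one of the two piles. It is not the cleverest known method, only the brute-force one, and the function name `find_fair_split` and the integer weights are made up for this example.

```python
from itertools import product


def find_fair_split(cookies):
    """Try every assignment of cookies to two piles.

    Returns (pile_a, pile_b, tried), or (None, None, tried) if no
    equal-weight split exists. `tried` counts the assignments examined.
    """
    tried = 0
    # Each cookie goes to pile 0 or pile 1: 2 ** len(cookies) assignments.
    for assignment in product((0, 1), repeat=len(cookies)):
        tried += 1
        pile_a = [w for w, side in zip(cookies, assignment) if side == 0]
        pile_b = [w for w, side in zip(cookies, assignment) if side == 1]
        if sum(pile_a) == sum(pile_b):
            return pile_a, pile_b, tried
    return None, None, tried


# These weights add up to an odd total, so no fair split exists and
# all 2 ** 5 assignments get examined.
print(find_fair_split([1, 2, 4, 8, 16])[2])  # 32
```

When a fair split does exist the search can stop early, but in the worst case it has to look at every assignment.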

Let’s consider, for fun, the number of possibilities if she had 100 cookies: 2^{100} = 1,267,650,600,228,229,401,496,703,205,376. This… is pretty large. But wait! Computers can do that in a heartbeat, right? They’re way faster than we are! This is all fine and good, until you realize that the number of seconds in the age of the entire universe is (only) about 450,000,000,000,000,000. So even if our computer could go through millions or even billions of possibilities per second, checking them all would still take far longer than the age of the universe. This is what makes these sorts of problems so hard.
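The arithmetic is easy to check for yourself. The sketch below uses the age-of-the-universe figure quoted above; the speed of one million checks per second is an assumption chosen for illustration, and the function name is made up.

```python
AGE_OF_UNIVERSE_SECONDS = 450_000_000_000_000_000  # figure quoted in the text


def universe_ages_needed(n_cookies, checks_per_second):
    """How many ages of the universe it takes to try all 2**n_cookies splits."""
    possibilities = 2 ** n_cookies
    seconds_needed = possibilities // checks_per_second
    return seconds_needed // AGE_OF_UNIVERSE_SECONDS


# Assumed speed: one million possibilities checked per second.
print(universe_ages_needed(100, 1_000_000))
print(universe_ages_needed(10, 1_000_000))  # 0
```

With 100 cookies the answer comes out at roughly 2.8 million ages of the universe, while 10 cookies take a fraction of a second.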

P = NP?

So now that you know a little about what P and NP are, you may be wondering what all this business about “P=NP” is. What does it mean, after all? Every problem in P is also in NP, since a problem we can solve quickly is one whose answers we can check quickly. The open question is the converse. If someone were to prove that P equals NP, this would mean that every problem in the class NP, including the hard ones, can be solved in polynomial time, just like the P problems. This would have huge implications for all sorts of applications, not just in the mathematical world. For instance, many security systems rest on the difficulty of factoring very large numbers. If P=NP, a polynomial-time method for factoring these numbers would exist, and if such a method were found and turned out to be practical, financial systems everywhere would be vulnerable.
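Factoring shows the same check-versus-solve gap as the cookies. In the sketch below, verifying a claimed factorization is a single multiplication, while the naive way of finding a factor (trial division) may need about sqrt(n) steps, which grows exponentially with the number of digits in n. Trial division is used here only because it is simple; it is not how real attacks work, and the function names and the small example number are made up for illustration.

```python
def verify_factors(n, p, q):
    """Check a claimed factorization n = p * q with nontrivial factors."""
    return 1 < p < n and 1 < q < n and p * q == n


def find_factor(n):
    """Trial division: smallest nontrivial factor of n, or None if there is none."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return None


print(verify_factors(3233, 53, 61))  # True
print(find_factor(3233))             # 53
print(find_factor(97))               # None
```

The numbers used in real security systems have hundreds of digits, far beyond the reach of this loop.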

This also applies to many logistical and optimization problems (see my posts on intractable problems here and here). For the hardest NP problems, known as NP-complete problems, an “easy” method for solving any one of them can be converted into an easy method for every problem in NP. We don’t need to find an easy method for every problem in the world. We only need one. That shouldn’t be too hard, right? Unfortunately, researchers have been working on this problem for decades without success. There have been many attempted “proofs”, but they have all ultimately fallen short. The problem is so difficult that the Clay Mathematics Institute is offering $1,000,000 for a correct proof one way or the other.

Although the question has not been settled one way or the other, most researchers believe that P \neq NP. This would mean that some problems in NP are inherently “harder” than problems in P: no polynomial-time method for them exists, and in the worst case the best known approaches are not much better than an exhaustive (brute force) search. No matter how fast our computers get, these problems would remain very difficult to solve for large inputs.

I hope you enjoyed this little introduction to P and NP and if you have any comments, questions or corrections, please leave them in the comments below!

6 comments on “Introduction to P vs. NP”

  1. freemancw says:

    Nice article, I like the examples :). Some criticism, if you care to hear it:

    “If we compare the growth rates of linear time (graph A) and polynomial time (graph b) we can clearly see that solving this problem is quicker than polynomial time, thus it is in P.”

    Linear functions are just degree one polynomials, so this statement could be misleading. It’d help to label the axes and consider comparing the linear function to a more concrete polynomial like n^2 or something.

    “While we can show that this problem can get quite hard to sit down and solve, if someone showed us two piles of cookies, we could quickly add up the sizes of the two piles and tell if they were the same or not.”

    By “size” you mean weight right? If so, consider using that word instead, I wasn’t sure if you meant number of cookies.

    • Hi Clinton,

      >> “Linear functions are just degree one polynomials, so this statement could be misleading.”

      You are correct. I should have been more clear in that part. The graphs that I displayed were y=x and y=x^2. I agree that some labels and descriptions, as well as a better choice of wording would be beneficial in this case.

      >> “By “size” you mean weight right? If so, consider using that word instead, I wasn’t sure if you meant number of cookies.”

      Yes, I meant the weight of each pile. We only want to check to see if the weights are the same, and in this context, “close” doesn’t count. I was adapting my cookie problem from a more classic example with rocks and realized “who counts cookies by weight anyway?” But yeah, I’ll fix that. Thanks.

      Hope school is going well for you.

  2. tommy says:

    You say that NP problems need polynomial time to VERIFY, but your cookie example can’t be verified in polynomial time, right? You have two piles of cookies (each cookie with a slightly different weight) and the piles themselves will have more or less slightly different weights. You cannot decide whether this is the best partition of the whole abundance of cookies in P time.
    Or am I missing something?
    Thanks for this nice introduction anyway.

    • Hi Tommy,

      I agree that I did not clarify as much as I could have. In this problem, we want the piles to have exactly the same weight; that is the only “correct” answer in this situation. Therefore, adding up the weights of the two piles and seeing if they are equal can be done in polynomial time. Of course, sometimes it is not possible for the cookies to be split evenly, and this makes the problem harder.

      Thanks for pointing this out.
