# Intractable Problems — Part Two: Data Storage

This post continues my series on intractable problems. In this installment, I will talk about problems relating to Data Storage. As a refresher, remember that an intractable problem is one that is very computationally complex and very difficult to solve using a computer without some sort of novel thinking. I will discuss two famous problems related to Data Storage below, as well as provide a few examples and references.

Part Two — Data Storage

Knapsack — given a set of items with weights $w$ and values $v$, and a knapsack with capacity $C$, maximize the value of the items in the knapsack without going over capacity.

To start with, here is a small example that I referred to in a previous post. If you’re trying to place ornaments on a tree, you want to get as much coverage as possible. However, the tree can only hold so much weight before it falls over. What is the best way to pick ornaments and decorations so that you cover as much of the tree as possible without it falling over?

I also found a really interesting video describing an example of this problem (watch the first few minutes). The professor sets up a situation of Indiana Jones trying to grab treasures from a temple before it collapses. He wants to get the most valuable treasures he can, but he can only carry so much.

Class: There are several different types of knapsack problems, but the most common one (the one discussed above) is one-dimensional knapsack. The decision problem (can we reach a value $V$ without exceeding weight $W$?) is NP-Complete. However, the optimization problem (what is the maximum value we can fit within the weight limit?) is NP-Hard.
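Despite the intractability in general, the classic dynamic-programming solution runs in $O(nC)$ time, which is pseudo-polynomial: fast when the capacity $C$ is small, but not polynomial in the input size. A minimal sketch (the weights and values here are made up for illustration):

```python
# 0/1 knapsack via dynamic programming (pseudo-polynomial: O(n*C) time).
def knapsack(weights, values, capacity):
    # best[c] = max value achievable with total weight <= c
    best = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # walk capacities downward so each item is used at most once
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

print(knapsack([2, 3, 4, 5], [3, 4, 5, 6], capacity=5))  # -> 7 (take weights 2 and 3)
```

The pseudo-polynomial runtime is exactly why the hardness only bites when the numbers involved get large.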

References:

Wikipedia Page – General discussion of the Knapsack problem, different types, complexity, and a high level view of several algorithms for solving

Coursera Course on Discrete Optimization – The source for the above video and a great discussion of not only Knapsack but quite a few of these problems

Knapsack Problem at Rosetta Code – a good example data set and a variety of implementations in different languages

---

Bin Packing — given a set of items with weights $w$ and a supply of bins with capacity $c$ each, place the items into the bins such that the minimum number of bins is used.

This problem is very similar to the knapsack problem found above, but this time we don’t care how much the items are worth. We just want to pack them in the smallest area possible. Solving this problem is invaluable for things like shipping and logistics. Obviously, companies want to be able to ship more with less space.

A more commonplace example: think of this problem as Tetris in real life. Say you’re moving to a new place, and you have your car and your friends’ cars to move things in. How can you pack all the items into the cars so that you use the space most efficiently?

Some progress has been made on reasonably large data sets by using what is called the “first fit decreasing” algorithm. “First fit” means that you pick up an item and place it into the first bin it fits in; if it can’t fit in any of the current bins, you open a new bin for it. “Decreasing” means that before you start placing items, you sort them all from biggest to smallest. You probably do this in your everyday life. If you want to pack a box, you start with the big items, right? No need to put lots of small items on the bottom. By getting the big items out of the way first, you can be more flexible with the remaining space, because only smaller items are left.
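The first fit decreasing heuristic described above fits in a few lines. This is a sketch with made-up item weights and capacity:

```python
def first_fit_decreasing(weights, capacity):
    bins = []  # each bin is a list of item weights
    for w in sorted(weights, reverse=True):  # "decreasing": biggest items first
        for b in bins:
            if sum(b) + w <= capacity:       # "first fit": first bin with room
                b.append(w)
                break
        else:
            bins.append([w])                 # nothing fits: open a new bin
    return bins

packed = first_fit_decreasing([7, 5, 4, 3, 2, 2], capacity=10)
print(len(packed))  # -> 3 bins (total weight 23 cannot fit in two bins of 10)
```

On this toy input the heuristic happens to be optimal, but in general first fit decreasing is only an approximation, not an exact solver.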

Class: The decision problem (can the items fit into $k$ bins?) is NP-Complete; the optimization problem (minimizing the number of bins) is NP-Hard.

References:

First Fit Decreasing Paper (pdf) – this is a technical paper describing computational bounds for using the first fit decreasing algorithm. Not for beginners.

3D Bin Packing Simulation – looks like a resource for companies to use to pack boxes and such

---

I hope this provided a little taste of why these problems are so important. If you know any other good resources please let me know.

# Intractable Problems — Part One: Set Problems

My professor and advisor Dr. Alice McRae provided a list of intractable problems for us to ponder in our genetic algorithms class, and I thought I would expand on some of them here for reference. All of these problems are intractable, which means that they are very, very difficult to solve precisely with a computer.

Most of the problems on this list are what are known as NP-Complete problems. If the complexity classes P and NP are not equal (as is widely believed by many researchers, but not proven), then NP-Complete problems cannot be solved by a computer in a reasonable time frame. In theory, with an infinite amount of time we could produce answers to these problems, but time and computing power are finite, no matter how many technological advances we make.

We have seen some of these difficult problems before in previous posts: Genetic Algorithms for Ramsey Theory and The Travelling Santa Problem, as well as Introduction to Genetic Algorithms all have good examples of these types of problems.

I will be presenting these problems in multiple parts, with my comments and references on each one.

Part One — Set Problems

Maximum 3-Dimensional Matching — given a set $S$ of ordered triples of the form $(x,y,z)$, find the largest possible subset of the triples such that no two triples in the subset share the same $x$, $y$, or $z$ coordinate.

Here is a small example: Consider the set $\{(1,4,5),(3,4,9),(6,7,8),(1,2,5)\}$. We need to find the largest subset of these we can, where the $x$, $y$, and $z$ values are not repeated. In this example, the set $\{(3,4,9),(6,7,8),(1,2,5)\}$ would be a solution. It contains 3 triples and none of the coordinates are repeated. As you can imagine, once the set gets larger, this problem becomes much more difficult.

In layman’s terms, consider a group of boys, girls, and pets. We want to make happy “triplets” of girls, boys, and pets, but no girl, boy or pet can be in more than one group. What is the best way to match these people and pets up so that we have the largest number of groups?
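To make this concrete, here is a small sketch (not from the original post) that checks whether a set of triples is a valid matching, and brute-forces the largest matching for the example above. The brute force tries every subset, which is exponential, and that is exactly why the problem is hard at scale:

```python
from itertools import combinations

def is_matching(triples):
    # valid iff no two triples share an x, a y, or a z value
    for i in range(3):
        coords = [t[i] for t in triples]
        if len(coords) != len(set(coords)):
            return False
    return True

def max_matching(S):
    # brute force: try subsets from largest to smallest (exponential!)
    for k in range(len(S), 0, -1):
        for sub in combinations(S, k):
            if is_matching(sub):
                return list(sub)
    return []

S = [(1, 4, 5), (3, 4, 9), (6, 7, 8), (1, 2, 5)]
print(len(max_matching(S)))  # -> 3, matching the example above
```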

Class: The decision problem (is there a matching of a given size $k$?) is NP-Complete. The optimization problem (finding the largest matching) is NP-Hard. [1]

References:

More NP-Completeness Results (pdf) lecture notes from CMU, good explanation of 3DM as well as some other problems, proof of the class of 3DM.

NP-Complete Problems (pdf, pg 267)

---

Subset Sum/Subset Product — given a set $S$ of integers and a goal sum/product $P$, find a subset of $S$ whose sum/product is as close as possible to the goal $P$ (the decision version asks for an exact match).

This one doesn’t sound as hard, but with larger sets it can become very difficult.

Example: Consider the set $S = \{ n \mid 1\leq n \leq 100 \}$ and a goal sum $G = 531$. Now we need to find a subset of the numbers between 1 and 100 that add up to 531.
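For goals this small, the standard pseudo-polynomial dynamic program dispatches the example easily; the difficulty only appears when the numbers themselves grow large. A sketch (finding an exact subset for the example, since 531 is reachable):

```python
def subset_sum(S, goal):
    # parent[s] = (previous_sum, element_added); lets us rebuild one subset
    parent = {0: None}
    for x in S:
        for s in list(parent):           # snapshot: each element used at most once
            t = s + x
            if t <= goal and t not in parent:
                parent[t] = (s, x)
    if goal not in parent:
        return None
    subset, s = [], goal
    while parent[s] is not None:
        s, x = parent[s]
        subset.append(x)
    return subset

sol = subset_sum(range(1, 101), 531)
print(sum(sol))  # -> 531
```

The table has at most `goal + 1` entries, so the runtime depends on the magnitude of the goal, not just the number of elements.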

A practical application: say you’re a kid and your parents gave you \$10 of allowance for the week. Naturally, you want to get as much good out of that \$10 as you can. What is the best combination of things you can buy that adds up to as close to \$10 as possible?

Class: Subset sum and subset product are NP-Complete. A proof can be found at [2].

References:

Subset Sum NP-Completeness (pdf) Scroll down a bit to see the proof that subset sum is NP-Complete

Dynamic Programming Subset Sum — A description of a dynamic programming technique for this problem

SubsetSum@Home — A distributed crowdsourced BOINC-type initiative to solve subset sum problems

An Improved Low-Density Subset Sum Algorithm — a paper concerning algorithms to solve this problem

---

Minimum Set Covering — given a set $S$ and a collection $C$ of subsets over $S$, find the fewest subsets from $C$ such that every element of $S$ is covered.

Okay, so let’s break this down. We have a set $S$; let’s say, for example, $S=\{1,2,3,4,5\}$. Now we have a collection $C$ of subsets over $S$; for our example, $C = \{\{1\},\{2,5\},\{3,4\},\{2,3,4\},\{5\}\}$. In this case, the smallest collection we can make that includes all the elements of $S$ is $\{\{1\},\{2,3,4\},\{5\}\}$, which contains 3 subsets.

Practical example: You’re a kid again, and looking at a group of video games that are on sale. They are all in combo packs, however. How can you get all the games you want and spend as little money as possible? In other words, what is the smallest amount of combo packs you need to buy in order to get all of the games you want?
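A common heuristic here is the greedy algorithm: repeatedly pick the subset that covers the most still-uncovered elements. It is not guaranteed optimal (only a $\ln n$ approximation in general), but it does find a 3-subset cover for the small example above. A sketch:

```python
def greedy_cover(universe, subsets):
    universe, covered, cover = set(universe), set(), []
    while covered != universe:
        # pick the subset covering the most still-uncovered elements
        best = max(subsets, key=lambda s: len(set(s) - covered))
        if not set(best) - covered:
            return None  # some element of the universe is not coverable
        cover.append(set(best))
        covered |= set(best)
    return cover

C = [{1}, {2, 5}, {3, 4}, {2, 3, 4}, {5}]
print(len(greedy_cover({1, 2, 3, 4, 5}, C)))  # -> 3, same size as the example
```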

Class: the decision problem (can we cover $S$ with at most $k$ subsets?) is NP-Complete. The optimization problem (finding the smallest cover) is NP-Hard. See [3].

References:

A Probabilistic Heuristic for a Computationally Difficult Set Covering Problem – (pdf) Journal article on this topic detailing a heuristic for finding set coverings

A Genetic Algorithm for the Set Covering Problem – (pdf) Journal article about a genetic programming approach to set cover

A Greedy Heuristic for the Set-Covering Problem – (pdf) Yet another approach

Set Cover Problem — Wikipedia

An Example: Set Cover – (pdf) Scroll down to see a proof of set cover in NP-Complete

---

These are only a few intractable set problems, but there are many more variations of these out there. Stay tuned for the next segment in this series, problems about Data Storage (Bin Packing and Knapsack).

# Kinect Face Tracking — Results: Thesis Update #4

For background, see my three previous posts in this series:

My thesis has now been successfully completed and defended, and I am currently on break for the summer. I thought I would make a post to wrap up some loose ends and talk about some things I didn’t have a chance to cover before. In my last post, I discussed the significance of animation units to my facial tracking algorithm. The way I use these units is pretty simple, but it can lead to some complex classifications. The first thing to do is to consider what the desired face would look like; picture it in the mind’s eye. Say, for example, we wanted to recognize a surprised face. How would this look? Chances are, when thinking of a surprised face, the mind visualizes raised eyebrows, wide open eyes, and a gaping mouth. The next task (and the bulk of my algorithm) asks: how do we translate these facial movements into the points tracked on our facial mesh?

We use animation units as “wrappers” for sets of points on the face plane. Instead of having to track and check multiple different places on the eyebrows or mouth, for example, the animation units let us track the movements of those features as a whole. Since the animation unit values lie between -1 and 1, we assign “bounds” for each expression such that if the user’s features fall within that range, we can assume the user is making that expression. These values are currently determined by extensive testing: seeing which values frequently appear when a certain expression is made. It would not be difficult to build a classifier for these expression bounds and train the program over many different faces and expressions in order to get the most accurate data for each type of face.
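As a sketch of how such bounds might be checked in code (the numeric ranges below are invented for illustration; they are not the bounds from my thesis):

```python
# Hypothetical bounds per expression: AU index -> (low, high).
BOUNDS = {
    "surprised": {1: (0.4, 1.0), 5: (0.3, 1.0)},  # jaw open, outer brow raised
    "kissing":   {2: (-1.0, -0.5)},               # lips rounded
    "smiling":   {4: (-1.0, -0.3)},               # lip corners up
}

def classify(aus):
    """aus maps AU index -> value in [-1, 1]; missing AUs default to neutral."""
    for name, bounds in BOUNDS.items():
        if all(lo <= aus.get(i, 0.0) <= hi for i, (lo, hi) in bounds.items()):
            return name
    return "neutral"

print(classify({1: 0.6, 5: 0.5}))  # -> surprised
```

A trained classifier would replace the hand-tuned ranges, but the shape of the check stays the same: each expression is a region in AU space.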

In my application, we track six different types of expressions.

surprised

kissing

smiling

In addition, we look for two different types of angry faces: angry with a closed mouth (glaring at someone) or angry with an open mouth (as in yelling).

angry – closed mouth

angry – open mouth

To see a simple flowchart detailing some preliminary bounds for each expression (not exhaustive), check out the chart here (click for larger view).

There is a bit of “lag” in my application when recognizing these expressions: the live video stream captures many frames each second, and there is a small delay in figuring out which expression each frame’s data fits. As a result, the output of my program is still a bit inaccurate. Because it prints the expression every frame, a backlog can build up, and after a while it starts showing expressions at a delay. For example, if the user acts surprised, sometimes the program will not actually print “surprised” until a fraction of a second afterwards, because it’s busy trying to run through the frames as they come in. A simple remedy would be to create a “buffer” of tracked points and use the average of the data over a few seconds to determine the facial expression. Because the camera is very sensitive, the data changes at the slightest movement of the face; even trying to sit as still as possible still results in small changes. Another thing a buffer of data could help with: when the camera loses track of the face for only a moment, it begins to spit out garbage data as it attempts to relocate the face.
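The buffering remedy could look something like this sketch (the window size and the 30 fps assumption are mine, not from the thesis):

```python
from collections import deque

class AUBuffer:
    """Average AU readings over the last `size` frames to smooth jitter."""
    def __init__(self, size=30):          # assumption: ~1 second at 30 fps
        self.frames = deque(maxlen=size)  # old frames fall off automatically

    def add(self, aus):
        self.frames.append(aus)

    def smoothed(self):
        n = len(self.frames)
        return {k: sum(f[k] for f in self.frames) / n
                for k in self.frames[0]}

buf = AUBuffer(size=3)
for v in (0.25, 0.75, 0.5):
    buf.add({1: v})
print(buf.smoothed())  # -> {1: 0.5}
```

Averaging also dampens the garbage frames emitted while the camera relocates the face, since a few outliers barely move the mean.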

Overall we can see a good proof of concept of the capabilities of the Kinect Face Tracking API and there is a lot of room for improvement in the future. Possible future additions/enhancements include:

• tracking a wider range of expressions
• wider range of accessibility (glasses/hats, children, elderly people)
• more specific bounds for facial expressions, perhaps learned with neural networks
• more interactive application
• use facial expression recognition to interface with other environments (i.e., call a command by smiling)

# A Genetic Algorithm Approach to Ramsey Theory

Background and Introduction

Ramsey Theory is the study of combinatorial problems proposed by Frank Ramsey in 1930. The version of his problem as applied to graphs asks, “what is the smallest complete graph such that every red/blue coloring of its edges contains a monochromatic clique of a given size?” This is written $R(x,y)$, where $x$ and $y$ are the sizes of the cliques to find in red and blue, respectively.

$R(3,3)$ means to find the smallest graph such that we are guaranteed either a red 3-clique (3 vertices connected together at every point, or a complete graph on 3 vertices) or a blue 3-clique. This has been shown to be 6. We can show this by providing an example that a complete graph on 5 vertices ($K_5$) can be colored such that it contains no red or blue 3-cliques. Then, we show that for every coloring of $K_6$, there must necessarily be a red triangle or a blue triangle in the graph. This can be verified computationally or combinatorially. (See the nice example here: Theorem on friends and strangers)

While the problem is relatively easy to solve for a small value like $R(3,3)$, the complexity of the problem increases greatly when considering larger clique sizes. For instance, $R(4,4)$ is 18, and we only know that the value of $R(5,5)$ is somewhere between 43 and 49 inclusive. Because the number of cases one must check increases exponentially with each increase in clique size, it becomes impractical for traditional computing very quickly.

Therefore, we will utilize a genetic algorithm in an attempt to verify and perhaps improve the lower bound of $R(5,5)$. To do this, the program must complete the following:

• Use the genetic algorithm to test for graphs that have low numbers of cliques

• Perform exhaustive testing on these graphs, and if we can find even one coloring where a clique does not exist, we will have shown that $R(5,5) > 43$.

Implementation

My algorithm is implemented in Java. Graphs are represented using an adjacency matrix, and a coloring of the graph (ColorMatrix) is a 2D array. In this scheme, coloring[x][y]==0 represents a blue edge from x to y and coloring[x][y]==1 represents a red edge. Colorings are symmetric; that is, coloring[x][y] == coloring[y][x]. These colorings form the base for my Chromosome class, which the genetic algorithm uses to represent a “piece” of information that can be mated with other chromosomes, mutated, and scored based on the number of cliques found. A Population stores multiple chromosomes.

My basic algorithm works as follows:

1. Create a population of random colorings of $K_{43}$
2. Check a set of edges for 5-cliques (how these sets are determined is discussed later), and assign a fitness score based on the number of cliques found
3. Crossover/mutate the chromosomes
4. Repeat until every member of the population has fitness 0 (no cliques found in that set of edges)
5. Take these “possible zeroes” and begin testing more edges for cliques:
   1. if we find any new cliques, this coloring is no longer any good to us (we are ultimately looking for a coloring with no cliques whatsoever)
   2. pare the population down to the set of graphs that have no cliques in set 1 and set 2
   3. eventually this leads to an exhaustive search, but by that point there will be only a few “likely” candidates that passed every test set (this saves computation time compared to checking every clique right off the bat)

Calculating Fitness

The fitness function for our genetic algorithm checks a set of edges in the graph to see if there are any cliques. We test from “test sets” instead of checking every clique every time. The reasoning is that if a clique is found on the first test, we need not go any farther. The multiple-stage fitness testing allows us to prune out “bad” data and keep the more likely solutions. The first test set is computed as follows (each 5-tuple lists the five vertices of a candidate clique; a clique means all ${5 \choose 2} = 10$ edges among them have the same color):

$(0,1,2,3,4), (0,2,3,4,5), (0,3,4,5,6),$

$\dots (0,39,40,41,42), (1,2,3,4,5), (1,3,4,5,6),$

$\dots, (38,39,40,41,42)$

We use a simple method: the first two items in the tuple count up iteratively, and the rest of the clique uses increasing consecutive numbers. This way, lots of cliques are tested, but this test set leaves out cliques not formed from consecutive vertices (e.g., $(0,2,4,6,8)$ would not be tested in this list). Once the list is generated, we simply test the edges among each set of vertices. If the colors are all the same, then we have a clique and add to the fitness.
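The generation scheme described above can be sketched as follows (Python rather than the project’s Java, and the function name, parameters, and the generalized `step` are my own):

```python
def test_set(n=43, k=5, step=1):
    """First two entries count up; the rest continue from the second by `step`."""
    tuples = []
    for a in range(n):
        for b in range(a + 1, n):
            t = [a] + [b + i * step for i in range(k - 1)]
            if t[-1] < n:  # keep only tuples that stay inside the vertex range
                tuples.append(tuple(t))
    return tuples

ts = test_set()
print(ts[0], ts[-1])          # -> (0, 1, 2, 3, 4) (38, 39, 40, 41, 42)
print((0, 2, 4, 6, 8) in ts)  # -> False, as noted above
```

With `step=2` the same generator produces the multiples-of-2 tuples such as $(0,2,4,6,8)$.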

The second test set would be applied after the population has converged for the first time to zeroes, and this time counts up by multiples of 2 (so $(0, 2, 4, 6, 8)$ would be a viable choice in this set).

An alternative fitness function was considered that involves inserting vertices into a binary search tree based on their coloring values. To do this, one first creates a random permutation of the vertices (values 0-42). Then, using the colorings (0 and 1 instead of positive and negative), we insert the nodes either left or right in a binary search tree, with the first node in the list as the root. By looking at the output of this tree, we get a visual representation of color patterns. The idea is that a long “path” in one of these trees means the same color was seen many times in a row, suggesting a clique exists.

In the below example, the set $(0,1,2,3,4)$ forms a long “path”, meaning that each edge is the same color.

Unfortunately, this implementation is not currently functional in my program.

Parent Selection and Crossover

The backbone of any genetic algorithm is the crossover. This models the mating process in traditional biology and is used to make the Chromosomes (solutions) improve themselves over many generations by keeping successful traits and discarding others. Parents are selected in my model randomly from the population, but there is a small chance (about 5%) that the best member of the population would be chosen as a parent instead. This provides a slight bias toward better traits but still allows for plenty of randomness and new data.

The crossover is a simple one. Using two Chromosomes (colorings of the graph) we use a random value for each x, y in the colorMatrix to determine whether the “mom” or the “pop” will contribute their genes (coloring[x][y]). There is a 50-50 chance of inserting mom(x,y) into the baby as opposed to pop(x,y). The “baby” Chromosome results in a mixture of data from both parents, and if the new Chromosome’s fitness (number of cliques) is lower than the current worst value in the population, we replace that value with our new baby. This allows the population to improve over time.

Our mutation method simply flips a bit in the coloring based on a random value. The mutation rate was 0.08, or 8%.
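A sketch of the uniform crossover and bit-flip mutation on symmetric colorings described above (illustrative Python, not the project’s actual Java code):

```python
import random

def crossover(mom, pop, mutation_rate=0.08):
    """Uniform crossover on symmetric 0/1 colorings, with bit-flip mutation."""
    n = len(mom)
    baby = [[0] * n for _ in range(n)]
    for x in range(n):
        for y in range(x + 1, n):
            # 50-50 chance of inheriting mom's gene vs. pop's gene
            gene = mom[x][y] if random.random() < 0.5 else pop[x][y]
            if random.random() < mutation_rate:  # mutation: flip the bit
                gene ^= 1
            baby[x][y] = baby[y][x] = gene       # keep the coloring symmetric
    return baby

random.seed(0)
n = 6
mom = [[(i + j) % 2 for j in range(n)] for i in range(n)]
pop = [[0] * n for _ in range(n)]
baby = crossover(mom, pop)
print(all(baby[x][y] == baby[y][x] for x in range(n) for y in range(n)))  # -> True
```

Working on the upper triangle only and mirroring each gene is what preserves the symmetry invariant from the Implementation section.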

Problems and Solutions

One of the things I quickly noticed when building this project was that the fitness function, although it does not brute-force every possible clique check, can still take a prohibitively long time at higher population sizes. The getFitness() function is called a bit too liberally; caching the fitness value for later retrieval would improve efficiency in this area.

Another downfall is that since we are only testing a limited set of cliques each time, it is possible to “learn the data” and evolve a solution that contains no cliques in the tested area but may have cliques elsewhere. While this seems like a problem at first glance, it is actually desirable given the current approach: having the algorithm “learn” each test data set is what allows us to prune the search space and find possible colorings that contain no cliques. If we decided this was undesirable behavior, however, we could change the fitness function to include some number of random permutations of edges each time. The problem with that method is that we could never be sure a clique did not exist in a certain area over multiple rounds, as the sets tested would be randomized each time. Finally, it is possible to go with the old-fashioned approach and simply check every clique after all.

Future Work

While this program currently provides a proof of concept and implementation up to the first round of test data, there is much room for expanding this project. First of all, multiple rounds need to be completed within the algorithm, so that we can generate colorings that are more likely to have no cliques. Other things that would be useful are outputting data to a text file for further reading/analysis, and performing general optimizations throughout my code for performance.

In addition, I would really like to get a working implementation of the Alternate Fitness function implemented and working. I feel like this is an interesting direction to take the project and it would be helpful to have a more visual representation of where cliques are occurring.

While the population initialization currently creates a population of random colorings, it would be interesting to see what happens when we change the ratios of red edges to blue edges in the chromosomes, starting at initialization. What if each coloring was required to be within a certain amount red and a certain amount blue? This is definitely something to check into as well.

Conclusion

Currently, we are able to generate lists of colorings that have no cliques within the first test set of edges. As we add more edge sets to test, we will create more constrained lists. Further testing will reveal whether any of these “possible” solutions truly contain no cliques; for this, an exhaustive search is the only way. Although there are on the order of 900,000 five-vertex subsets to check (${43 \choose 5} = 962598$), this is time-consuming but doable, especially with a relatively small set of graphs to check.
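For reference, here is the size of that search and a sketch of the exhaustive check itself (Python here rather than the project’s Java):

```python
from itertools import combinations
from math import comb

print(comb(43, 5))  # -> 962598 five-vertex subsets to scan

def has_5_clique(coloring, n=43):
    # exhaustive check: scan every 5-vertex subset for a monochromatic K5
    for verts in combinations(range(n), 5):
        colors = {coloring[x][y] for x, y in combinations(verts, 2)}
        if len(colors) == 1:  # all 10 edges share one color
            return True
    return False

red = [[1] * 5 for _ in range(5)]   # tiny demo: an all-red K5
print(has_5_clique(red, n=5))       # -> True
```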

Finally, the code is available at https://github.com/nelsonam/ramsey if anyone is interested. Share your comments in the box below, and if you have any questions feel free to ask!

# Animation Units for Facial Expression Tracking – Thesis Update #3

See the previous posts in this series:

So the past few months, I’ve been hard at work on my thesis concerning facial expression analysis with Microsoft Kinect. Unfortunately my blog has suffered quite a bit in wake of everything I’ve had going on, but I’m trying to post a few new things as the school semester winds down.

My ongoing project concerning Facial Expression Analysis with Kinect is making progress, and as the semester winds down, I am trying to prepare a final product to present. As I described in my last thesis update, I had been able to test a sample product which overlays a 3D mesh on the user’s face and tracks the different points on the face. In particular, a subset of these points are called Animation Units, or AUs, and are essential to the expression recognition algorithm. The points are defined based on the Candide3 model and the different values are delineated on Microsoft’s API page about the Kinect Face Tracking SDK. There are six different values that describe movements and placement of the basic facial features such as eyes, eyebrows, and mouth, and the table of values for the AUs is reproduced below (from Microsoft’s Kinect API page):

| AU Name and Value | AU Value Interpretation |
| --- | --- |
| AU0 – Upper Lip Raiser | 0 = neutral, covering teeth; 1 = showing teeth fully; -1 = maximal possible pushed-down lip |
| AU1 – Jaw Lowerer | 0 = closed; 1 = fully open; -1 = closed, same as 0 |
| AU2 – Lip Stretcher | 0 = neutral; 1 = fully stretched (joker’s smile); -0.5 = rounded (pout); -1 = fully rounded (kissing mouth) |
| AU3 – Brow Lowerer | 0 = neutral; -1 = raised almost all the way; 1 = fully lowered (to the limit of the eyes) |
| AU4 – Lip Corner Depressor | 0 = neutral; -1 = very happy smile; 1 = very sad frown |
| AU5 – Outer Brow Raiser | 0 = neutral; -1 = fully lowered as in a very sad face; 1 = raised as in an expression of deep surprise |

With these values in hand, I was able to determine a set of bounds for what each expression should look like in my application. However, I quickly learned that the extremal values of -1 and 1 are just that: extremal. Even sitting in front of the camera making the most exaggerated faces I could, it was very difficult to get above 0.5 or 0.6 in some areas. In addition, the eyebrow data was almost always inaccurate in my testing because I wear glasses, which confused the camera. The Kinect saw the top of my glasses as my eyebrows and thus showed very little movement at all. When I took my glasses off, placement and tracking returned to normal, but then it was impossible for me to see the data being processed.

Another unforeseen problem was that the camera is perhaps a bit too sensitive: since it tracks many frames every second, even if you sit as still as possible there will be some variation in the tracked points, and if the camera loses track of the face entirely, the points can really go haywire. Trying to keep the points within specific bounds is therefore more difficult than expected. If the camera loses track of the face, the application can say you went from surprised to sad to angry in less than a second. It will probably not get done in this version of the project, but it would be useful in the future to store a buffer of points every second or so, to stabilize the constantly changing data and provide more time for processing before changing frames completely.

Overall I’ve certainly learned a lot throughout the course of this project and I would like to continue to work on it more throughout the summer. I will post a few more thoughts on the process and results as I carry on and wrap things up. I’ll have a copy of my thesis online at some point too.

In other news, I’m going to be starting the Master’s program in Computer Science at Appalachian State University in the fall, and I’m pretty excited about it. I plan to do research on algorithms and graph theory.

Stay tuned…

UPDATE: See the next post in this series here: Kinect Face Tracking — Results :  Thesis Update #4

# Some Face Tracking With Kinect: Thesis Update #2

See the previous post in this series: Facial Expression Analysis With Microsoft Kinect : Thesis Update #1

I’ve been doing some more work on my emotion recognition project and I have a few preliminary pictures to show. The current status of the project is that I can detect faces and see the wire mesh over them, but it’s not exactly as accurate as I would like. For instance, I have noticed that wearing glasses seems to confuse the eye placement: my eyes in the wire mesh consistently hover above the top rim of my glasses, and if I take them off, this effect is lessened. Another limitation I noticed is that if someone is wearing a hat, the Kinect will often not detect the face at all. I brought the equipment to my parents’ house last weekend and let them try it out. My dad always wears a baseball cap, and the Kinect could not recognize his face until he removed it.

Below you will see some facial expression “archetypes” that I have developed using my own face. You will notice that “sad” is not included in the list. Because of the inaccurate eye and eyebrow placement, I cannot get it to show the upturned “sad eyebrows” that I wanted. In addition, as much as I frown, it looks like the wire mesh is making a kissing face.

Definitely some things to work out but check out the faces I’ve worked on so far. I plan to get multiple people to make faces at the Kinect and see if I can get them into groups that the program can further “learn” with.

“Normal face” – a baseline

Angry face – note the eyebrows

Happy Face – smile!

Surprised face – heightened eyebrows and open mouth

As you can see, eyes are much better placed without glasses…

UPDATE: See the next posts in this series:

# Facial Expression Analysis with Kinect: Thesis Update #1

UPDATE: some things in this project have changed since writing this post, please see the series of posts I wrote on this below:

It’s my final semester of undergraduate work in computer science, and as such I will be completing an undergraduate honors thesis on a topic of my choosing. I plan to post updates here on my progress, both for my own benefit and in the hopes that someone else may find my work interesting. The project I am working on extends a project started by a former graduate student at my university. Using a Kinect sensor and a Bioloid humanoid robot, he was able to get the Kinect to track the movements of a person standing in front of the robot, send that information to the robot, and have it imitate those movements. If you are interested in what this looks like, you can find a video here.

What I plan to do with my project is use the Kinect Face Tracking API to analyze facial expressions using the Kinect sensor. The face tracking API tracks a number of points on the face as shown here: (image from msdn)

We can track the angles and positions of each of these points on the face and use them to determine whether the subject has a “happy” face or a “sad” face, among other different expressions. What I plan to do afterwards is to send this data to the Bioloid robot. The robot will then be able to react in a manner appropriate to the expression he sees. For example, if the robot detects a happy face, he may make a clapping motion, while if he sees a sad face, he will react in a different manner. Applications of emotion recognition are springing up across the board. With this project I hope to both learn some new things and have a fun, interactive project at the end.