A Genetic Algorithm for Computing Ramsey Numbers: Update

Friends_strangers_graph

All the 78 possible friends-strangers graphs with 6 nodes. For each graph the red/blue nodes shows a sample triplet of mutual friends/strangers.

In my last post on this topic, I discussed how I was working on a genetic algorithm to search mathematical graphs for elusive properties called Ramsey Numbers. (For a refresher on genetic algorithms,  visit here, and for a refresher on Ramsey Numbers, visit here). I’ve been doing some work on it since then (check out the code here), and I thought I would describe some improvements and further progress I’ve made in this area.

New features:

  • colorings dumped to a file at the end of each run
  • ability to load in data sets from file, further refining of the data than starting from scratch each time

The next problem I ran up against while working through this was that even if I am able to load in previously analyzed data, I still only have one fitness function that checks a static set of edges. As I see it, there are two ways to solve this:

  • Make the current fitness function dynamic; that is, it tests a different set of edges every time. However, this is counterproductive to the purpose of the program “eliminating” certain sets of edges in each “round”. However, this would be easier to maintain than the other option, which is
  • Make a “FitnessHandler” method that takes in a value for which method to run, and uses that to determine what set of edges to test. However, this would lead to a lot of extra code and overhead. I’m thinking having a static variable at the beginning of each run with what “fitness method” to start on, so that it doesn’t have to start on round one each time.

I haven’t fully decided which of these I will go with. I feel like the second one fills my purpose of methodically “weeding out” the improbable graphs, but its going to be a lot of extra work. Oh well, nothing worthwhile ever came easy…

Leave a note here or on my github if you have suggestions!

Introduction to P vs. NP

800px-Jigsaw_puzzle_01_by_Scouten

Its a Millennium Problem (reward $1,000,000) and the subject of a movie. But what is this mysterious problem, and what makes it so hard?

Creation vs. Verification

Pop quiz: Choose which one of these tasks takes longer.

  • A) someone hands you a jigsaw puzzle and asks you if it is finished or not
  • B) someone hands you a jigsaw puzzle and asks you to put it together

Answer: BWhile you can determine if a puzzle is finished or not in a split second (it is easy to tell if pieces are missing), it is not so easy to put all those pieces together yourself. We have problems like this in computer science and mathematics as well. Think about this: what if there was a method that could solve a jigsaw puzzle as fast as you could check if its right? Wouldn’t that be amazing? That’s what this problem seeks to find out: if such a method exists.

The Class P

There is a class of problems that can be solved in “polynomial time”, and we call this P. Polynomial time means that as the complexity of the problem increases, the time it takes to solve increases at a rate no greater than a polynomial would. Informally, we can say that for P problems, we can solve the problem “quickly”. (Polynomial time).

Here’s an example: Say we have a list of numbers in front of us, and we want to pick out the number that is the greatest. At the very worst, we could start at the beginning of the list and compare every number until we got to the end of the list. Yes, the list may be very very long. But the time it takes to check every number increases in a linear fashion. More numbers does not make it exponentially more difficult. If we compare the growth rates of linear time O(n) (graph A) and polynomial time O(n^2) (graph B) we can clearly see that solving this problem is quicker than polynomial time, thus it is in P.

yeqx2

Graph A

Graph B

Graph B

The Class NP

cookies

“Mom, she got more than me!”

The class of NP problems, put simply, can be verified in polynomial time. NP stands for “nondeterministic polynomial time” and harkens back to a computational construct known as a Turing Machine. Consider this example: a woman is dividing up cookies of different sizes into two groups for her two young children. Naturally, they are very picky about things being fair, so she needs to make sure that each child has exactly the same amount of cookies (by weight). While we can show that this problem can get quite hard to sit down and solve, if someone showed us two piles of cookies we could quickly add up the weights of the two piles and tell if they were the same or not. 

If you have only 2 cookies to separate out, sure, its easy. Put one cookie in each pile and that’s the best you can do. But what if you had 5 cookies? 10? Keep in mind that each cookie weighs a slightly different amount and we want the two groups of cookies to be equal in weight as much as possible. Adding more cookies to work with increases our options for separating them exponentially. This is not good. If she has 5 cookies, the number of different ways to separate them is 2^5 = 32, but if she doubles the amount of cookies to 10, the possibilities skyrocket because 2^{10} = 1024.

Let’s consider, for fun, the number of possibilities if she had 100 cookies! 2^{100}=1,267,650,600,228,229,401,496,703,205,376. This…is pretty large. But wait! Computers can do that in a heartbeat, right? They’re way faster than we are! This is all fine and good, until you realize that the number of seconds in the age of the entire universe is (only) 450,000,000,000,000,000. So even if our computer could go through thousands or millions of computations per second, it would still take longer than the entire age of the universe. This is what makes these sorts of problems so hard.

P = NP?

So now that you know a little about what P and NP are, you may be wondering what all this business about “P=NP” is. What does it mean, after all? If someone were to prove that P equals NP, this would mean that every problem in the class NP (the hard ones) can be solved in a polynomial time, just like the P problems. This would have huge implications for all sorts of applications, not just in the mathematical world. For instance, a cornerstone of many security systems rests on the difficulty of factoring very large numbers. If P=NP, it would mean that there exist very easy methods for factoring these numbers, and as such, financial systems everywhere would be vulnerable.

This can also be applied to many logistical and optimization problems (see my posts on intractable problems here and here). For many of the NP problems, finding an “easy” method to solve one can be applied to any of the problems in that class. We don’t need to show an easy method to solve every problem in the world. We only need one. That shouldn’t be too hard, right? Unfortunately, researchers have been working on this problem for decades without success. There have been many attempted “proofs” but they all ultimately have fallen short. The problem is so difficult that the Clay Mathematics Institute is offering $1,000,000 for a correct proof of this problem one way or another.

Although an official answer to this question has not been decided one way or another, many professionals believe that P \neq NP. This means that problems in NP will always be inherently “harder” than problems in P, and at the worst case require an exhaustive (brute force) search to find a solution. No matter how good our computers get, these problems will still be very difficult for us to solve, especially at large cases.

I hope you enjoyed this little introduction to P and NP and if you have any comments, questions or corrections, please leave them in the comments below!

Twitter FriendCloud

Today I’m going to talk about a personal project I’ve been working on recently. I was trying to come up with some way to make a cool project with natural language processing and I had also noticed that with the rise of social networks, there is a treasure trove of data out there waiting to be analyzed. I’m a fairly active user of Twitter, and its a fun way to get short snippets of info from people or topics you’re interested in. I know personally, I have found a lot of math, science, and tech people that have twitter accounts and post about their work or the latest news in the field. I find just on a short inspection that the people I follow tend to fall into certain “groups”:

  • math
  • computer science
  • general science
  • academia
  • authors
  • tech bloggers
  • feminism
  • various political and social activist figures (civil rights, digital privacy, etc)

This list gives a pretty good insight into the things I am most interested in and like hearing about on my twitter timeline. Now I thought to myself, what if I could automate this process and analyze any user’s timeline to find out what “groups” their friends fell into as well? I had my project.

Currently in the beginning stages, I decided to tentatively call my project “FriendCloud” and registered with the API to start messing around. I’m using Python-Twitter to interact with the twitter API, and its helping me to get practice with Python at the same time. The first thing I wanted to do was be able to pull down a list of all the people that I follow. Since I follow a little over 1000 people, this proved to be a daunting task with the rate limiting that Twitter has built in to their API. At the moment, what I had to do was get my script to pull as many user objects as possible until the rate limit ran out, then put the program to sleep for 15 minutes until the rates refreshed and I could download more.

It took a little over an hour to get the list of all my friends and I am trying to look into a way to do this quicker in the future. After that, I can go through users and pull down a selection of their tweets. After this is done, I have a corpus of text that I can analyze. I have been using NLTK (a Python NLP toolkit) to pick out some of the most common keywords and themes. There is a lot of extraneous data to deal with, but as I pare it down I’ve noticed some interesting trends just in my own tweets.

I hope in the future to be able to extend this to the people I follow on twitter and be able to place them into rough “groups” based on their most commonly tweeted keywords (similar to how a word cloud works). In this way, a user can get an at a glance look at what topics the person is most likely interested in and what sort of people they may be likely to follow in the future.

Intractable Problems — Part Two: Data Storage

This post continues my series on intractable problems. In this installment, I will talk about problems relating to Data Storage. As a refresher, remember that an intractable problem is one that is very computationally complex and very difficult to solve using a computer without some sort of novel thinking. I will discuss two famous problems related to Data Storage below, as well as provide a few examples and references.

Part Two — Data Storage

486px-Knapsack.svg

Knapsack — given a set of items with weights w and values v, and a knapsack with capacity C, maximize the value of the items in the knapsack without going over capacity.

To start with, here is a small example that I refer to in this previous post. If you’re trying to place ornaments on a tree, you want to get the max amount of coverage possible. However, the tree can only hold so much weight before it falls over. What is the best way to pick ornaments and decorations such that you can cover as much of the tree as possible without it falling over?

I actually saw a really interesting video describing an example of this problem in this video (watch the first few minutes). The professor here sets up a situation of Indiana Jones trying to grab treasures from a temple before it collapses. He wants to get the most valuable treasures he can, but he can only carry so much.

Class: There are several different types of knapsack problems, but the most common one (the one discussed above) is one-dimensional knapsack. The decision problem (can we get to a value V without exceeding weight W?) is NP-Complete. However, the optimization problem (what is the most value we can get for the least weight?) is NP-Hard.

References: 

Wikipedia Page – General discussion of the Knapsack problem, different types, complexity, and a high level view of several algorithms for solving

Coursera Course on Discrete Optimization – The source for the above video and a great discussion of not only Knapsack but quite a few of these problems

Knapsack Problem at Rosetta Code – a good example data set and a variety of implementations in different languages

============================

TetrisPieces2

Bin Packing — given a set of items with weights w and a set of n bins with capacity c each, place the items into the bins such that the minimum amount of bins are used.

This problem is very similar to the knapsack problem found above, but this time we don’t care how much the items are worth. We just want to pack them in the smallest area possible. Solving this problem is invaluable for things like shipping and logistics. Obviously, companies want to be able to ship more with less space.

A more commonplace example could be thinking of this problem as Tetris in real life. Consider that you’re moving to a new place,  and you have you and your friends car in which to move things. How can you place all the items in your cars such that you take up space the most efficiently?

Some progress has been made on reasonably large data sets by using what is called the “first fit decreasing algorithm”. This means that you pick up an item, and place it into the first bin that it will fit in. If it can’t fit in any of the current bins, make a new bin for it. Decreasing means that before you start placing items, you sort them all from biggest to smallest. You probably do this in your everyday life. If you want to pack a box, you start with the big items, right? No need to put lots of small items on the bottom. By getting the big items out of the way first, you can be more flexible with the remaining space because you will then have smaller items.

Class: This problem is NP-Hard.

References:

Wikipedia Page – a high level description of the problem

First Fit Decreasing Paper (pdf) – this is a technical paper describing computational bounds for using the first fit decreasing algorithm. Not for beginners.

3D Bin Packing Simulation – looks like a resource for companies to use to pack boxes and such

============================

I hope this provided a little taste of why these problems are so important. If you know any other good resources please let me know.

Intractable Problems — Part One: Set Problems

My professor and advisor Dr. Alice McRae provided a list of intractable problems for us to ponder in our genetic algorithms class, and I thought I would expand on some of them here for reference. All of these problems are intractable, which means that they are very, very difficult to solve precisely with a computer.

Most of the problems on this list are what is known as NP-Complete problems. If the complexity classes P and NP are not equal (as is widely believed by many researchers, but not proven), then NP-Complete problems cannot be solved by a computer in a reasonable time frame. In theory, with an infinite amount of time we could produce answers to these problems but time and computing power is finite, no matter how many technological advances we make.

We have seen some of these difficult problems before in previous posts: Genetic Algorithms for Ramsey Theory and The Travelling Santa Problem, as well as Introduction to Genetic Algorithms all have good examples of these types of problems.

I will be presenting these problems in multiple parts, with my comments and references on each one.

Part One — Set Problems

Maximum 3-Dimensional Matching — given a set S of ordered triples of the form (x,y,z), find the largest possible subset of the triples, such that no two elements in the triple share the same x, y, or z coordinate.

Here is a small example: Consider the set \{(1,4,5),(3,4,9),(6,7,8),(1,2,5)\}. We need to find the largest subset of these we can, and the x,y,z values cannot be repeated. In this example, the set \{(3,4,9),(6,7,8),(1,2,5)\} would be a solution. It contains 3 triples and none of the numbers are repeated. As you can imagine, once the sets gets larger, this problem becomes much more difficult.

In layman’s terms, consider a group of boys, girls, and pets. We want to make happy “triplets” of girls, boys, and pets, but no girl, boy or pet can be in more than one group. What is the best way to match these people and pets up so that we have the largest number of groups?

Class: 3-Dimensional Matching (finding any subset that satisfies the conditions) is NP-Complete. The optimization problem (finding the largest subset) is NP-Hard. [1]

References: 

More NP-Completeness Results (pdf) lecture notes from CMU, good explanation of 3DM as well as some other problems, proof of the class of 3DM.

NP-Complete Problems (pdf, pg 267)

============================

Subset Sum/Subset Product — given a set S of integers and a goal sum/product P, find a subset of S that sums/multiplies as close as possible to the goal of P.

This one doesn’t sound as hard, but at larger quantities, this can become very difficult.

Example: Consider the set S = \{ n |1\leq n \leq 100 \} and a goal sum G = 531. Now we need to find a set of numbers between 1 and 100 that we can add together to get 531.

A practical application of this: Say you’re a kid and your parents gave you $10 of allowance for the week. Naturally, you want to get as much good out of that $10 as you can. What is the best combination of things you can buy to add up to as close to $10 all together?

Class: Subset sum and subset product are NP-Complete. Proof can be found at [2]

References:

Subset Sum NP-Completeness (pdf) Scroll down a bit to see the proof that subset sum is in NP-Complete

Dynamic Programming Subset Sum — A description of a dynamic programming technique for this problem

SubsetSum@Home — A distributed crowdsourced BOINC-type initiative to solve subset sum problems

An Improved Low-Desnity Subset Sum Algorithm — a paper concerning algorithms to solve this problem

============================

Minimum Set Covering — given a set S and a collection C of subsets over S, find the fewest number of subsets of C such that all elements of S are represented.

Okay so lets break this down. We have a set S, lets say for example S=\{1,2,3,4,5\}. Now we have a collection C of subsets over S. For our example, C = \{\{1\},\{2,5\},\{3,4\},\{2,3,4\},\{5\}\}.  In this case, the smallest collection we can make that includes all the elements in S is \{\{1\},\{2,3,4\},\{5\}\} which contains 3 elements.

Practical example: You’re a kid again, and looking at a group of video games that are on sale. They are all in combo packs, however. How can you get all the games you want and spend as little money as possible? In other words, what is the smallest amount of combo packs you need to buy in order to get all of the games you want?

Class: the decision problem (does this set contain all the elements we’re looking for?) is NP-Complete. The optimization problem (is this set the smallest set we can make?) is NP-Hard. See [3].

References:

A Probabilistic Heuristic for a Computationally Difficult Set Covering Problem – (pdf) Journal article on this topic detailing a heuristic for finding set coverings

A Genetic Algorithm for the Set Covering Problem – (pdf) Journal article about a genetic programming approach to set cover

A Greedy Heuristic for the Set-Covering Problem – (pdf) Yet another approach

Set Cover Problem — Wikipedia

An Example: Set Cover – (pdf) Scroll down to see a proof of set cover in NP-Complete

============================

These are only a few intractable set problems, but there are many more variations of these out there. Stay tuned for the next segment in this series, problems about Data Storage (Bin Packing and Knapsack).

Introduction to Genetic Algorithms

In response to my previous article about genetic algorithms for Ramsey theory, a few readers asked me to give a bit more of an introduction about genetic algorithms. Here you will find a beginner’s look at what a genetic algorithm is, what it is useful for, and how you can use one in your own work.

What is a genetic algorithm (GA)?

To begin with, let’s talk about what a genetic algorithm is, on its most basic level. The word genetic suggests something to do with biology. If you can reach way back to your high school science classes you might remember the basic genetic process: plants and animals are born, mate with one another, and create new generations. Over time, these generations tend to emphasize certain traits essential to survival while downplaying the weaker (recessive) traits. A genetic algorithm works much the same way. We come up with a population of possible solutions to a problem, “mate” them together, and look at our new solutions to see if they are any better. Over time, we can create solutions that converge to better and better values. This is useful when a problem is too complex to search all possibilities. Below you will see an image describing how a simple GA works.

GeneticAlgorithms (1)

Crossovers, fitness, mutation, oh my!

Before moving on, it would be useful to define some of the terms used above.

  • chromosome:  a proposed “solution” to the problem at hand. This is usually represented by a bitstring, that is to say, a list of 0’s and 1’s. It can also hold any other information that is crucial to our problem.
  • population: a collection of these “chromosomes” that we use to combine together and make new generations. The population represents the set of all the ideas that we have about this problem at the moment
  • fitness: the fitness of a chromosome is a number representing how good it is. That is, the fitness represents how good this solution is at solving our problem. An example of this is if you were trying to find the shortest route to get somewhere. Fitness for a problem like this would be a number representing the distance it takes for each path you could choose. In the end, you want to find the path with the shortest distance. Ultimately, the fitness function is defined by the programmer, and it can measure whatever you want.
  • mutation: mutation is the random entering of new data into the gene pool. Just like in biology, sometimes mutations occur and create things that were never intended. However, sometimes this inadvertent change can be to our advantage if we’re getting stuck. A common example of a mutation is to just change a small part of the chromosome, and move on.
  • crossover: the “crossover function” represents the operations that we do in order to “mate” two (or sometimes more) of our chromosomes. There are many different types of crossover techniques, some better for certain situations than others, but at the heart of it a crossover just represents a way to combine the “traits” of two or more chromosomes into a new “baby” chromosome to be inserted into our population. This is the bulk of a genetic algorithms’ work. As the population evolves and new generations of solutions are created, the goal is to keep around the solutions with “good” fitness, and get rid of the chromosomes that aren’t doing so well. The crossover is also user-defined, and tweaking it to optimize results is common.
  • termination condition: all good things must come to an end. While we in theory could just leave our algorithm running forever, that would not be very helpful because as I will discuss later in this post, solutions can’t keep getting better forever. In addition, sometimes you won’t actually get to the “best solution” and instead will converge on what’s called a “local minima/maxima”. When this happens, it means you’ve found an “okay” solution, but the population got flooded with many similar chromosomes and couldn’t improve itself after that point. Think of it this way: if everyone in the world were exactly the same, would you expect any different from their children? The termination condition you choose depends on the problem, but some common terminating conditions are:
    • a) finding the best solution (ideal)
    • b) running a preset number of generations and using that as a cutoff point
    • c) quitting after every member of the population falls within a certain similarity range (this means that no new/better solutions are likely to be produced)

But why would I want to do all this?

I know this seems like a pretty complicated process. Why not just use a computer to figure out the real solution instead of dancing around it in this complicated manner? Well, it turns out that’s not always possible…

Enter the Traveling Salesman Problem (TSP). While there are many problems that are still very hard for us to solve with computers, this is one of the best known and most studied. It turns out that using a genetic algorithm is actually a pretty good approach and is much quicker than running an exhaustive search of all paths possible. Remember when I used the example of finding the shortest distance to go somewhere? That’s pretty much what this is. I’ve done a blog post on this before (see here). In the traveling salesman problem, there is a man that needs to visit a list of different cities, but he wants to get there and back as quickly as possible. Therefore, you need to find the shortest route to hit all the cities and return home. This isn’t always as easy as it sounds, and as the number of cities grows, so too does the time it takes to find the right one. Very quickly it becomes implausible to check every possible path, so we use a genetic algorithm to help us weed out the bad ones.

Another example of how we would use a genetic algorithm is for graph theory problems that also have a huge number of possible solutions. You can see how I applied a genetic algorithm to the problem of Ramsey numbers here.

Okay, so how do I use this in my project?

It’s pretty simple. I know it looks complicated, but once you get everything apportioned out correctly it’s not that bad. Things to think about when applying a GA to a hard problem (we will use the Traveling Salesman Problem as an example here):

  • First decide what your “chromosomes” will look like. These are the meat of your population, and these are what will be mutated and crossovered in order to create better solutions. For the TSP, we will use a a sequence of numbers denoting what cities to visit and in what order. (For example, “1 5 4 2 3 1” describes a way you could make a circuit through 5 cities).
  • Decide on how fitness will be evaluated. This is important because members with better fitness will be the ones that stick around in your population and (hopefully) make your solutions better. In our problem, fitness refers to how far our chosen path takes us. The lower the fitness score, the shorter our travel is. We want to minimize this number with our algorithm.
  • Next we need to figure out what our crossover will be. This is very important to consider, because we want something that will take parts of both of our “parents” and combine them together in some way to make a “baby”. For traveling salesman, we can’t just grab some from one parent and some from the other, because we run the risk of duplicates. (we don’t want a path to look like “1 2 2 3 4 1”, city 2 is visited twice and we never get to city 5). Therefore, we have to use more sophisticated methods. I won’t go into them here but if you’re interested check out this wikipedia page for more info on crossover techniques.

What are the downsides?

“Every rose has its thorn,” as they say, and genetic algorithms are no different. What seems like a cure-all has its hindrances as well. As I mentioned before, GAs have a problem of converging too early on a value that’s not quite ideal. Often this takes many repetitions of running the program and fine tuning things such as population size and mutation rate. Sometimes happening upon the “best” solution is a product of randomness. But in a GA, we use this randomness to our advantage as much as we can. Another inherent problem is the fact that we have to program this framework around it. Using a GA is only viable if its an extremely complex problem that cannot be solved more efficiently. For example, using a GA to solve a 5 city TSP would be foolish. We can run through all those possibilities very quickly. On the other hand, bumping that up to 50 or 500 cities proves a much harder challenge.

In addition, using a genetic algorithm means you have to find a type of chromosome, fitness function, population size, and crossover that works for you. Pick the wrong values for these, and the program will behave less than optimally. Experimentation and continually tweaking the parameters of your model is necessary. In some genetic algorithms, the fitness function for even one chromosome can take quite a while to compute. This makes some genetic algorithms very slow to apply.

Where can I learn more?

I hope this has covered the basics of genetic algorithms and interested you in learning more. If you would like to see a project I have done involving GAs you can read A Genetic Algorithm Approach to Ramsey Theory, and for a broader range of discussion about the theory and applications of GAs check out the book An Introduction To Genetic Algorithms by Melanie Mitchell.

Hope you enjoyed this introduction to the wide world of genetic programming; if you have questions or suggestions please leave them in the comments below!

Kinect Face Tracking — Results: Thesis Update #4

thesis pic

 

For background, see my three previous posts in this series:

  1. Facial Expression Analysis With Microsoft Kinect : Thesis Update #1
  2. Some Faces : Thesis Update #2
  3. Animation Units for Facial Expression Tracking : Thesis Update #3

My thesis has been successfully completed and defended now, and I am currently on break for the summer. I thought I would make a post to wrap up some loose ends and talk about some things I didn’t have a chance to talk about before. In my last post, I discussed the significance of animation units to my facial tracking algorithm. Now the way I use these units is pretty simple, but can lead to some complex classifications. The first thing to do is to consider what the desired face would look like. Picture it in the mind’s eye. Say for example we wanted to recognized a surprised face. How would this look? Chances are, when thinking of a surprised face, the mind visualizes raised eyebrows, wide open eyes, and a gaping mouth. The next task (and the bulk of my algorithm) asks: How do we translate these facial movements into the points tracked on our facial mesh?

We use animation units as “wrappers” for sets of points on the face plane. Instead of having to track and check multiple different places on the eyebrows or mouth for example, the animation units allow us to track the movements of those features as a whole. Since the animation unit values lie between -1 and 1, we must assign “bounds” for each expression to where if the user’s features fall within that range, we can assume the user is creating that expression. These values at present are determined by extensive testing and seeing what values are frequently displayed when creating a certain expression. It would not be difficult to build a classifier for these expression bounds, and use it to train the program over multiple different faces and expressions in order to get the best and most accurate data for each type of face.

In my application, we track six different types of expressions.

surprised

surprised

sad

sad

kissing face

kissing

happy

smiling

In addition, we look for two different types of angry faces: angry with a closed mouth (glaring at someone) or angry with an open mouth (as in yelling).

angry-closed mouth

angry – closed mouth

angry-open mouth

angry – open mouth

To see a simple flowchart detailing some preliminary bounds for each expression (not exhaustive), check out the chart here (click for larger view).

ThesisLogic (1)

There is a bit of a “lag” in my application on recognizing these expressions, because the live video stream captures many frames each second, and there is a tiny bit of delay in figuring out what expression that frame’s data fits into. As such, the output of my program is a bit inaccurate still. Because it prints off what the expression every frame, there can be a bit of a buildup and after a while it will start showing expressions at a bit of a delay. For example, if the user acts surprised sometimes the program will not actually print “surprised” for a fraction of a second afterwards, because its busy trying to run through the frames as they come in. A simple remedy to this would be to create a “buffer” of tracked points and use the average of the data over a few seconds in order to determine the facial expression. Because the camera is very sensitive, we are prone to having the data change at the slightest movement of the face. Indeed, even trying to sit as still as possible still results in some small changes in the data. Another thing I noticed that creating a buffer of data could help solve is when the camera loses track of the face for only a moment, it begins to spit out garbage data as it attempts to relocate the face.

Overall we can see a good proof of concept of the capabilities of the Kinect Face Tracking API and there is a lot of room for improvement in the future. Possible future additions/enhancements include:

  • tracking a wider range of expressions
  • wider range of accessibility (glasses/hats, children, elderly people)
  • more specific bounds for facial expressions, use neural networks or something
  • more interactive application
  • use facial expression recognition to interface with other environments (i.e., call a command by smiling)