Animation Units for Facial Expression Tracking – Thesis Update #3

See the previous posts in this series:

  1. Facial Expression Analysis With Microsoft Kinect: Thesis Update #1
  2. Some Faces: Thesis Update #2

For the past few months, I’ve been hard at work on my thesis concerning facial expression analysis with Microsoft Kinect. Unfortunately my blog has suffered quite a bit in the wake of everything I’ve had going on, but I’m trying to post a few new things as the school semester winds down.

My ongoing project concerning Facial Expression Analysis with Kinect is making progress, and as the semester winds down, I am preparing a final product to present. As I described in my last thesis update, I was able to test a sample application which overlays a 3D mesh on the user’s face and tracks different points on the face. In particular, a subset of these points are called Animation Units, or AUs, and they are essential to the expression recognition algorithm. The points are defined based on the Candide3 model, and the different values are delineated on Microsoft’s API page for the Kinect Face Tracking SDK. There are six animation units, each taking a value between -1 and +1, that describe the movement and placement of the basic facial features such as the eyes, eyebrows, and mouth. The table of AU values is reproduced below (from Microsoft’s Kinect API page):

AU Name and Value            AU Value Interpretation
AU0 – Upper Lip Raiser       0 = neutral, covering teeth; 1 = showing teeth fully; -1 = lip pushed down as far as possible
AU1 – Jaw Lowerer            0 = closed; 1 = fully open; -1 = closed, same as 0
AU2 – Lip Stretcher          0 = neutral; 1 = fully stretched (joker’s smile); -0.5 = rounded (pout); -1 = fully rounded (kissing mouth)
AU3 – Brow Lowerer           0 = neutral; 1 = fully lowered (to the limit of the eyes); -1 = raised almost all the way
AU4 – Lip Corner Depressor   0 = neutral; 1 = very sad frown; -1 = very happy smile
AU5 – Outer Brow Raiser      0 = neutral; 1 = raised as in an expression of deep surprise; -1 = fully lowered as in a very sad face

Considering these values, I was able to determine a set of bounds for what each expression would look like in my application. However, I learned quickly that the extremal values of -1 and 1 are just that: extremal. Even sitting in front of the camera making the most exaggerated faces I could, it was very difficult to get above 0.5 or 0.6 in some areas. In addition, the eyebrow data was almost always inaccurate in my testing because I wear glasses, and this confused the camera. The Kinect saw the top of my glasses as my eyebrows and therefore registered very little movement at all. When I took my glasses off, placement and tracking returned to normal, but then, of course, I couldn’t see the data being processed.
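To make the idea of bounds concrete, here is a minimal sketch of the sort of rule-based classification this implies. The thresholds and expression names are purely illustrative (not the actual values from my application), and are deliberately loose since, as noted above, real faces rarely push the AUs past about 0.5:

```python
# Hypothetical AU bounds for a few expressions (illustrative only).
# Kinect AU values range from -1 to 1, but exaggerated faces rarely
# exceed ~0.5-0.6 in practice, so the thresholds are kept modest.

def classify_expression(aus):
    """aus: dict with keys 'AU0'..'AU5', values in [-1, 1]."""
    if aus['AU4'] <= -0.3 and aus['AU2'] >= 0.2:
        return 'happy'        # lip corners up, lips stretched
    if aus['AU4'] >= 0.3 and aus['AU3'] >= 0.2:
        return 'sad'          # lip corners down, brows lowered
    if aus['AU1'] >= 0.4 and aus['AU5'] >= 0.3:
        return 'surprised'    # jaw open, outer brows raised
    if aus['AU3'] >= 0.4 and aus['AU2'] <= -0.2:
        return 'angry'        # brows lowered, lips rounded/tight
    return 'neutral'

print(classify_expression({'AU0': 0, 'AU1': 0.5, 'AU2': 0,
                           'AU3': 0, 'AU4': 0, 'AU5': 0.4}))
# prints "surprised"
```

The ordering of the checks matters: whichever rule matches first wins, so ambiguous faces fall through to whichever expression is tested earliest.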

Another unforeseen problem I ran into is that the camera is perhaps a bit too sensitive. Since it tracks many frames every second, even if you sit as still as possible there will be some variation in the tracked points, and if the camera loses track of the face entirely, the points can really go haywire. Trying to keep the points within specific bounds is therefore more difficult than expected: if the camera loses the face, the application can say you went from surprised to sad to angry in less than a second. It probably won’t get done in this version of the project, but in the future it would be useful to store a buffer of points, perhaps a second’s worth or so, to stabilize the constantly changing data and provide more time for processing before changing frames completely.
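The buffering idea above could be sketched as a simple moving average over the last N frames. This is only a rough illustration of the concept (the class and window size are my own invention, not part of the Kinect SDK):

```python
from collections import deque

class AUSmoother:
    """Moving average over the last N frames of AU values,
    to damp frame-to-frame jitter in the tracked points."""

    def __init__(self, window=30):  # roughly one second at 30 fps
        self.window = window
        self.buffers = {}           # one buffer per AU name

    def update(self, aus):
        """Push a new frame of AU values; return the smoothed values."""
        smoothed = {}
        for name, value in aus.items():
            buf = self.buffers.setdefault(name, deque(maxlen=self.window))
            buf.append(value)
            smoothed[name] = sum(buf) / len(buf)
        return smoothed

smoother = AUSmoother(window=3)
smoother.update({'AU1': 0.0})
smoother.update({'AU1': 0.75})        # a sudden one-frame spike...
print(smoother.update({'AU1': 0.0}))  # ...is damped to {'AU1': 0.25}
```

A lost-tracking spike still pulls the average, but it can no longer flip the classified expression on a single frame; a longer window trades responsiveness for stability.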

Overall I’ve certainly learned a lot throughout the course of this project and I would like to continue to work on it more throughout the summer. I will post a few more thoughts on the process and results as I carry on and wrap things up. I’ll have a copy of my thesis online at some point too.

In other news, I’m going to be starting the Master’s program in Computer Science at Appalachian State University in the fall, and I’m pretty excited about it. I plan to do research on algorithms and graph theory.

Stay tuned…

UPDATE: See the next post in this series here: Kinect Face Tracking — Results: Thesis Update #4


5 comments on “Animation Units for Facial Expression Tracking – Thesis Update #3”

  1. Shayaan Khan says:


    Thanks for an in-depth overview, it helped a lot 🙂

    I am working on a similar project, but I am trying to morph different blendshapes (key face expressions) in Maya (see images linked below: ).

    Currently, I am using MS visual studio to get Action Units from Kinect Face Tracking SDK to Autodesk Maya through a UDP protocol to animate a 3D deformable face model.

    I tried my best but need some code help, I am not good at coding.
    Here is my C# code so far:

    I hope it would help you 😛

    It would be great if you could share your code, or the method for how you differentiate AU values to trigger different facial expressions.

    • Hello Shayaan,

      Here is a graphic depicting some very rough estimates of AU bounds for facial expressions. It could do with being more refined, but basically just trying to think about what a facial expression looks like and what sort of AUs we need to test.

      Some of these are more accurate than others.

      You can find the code I used on github:

      I hope this helps you. Thanks for stopping by!

  2. […] some loose ends and talk about some things I didn’t have a chance to talk about before. In my last post, I discussed the significance of animation units to my facial tracking algorithm. Now the way I use […]

  3. […] Animation Units for Facial Expression Tracking : Thesis Update #3 […]

  4. […] Animation Units For Facial Expression Analysis : Thesis Update #3 […]
