Recognition of facial expressions using Artificial Neural Network trained with synthetic data

  • PUBLISHED ON: 8 July 2021
  • READ TIME: 4 mins
  • TECHNOLOGIES: C#, Unity, Python, PyTorch, OpenCV, and Redis

About the Project

The goal of the project was to address a common problem with artificial neural networks: the lack of training data. For example, training a network that detects emotions requires a set of roughly 20,000 to 30,000 images. This project generates a set of synthetic images and aims to show that a network trained on them can reach high accuracy on a set of real ones. The project consists of two modules: one generates the images, and the other processes them and trains and tests the artificial neural network.

The image generation module

Generating images of emotions requires an environment that can simulate a human face and the movement of its facial muscles. The Unity game engine with The Heretic: Digital Human asset was chosen for this project.

Digital Human

Gawain is the main character from The Heretic: Digital Human and was modeled on the British actor Jake Fairbrother. The asset was chosen mainly because it is free of charge, and also because it already includes a facial animation system.

Emotions used

This project is limited to three emotions: anger, happiness, and surprise. The set of images contains mixtures of emotions (e.g. very angry and a little surprised, or happy and a little angry) to make the images as varied as possible. The emotions were implemented following the rules in "Emotions Revealed. Understanding Faces and Feelings" by Paul Ekman.

How does it work?

Each type of emotion is represented by a class that applies pulling forces to certain facial muscles. Because these forces add up, several emotions can be combined on the same face. Applying an emotion requires an intensity value.
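The idea of emotions as classes that pull on muscles can be sketched in a few lines. The project's generator is written in C# for Unity; the Python below is only an illustration, and the muscle names, force values, and the `Emotion` and `blend` names are hypothetical, not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class Emotion:
    """An emotion described as pulling forces on named facial muscles."""
    name: str
    muscle_forces: dict  # muscle name -> force at 100% intensity

    def forces_at(self, intensity):
        """Scale every muscle force by an intensity in [0, 1]."""
        return {m: f * intensity for m, f in self.muscle_forces.items()}


def blend(emotions_with_intensity):
    """Sum the scaled forces of several emotions, muscle by muscle."""
    total = {}
    for emotion, intensity in emotions_with_intensity:
        for muscle, force in emotion.forces_at(intensity).items():
            total[muscle] = total.get(muscle, 0.0) + force
    return total


# Hypothetical muscle names and force values, for illustration only.
ANGER = Emotion("anger", {"brow_lowerer": 1.0, "lip_tightener": 0.8})
SURPRISE = Emotion("surprise", {"brow_raiser": 1.0, "jaw_drop": 0.9})

# "Very angry and a little surprised": 80% anger plus 20% surprise.
face = blend([(ANGER, 0.8), (SURPRISE, 0.2)])
```

Summing per muscle means that two emotions pulling on the same muscle reinforce each other, while emotions that use different muscles simply coexist on the face.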

One emotion is drawn from the pool and treated as the main emotion; it is assigned a random intensity between 70% and 95%. The remaining emotions are treated as secondary, and each is assigned a random intensity so that all intensities sum to 100%.
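A minimal Python sketch of this sampling step follows. How the remainder is split between the secondary emotions is not described in the article, so the random-proportions split used here is an assumption, and the function name `sample_distribution` is hypothetical.

```python
import random

EMOTIONS = ("anger", "happiness", "surprise")


def sample_distribution(rng=random):
    """Return a dict mapping each emotion to an intensity; values sum to 1.0."""
    main = rng.choice(EMOTIONS)
    main_intensity = rng.uniform(0.70, 0.95)
    secondary = [e for e in EMOTIONS if e != main]

    # Assumption: split the remainder with random proportions.
    # 1.0 - rng.random() lies in (0, 1], so the weights never sum to zero.
    weights = [1.0 - rng.random() for _ in secondary]
    total_weight = sum(weights)
    remainder = 1.0 - main_intensity

    distribution = {main: main_intensity}
    for emotion, weight in zip(secondary, weights):
        distribution[emotion] = remainder * weight / total_weight
    return distribution


# Example: draw one label distribution with a fixed seed for repeatability.
example = sample_distribution(random.Random(42))
```

Because the main emotion always gets at least 70%, it is also always the largest value in the distribution, which makes it easy to recover the dominant label later.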

Once the emotions have been applied, the camera and the light are moved randomly to diversify the images, and finally the digital character is photographed. The image and its distribution of emotions are then sent to the neural network module.

The source code can be viewed here, and below there are some examples of generated images.

The artificial neural network module

The module is written in Python, using the PyTorch library to implement the artificial neural network and the OpenCV library to extract the face precisely. There is not much to say about this module without technical terms or programming knowledge. The basic idea is that it processes the images and then trains and tests the neural network.

In this project, the network was trained on 90,000 synthetic images and tested on 27,000 synthetic images. The accuracy on these images is 99.9%, which was expected.
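The article does not say which loss function was used for training. Since every generated image is labelled with a distribution of emotions rather than a single class, one plausible choice is cross-entropy against soft targets. The sketch below shows that idea in plain Python for readability; it is an assumption, not the project's PyTorch code, and all numbers are made up.

```python
import math


def softmax(logits):
    """Convert raw network outputs into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(logits, target):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


# Order: anger, happiness, surprise. Values are made up for illustration.
target = [0.80, 0.05, 0.15]
good_logits = [2.0, -1.0, 0.3]   # network favours anger, as the label does
bad_logits = [-1.0, 2.0, 0.3]    # network favours happiness instead

good_loss = soft_cross_entropy(good_logits, target)
bad_loss = soft_cross_entropy(bad_logits, target)
```

The loss is lower when the predicted probabilities follow the labelled mixture, so a network trained this way is pushed toward reproducing the whole distribution and not only the main emotion.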

The artificial neural network was then tested on 60 real images: 20 showing a face expressing anger, 20 expressing happiness, and 20 expressing surprise. The accuracy obtained is 95.9%, which is a great result, although the number of images is arguably too small.
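For anyone who wants to repeat the test on their own images, here is a hedged sketch of one way to score predictions. The article does not state how the reported accuracy was computed; this version assumes the predicted emotion is simply the one with the highest output value, and the names `predicted_label` and `accuracy_report` are hypothetical.

```python
from collections import Counter

EMOTIONS = ("anger", "happiness", "surprise")


def predicted_label(distribution):
    """Pick the emotion with the highest predicted intensity."""
    return max(EMOTIONS, key=lambda e: distribution[e])


def accuracy_report(samples):
    """samples: iterable of (true_label, predicted_distribution) pairs."""
    totals, correct = Counter(), Counter()
    for true_label, distribution in samples:
        totals[true_label] += 1
        if predicted_label(distribution) == true_label:
            correct[true_label] += 1
    per_class = {e: correct[e] / totals[e] for e in totals}
    overall = sum(correct.values()) / sum(totals.values())
    return overall, per_class


# Toy predictions, made up for illustration only.
toy_samples = [
    ("anger", {"anger": 0.7, "happiness": 0.1, "surprise": 0.2}),
    ("anger", {"anger": 0.3, "happiness": 0.1, "surprise": 0.6}),
    ("happiness", {"anger": 0.1, "happiness": 0.8, "surprise": 0.1}),
    ("surprise", {"anger": 0.2, "happiness": 0.1, "surprise": 0.7}),
]
overall, per_class = accuracy_report(toy_samples)
```

Reporting accuracy per emotion as well as overall helps on a small test set, because a single misclassified image shifts the result noticeably.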

The source code can be viewed here.

The conclusion

The number of real test images was indeed small, but the project was built with limited resources and, unfortunately, Jake Fairbrother did not respond to my request for some images of him portraying a few emotions. I also contacted a few universities to ask for access to their databases, but received no response.

My personal opinion is that an artificial neural network can be trained on a set of synthetic data and still work well on real data. Anyone with access to a database of images showing the emotions used in this project can check its accuracy. Of course, it should be kept in mind that only one pattern was used and that the diversity of the images was limited.
