Tech and entertainment companies are betting big on facial recognition technology, and Disney wants to be the cool kid on the block.


The company's research team is using deep learning techniques to track the facial expressions of audiences watching movies in order to assess their emotional reactions.

Called “factorised variational autoencoders” (FVAEs), the new algorithm is so sharp that it is reportedly able to predict how a member of the audience will react to the rest of a film after analysing their facial expressions for just 10 minutes.

A more sophisticated take on the recommendation systems used by online retailers such as Amazon, which suggest new products based on your shopping history, the FVAEs recognise a series of facial expressions from the audience, such as smiles and laughter.

Then they make connections between viewers to see whether a certain movie is getting the desired reactions at the right places and times. Basically, whether you're laughing when you're supposed to laugh during "Inside Out", or yawning instead.


"The FVAEs were able to learn concepts such as smiling and laughing on their own,"  Zhiwei Deng, a Ph.D. student at Simon Fraser University (who served as a lab associate at Disney Research) told Phys.org. "What's more, they were able to show how these facial expressions correlated with humorous scenes."

Disney's research team used a 400-seat theatre equipped with four infrared cameras to film the audience during 150 showings of nine mainstream movies, such as "The Jungle Book", "Big Hero 6", "Star Wars: The Force Awakens" and "Zootopia".

The result was a staggering dataset of 16 million facial landmarks from 3,179 audience members, which was fed to the neural network.

Variational autoencoders like the FVAEs work by automatically translating these data points into a series of numbers representing specific features, such as how much a face is smiling or how wide the eyes are open.

These numbers are connected to other bits of data through metadata, allowing the system to assess how an audience is reacting to a movie.
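
For readers curious what that encoding step looks like in practice, here is a minimal, purely illustrative sketch of a variational autoencoder in PyTorch that maps a flattened facial-landmark vector to a small latent code. The layer sizes, landmark count and class names are assumptions made for the example, not Disney Research's actual FVAE model.

```python
# Illustrative sketch only: a toy variational autoencoder that compresses a
# flattened facial-landmark vector into a small latent code and reconstructs
# it. Sizes and names are assumptions, not the published FVAE architecture.
import torch
import torch.nn as nn

class LandmarkVAE(nn.Module):
    def __init__(self, n_landmarks=68, latent_dim=8):
        super().__init__()
        in_dim = n_landmarks * 2                     # (x, y) per landmark
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.to_mu = nn.Linear(128, latent_dim)      # mean of the latent code
        self.to_logvar = nn.Linear(128, latent_dim)  # log-variance of the latent code
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, in_dim)
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterisation trick: sample the latent code from N(mu, sigma^2).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.decoder(z), mu, logvar

# After training, dimensions of z are meant to capture factors such as how
# much the face is smiling; this toy instance is untrained, so the call below
# is only a shape check on a batch of four fake faces.
vae = LandmarkVAE()
landmarks = torch.randn(4, 68 * 2)
recon, mu, logvar = vae(landmarks)
print(mu.shape)  # torch.Size([4, 8])
```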

With the right training, the system was able to predict the expression an individual face would make at various points in the movie after observing it for just a few minutes.
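
The "factorised" part is what makes that early prediction possible: each viewer and each moment of the film gets its own small factor vector, and a few observed minutes are enough to pin down a new viewer's factor. The sketch below illustrates the idea with plain low-rank factorisation in NumPy on made-up data; it is an analogy for the approach, not the published FVAE algorithm, and every name and number in it is invented.

```python
# Illustrative sketch only: per-moment factors are fit on an audience that
# watched the whole film; a new viewer's personal factor is then estimated
# from a short observed window and combined with those moment factors to
# predict the rest of their reactions. All data here are synthetic.
import numpy as np

rng = np.random.default_rng(0)
n_viewers, n_times, rank = 300, 200, 4

# Synthetic audience reaction scores with a true low-rank structure.
viewer_factors = rng.normal(size=(n_viewers, rank))
moment_factors = rng.normal(size=(n_times, rank))
audience_scores = viewer_factors @ moment_factors.T

# "Training": recover one factor vector per moment of the film via SVD.
_, s, Vt = np.linalg.svd(audience_scores, full_matrices=False)
learned_moments = Vt[:rank].T * s[:rank]          # shape (n_times, rank)

# A new viewer is observed only during the opening window (roughly the
# first 10% of the film, i.e. about 10 minutes of a typical feature).
window = 20
new_scores = rng.normal(size=rank) @ moment_factors.T
coef, *_ = np.linalg.lstsq(learned_moments[:window], new_scores[:window], rcond=None)

# Combine the estimated personal factor with the learned moment factors to
# predict reactions for the remainder of the film.
predicted_rest = coef @ learned_moments[window:].T
print(np.max(np.abs(predicted_rest - new_scores[window:])))  # ~0: matches
```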

“Understanding human behavior is fundamental to developing AI systems that exhibit greater behavioral and social intelligence,” said Yisong Yue of Caltech, which collaborated with Disney in developing the deep-learning software.

“For example, developing AI systems to assist in monitoring and caring for the elderly relies on being able to pick up cues from their body language. After all, people don’t always explicitly say that they are unhappy or have some problem.”

The project was presented at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) in Hawaii.
