
Detecting Deception


Automated systems that use machine learning can perceive lying better than a polygraph.

[Image: Scarlett Johansson and Jimmy Fallon playing Box of Lies. Credit: Douglas Gorenstein/NBC]

People are not good at detecting when someone is lying. Studies have shown that our ability to perceive deception is barely better than chance. Wasiq Khan, a senior lecturer in artificial intelligence and data sciences at Liverpool John Moores University in the U.K., thinks that is partly because it requires identifying complex clues in speech, facial movements, and gestures, attributes that he says "cannot be observed by humans easily."

Automated systems that use machine learning may be able to do better. Khan and his colleagues developed such a system while working on a European Union project aimed at exploring new technologies to improve border control. They examined whether deception could be detected automatically from eye and facial cues, such as blink rate and gaze aversion. "I wanted to investigate whether face movements or eye movements are important," says Khan.

The team recorded videos of 100 participants to use as their dataset. The volunteers were filmed while role-playing a scenario that might occur at a nation's port of entry, in which an avatar, controlled by the researchers from another room, asked them what they had packed in their suitcase. Half of the participants were asked to lie, and the other half were told to be truthful.

The videos were then analyzed using an automated system called Silent Talker. It examined each video frame and used an algorithm to extract information about 36 face and eye movements from the interviewees. Results were noted in binary format, where a 1 could be assigned when the person's eyes were closed, for example, and a 0 if they were open. The team then tried to determine which facial and eye features were correlated with deception by using various clustering algorithms. "The video analysis is complex," says Khan.
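
As a rough illustration of this kind of pipeline (not the team's actual code; the internals of Silent Talker are not public), the following sketch encodes each interview as a vector of 36 per-frame binary cue frequencies and uses k-means clustering to see which features separate the resulting groups. The data, shapes, and library choices here are all assumptions.

```python
# Hypothetical sketch only; Silent Talker's real pipeline is not public.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# 100 interviews x 36 binary face/eye cues, averaged over frames, so each
# value is the fraction of frames in which a cue was active (e.g., a 1 per
# frame when the eyes were closed, a 0 when open).
X = rng.random((100, 36))
y = np.repeat([0, 1], 50)  # 0 = truthful, 1 = deceptive (role-played labels)

# Cluster the interviews into two groups and check how well cluster
# membership lines up with the deception labels.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
agreement = max((clusters == y).mean(), (clusters != y).mean())
print(f"cluster/label agreement: {agreement:.2f}")

# Rank cues by how strongly their mean differs between the two clusters,
# a crude stand-in for asking which features correlate with deception.
diff = np.abs(X[clusters == 0].mean(axis=0) - X[clusters == 1].mean(axis=0))
print("most separating cue indices:", np.argsort(diff)[::-1][:5])
```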

The algorithms identified features that seemed to be most important for detecting deception, which all involved tiny eye movements. The team then trained three machine learning algorithms using both the more significant features and the total set of attributes. Eighty percent of the dataset was used for training, including 40 truthful and 40 deceitful interviews, while the remaining 20% was held for testing.

Khan and his colleagues found the machine learning methods were all able to predict deception quite well from the identified features. Overall accuracy ranged from 72% to 78% depending on the method; the greatest accuracy was obtained by focusing solely on eye movements. "We identified that eye features are important and contain significant clues for deception," says Khan.
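
A minimal sketch of this 80/20 train/test protocol follows, using synthetic data and scikit-learn. The article does not name the three algorithms, so the classifiers below are stand-ins, and accuracies printed on random data will not match the reported 72% to 78%.

```python
# Sketch of the 80/20 protocol described above; the classifiers are
# stand-ins, since the article does not name the three algorithms used.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.random((100, 36))   # per-interview feature vectors (synthetic)
y = np.repeat([0, 1], 50)   # 50 truthful, 50 deceptive interviews

# 80 interviews (40 truthful, 40 deceptive) for training, 20 for testing.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(random_state=0),
              MLPClassifier(max_iter=2000, random_state=0)):
    acc = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{type(model).__name__}: test accuracy = {acc:.2f}")
```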

However, Khan thinks the study could be improved by using a better dataset. As with most studies of this nature, the deceptive and truthful videos they used were not captured in real-world situations. Since people were told to be deceptive or honest, they may therefore not be exhibiting authentic behaviors. "If you ask me to lie and I know that there are no consequences, I may not reveal those facial, visual, or verbal clues," says Khan. "This is one of the major limitations."

Khan thinks capturing data from realistic environments in which people are lying could create more robust machine learning models. Videos could be filmed during court cases or at actual border control locations when people are being questioned, for example.

Including other types of cues also could improve deception detection models. Khan now is using a different approach by focusing on the pupil of the eye, since he thinks that could provide more information about the whole gaze, and allow tiny eye movements to be detected more accurately. He also plans to incorporate head and face movements, as well as clues from speech using a deep learning approach, which he thinks may be able to extract novel indicators of deception that are potentially more reliable. "I have a longer-term goal where I want to create a hybrid model," says Khan. "I want to collect a dataset in my own way as well, so that it is more realistic."

Another team of researchers has developed a machine learning model that detects deception using a different combination of cues. Julia Hirschberg, a professor of computer science at Columbia University in New York City, and her colleagues focused on facial expressions and different features in speech in recent work. While humans typically annotate the videos used to train such systems manually, picking out relevant features, Hirschberg and her team tested whether an automated system could be used instead. "[Annotating] is very time-consuming," says Hirschberg. "It's always useful to see whether automatic methods work as well."

Hirschberg and her colleagues used video clips from The Tonight Show Starring Jimmy Fallon as their dataset, since they were given access to them. The clips featured conversations with 28 guests as they played a game called Box of Lies, in which a player either gives a truthful description of what is in a box hidden from the other player's view, or describes a fictional object; the other player then has to guess whether the description is true. "We thought it was a pretty cool corpus," says Hirschberg.

The team used two different computer vision approaches to extract facial features from the videos. They also had access to existing human annotations of the videos, in which seven different categories of facial expressions had been labeled for the clips, and decided to use those as well, to compare against the performance of the automated methods.

To pick out features in speech, the audio portion of the videos was first transcribed by a speech recognition system. After that, the researchers used various automated methods to extract different features, such as how frequently certain words were used.
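
As one hedged illustration of this lexical step, the snippet below computes word-frequency features from toy transcripts with scikit-learn's CountVectorizer; the actual transcripts, the speech recognizer, and the full feature set the team used are not specified in the article.

```python
# Toy transcripts standing in for speech-recognizer output; the team's
# actual lexical features are not detailed in the article.
from sklearn.feature_extraction.text import CountVectorizer

transcripts = [
    "there is a rubber duck wearing a tiny hat in the box",
    "um I think it is some kind of mug with a picture on it",
]

vectorizer = CountVectorizer(lowercase=True)
counts = vectorizer.fit_transform(transcripts)  # rows: clips, cols: words

print(vectorizer.get_feature_names_out()[:8])   # first few vocabulary words
print(counts.toarray())                         # per-clip word counts
```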

Hirschberg and her colleagues then conducted experiments using a machine learning algorithm called Random Forest to see how well it could detect deception. The model was trained on each individual set of features, as well as on a dataset that combined attributes from both speech and facial features. They found that Random Forest was slightly better at identifying deceptive behavior when trained on automatically extracted facial features than on those that were manually annotated. However, the model performed best, with an accuracy of 73%, when it was trained on features from both modalities.
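
The modality comparison could be sketched as follows, again with synthetic stand-ins for the facial and speech feature matrices; the 73% figure comes from the study itself, not from this toy run.

```python
# Synthetic sketch of the modality comparison; all data here is random,
# so the printed accuracies will hover near chance rather than the
# reported 73% for the combined features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n = 120                       # rounds of Box of Lies (count is illustrative)
face = rng.random((n, 20))    # automatically extracted facial features
speech = rng.random((n, 50))  # transcript-derived speech features
y = rng.integers(0, 2, n)     # 1 = lie, 0 = truthful description

for name, X in (("face only", face),
                ("speech only", speech),
                ("face + speech", np.hstack([face, speech]))):
    scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.2f}")
```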

Hirschberg thinks using both facial and speech features can improve deception detection, since some people may give clues in one modality but not the other. "I'm sure you know such people: their face doesn't move, but you can tell from their voice," says Hirschberg, "so it's very useful to train on multi-modal features."

Automated systems that can detect deception could have many potential applications. They could help the police with law enforcement, for example, while insurance companies are also interested in such systems to help detect when people are calling them with a false claim. "Mostly we've worked for the government, but lots of people have contacted us," says Hirschberg.

She thinks machine learning is a promising approach, as long as there is good test data available for each specific situation. At the same time, she says, different machine learning approaches should be tested to find the one best suited to the type of data available, such as audio from a phone call with background noise versus video in a controlled setting.

"Machine learning is very flexible and broad," says Hirschberg. "It's an automatic method for taking data and ending up with some good results."

 

Sandrine Ceurstemont is a freelance science writer based in London, U.K.


 
