Fujitsu Develops AI Model to Determine Concentration During Tasks Based on Facial Expression

Fujitsu Laboratories Ltd., Kawasaki, Japan, March 01, 2021: Fujitsu Laboratories Ltd. announced the successful development of a new, general purpose AI model that estimates concentration levels, capturing and quantifying with high accuracy how strongly a person is concentrating while performing various tasks. The model accomplishes this by detecting the subtle changes in facial muscle movement that distinguish the expression of a person who is concentrating from that of a person who is not.

Conventionally, models that use AI to quantify concentration have been created by training algorithms to recognize the expressions and behaviors of people performing specific tasks, such as e-learning. Facial expressions and behavior, however, differ depending on the task involved and the cultural background in which each person grew up. As a result, a separate AI model had to be developed for each specific situation, which posed a challenge.

Fujitsu has succeeded in developing an AI solution that identifies common features indicating concentration or non-concentration that are not easily influenced by subjects’ cultural backgrounds. The AI leverages proprietary technology that detects action units (AU) ⁽¹⁾ with the world’s highest accuracy ⁽²⁾. Action units express the "units" of movement corresponding to each muscle or muscle group of the face, based on an anatomical classification system. The technology captures both short-term changes over a few seconds, such as a tense mouth, and long-term changes over tens of seconds, such as staring intently, with time frames optimized for each action unit. To create a machine learning data set, data was collected from a total of 650 people from a variety of regions, including the United States and China in addition to Japan, while they engaged in tasks that require concentration, such as memorization and searching. This data set was used to create a general purpose AI model that can determine levels of concentration without relying on task-specific behaviors. The effectiveness of the model was subsequently verified using this data set, and it was confirmed that subjects’ degree of concentration could be quantitatively estimated with an accuracy rate of over 85%.

Ultimately, this technology offers the possibility of AI support that uses accurate data about concentration and attention to improve the efficiency and productivity of people’s online activities, as more and more aspects of life move online amid the COVID-19 pandemic.

Newly Developed Technology

Fujitsu has developed a general purpose AI model that quantifies concentration levels regardless of the task being performed or the subject's cultural background. It does so by leveraging proprietary technology that detects facial expressions through Action Units with the world's highest accuracy.

The proprietary Action Unit detection technology accurately learns relative changes in the facial expression muscles by training on pairs of images in which the intensity of muscle movement differs. This makes it possible to capture both short-term changes over a few seconds, such as a tense mouth, and long-term changes over several tens of seconds, such as staring intently, in time frames optimized for each action unit. A highly accurate concentration estimation AI model was then developed using a new method of integrated concentration estimation (Fig. 1).
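The idea of giving each action unit its own time frame can be illustrated with a minimal sketch. This is not Fujitsu's implementation: the action unit names, the window lengths, and the use of a simple trailing average are all assumptions made for illustration only.

```python
# Hypothetical window lengths, in frames, for two illustrative action units.
# The names and values are assumptions; the release does not publish them.
WINDOW_FRAMES = {
    "mouth_tension": 3,    # short-term change (a few seconds)
    "intent_stare": 6,     # long-term change (tens of seconds)
}


def trailing_mean(intensities, window):
    """Average each frame with up to `window - 1` preceding frames."""
    smoothed = []
    for i in range(len(intensities)):
        start = max(0, i - window + 1)
        chunk = intensities[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def build_features(au_series):
    """Smooth every action unit series with that unit's own window length."""
    return {
        name: trailing_mean(series, WINDOW_FRAMES[name])
        for name, series in au_series.items()
    }


if __name__ == "__main__":
    # Toy intensity values, invented purely to exercise the functions.
    toy = {
        "mouth_tension": [0.0, 2.0, 4.0, 6.0],
        "intent_stare": [1.0, 1.0, 1.0, 1.0],
    }
    print(build_features(toy))
```

In a sketch like this, a short window reacts quickly to brief events, while a long window responds only to sustained behavior. That is one plausible reason for tuning the time frame per action unit.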

This AI model was trained on a data set built from the results of memorization and searching tasks that require concentration, drawing on a diverse pool of 650 people from Japan, the U.S., and China. The resulting general purpose AI model quantifies a subject’s degree of concentration on a scale from 0.0 (complete lack of concentration) to 1.0 (maximum concentration). It can do so for a variety of tasks, such as determining whether someone is concentrating during e-learning, how deeply someone is immersed in desk work, or how well people engaged in plant assembly work are concentrating.

Fig. 2 Concentration Estimation Overview

Outcome

To verify the versatility of the AI model, Fujitsu constructed a data set covering a total of 650 people from a variety of regions, including Japan, the U.S., and China. Using the newly developed AI model, the concentration levels of participants in each country were estimated for a series of tasks, and the degree of concentration could be estimated with more than 85% accuracy. This result is comparable to or higher than results presented at the latest international academic conferences for quantifying the degree of concentration of students on e-learning tasks. It was confirmed that this method works effectively despite possible variations in cultural background.
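The release does not state how the accuracy figure was computed. As a hedged illustration of one common approach, the sketch below converts a concentration score between 0.0 and 1.0 into a binary label and compares it with a ground-truth label. The 0.5 threshold and the function names are assumptions, and the values in the usage example are invented.

```python
def to_label(score, threshold=0.5):
    """Map a score in [0.0, 1.0] to True (concentrating) or False.

    The threshold of 0.5 is an assumed value, not one given in the release.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0.0, 1.0]")
    return score >= threshold


def accuracy(scores, truths, threshold=0.5):
    """Fraction of samples whose thresholded score matches the true label."""
    if len(scores) != len(truths) or not scores:
        raise ValueError("scores and truths must be non-empty and equal in length")
    hits = sum(to_label(s, threshold) == t for s, t in zip(scores, truths))
    return hits / len(scores)


if __name__ == "__main__":
    # Invented toy values, used only to show the calculation.
    print(accuracy([0.9, 0.2, 0.6, 0.4], [True, False, False, False]))
```

Under this definition, an accuracy above 85% would mean that more than 85 of every 100 samples are assigned the same label as the ground truth.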

In addition, the developed AI model was evaluated using data recorded in a driving simulator, which included both concentration and non-concentration due to drowsiness. The model's output showed a high correlation with the correct data, which was labeled according to a national Japanese research organization’s index for measuring sleepiness ⁽³⁾, and it was confirmed that the decline in concentration due to sleepiness could be estimated. This shows that the AI model can be applied to tasks on which it was not specifically trained.
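The release does not say which correlation measure was used. As a rough illustration, the sketch below computes the Pearson correlation coefficient between model scores and five-level sleepiness ratings. The choice of Pearson correlation is an assumption, and the sample values are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Invented toy values: sleepiness level (1 to 5) and concentration score.
    sleepiness = [1, 2, 3, 4, 5]
    concentration = [0.9, 0.8, 0.6, 0.3, 0.1]
    print(round(pearson(sleepiness, concentration), 3))
```

If concentration falls as sleepiness rises, the coefficient in such a comparison is strongly negative. A "high correlation" in that setting refers to the magnitude of the coefficient rather than its sign.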

Future Plans

In the future, in order to expand the application of this technology to various services such as online classes, online meetings, and sales activities, which are expanding globally amid the “New Normal,” we will further promote the rigorous verification of such technologies from the perspective of AI ethics, with the aim of realizing the practical use of trustworthy AI technologies.

[1] Action Unit
The unit of movement for each part of the face, corresponding to about 40 kinds of facial expression muscles, as defined in the Facial Action Coding System (https://www.paulekman.com/facial-action-coding-system/), which is based on anatomical knowledge. Each Action Unit is rated on five levels of intensity corresponding to the movement of the facial muscles.
[2] with the world's highest accuracy
Action Unit recognition technology that won first place in the competition for action unit detection accuracy held at the IEEE International Conference on Automatic Face & Gesture Recognition (FG 2020).
[3] a national Japanese research organization’s index for measuring sleepiness
The NEDO Sleepiness Index, which measures the sleepiness of a person who is monitored by multiple observers. Sleepiness is rated on five levels.

News Source: https://www.fujitsu.com/