Facial Emotion Recognition (FER) is the technology that analyses facial expressions from both static images and videos in order to reveal information on one’s emotional state. The complexity of facial expressions, the potential use of the technology in any context, and the involvement of new technologies such as artificial intelligence raise significant privacy risks.
1. What is Facial Emotion Recognition?
Facial Emotion Recognition is a technology used for analysing sentiments from different sources, such as pictures and videos. It belongs to the family of technologies often referred to as “affective computing”, a multidisciplinary field of research on computers’ capabilities to recognise and interpret human emotions and affective states, and it often builds on Artificial Intelligence technologies.
Facial expressions are forms of non-verbal communication, providing hints for human emotions. For decades, decoding such emotion expressions has been a research interest in the field of psychology (Ekman and Friesen 2003; Lang et al. 1993) and also in the field of Human-Computer Interaction (Cowie et al. 2001; Abdat, Maaoui, and Pruski 2011). Recently, the wide diffusion of cameras and the technological advances in biometrics analysis, machine learning and pattern recognition have played a prominent role in the development of FER technology.
Many companies, ranging from tech giants such as NEC or Google to smaller ones such as Affectiva or Eyeris, invest in the technology, which shows its growing importance. Several initiatives under Horizon 2020, the EU research and innovation programme, are also exploring the use of the technology.
FER analysis comprises three steps: a) face detection, b) facial expression detection, c) expression classification to an emotional state (Figure 1). Emotion detection is based on the analysis of facial landmark positions (e.g. end of nose, eyebrows). In videos, changes in those positions are also analysed in order to identify contractions in a group of facial muscles (Ko 2018). Depending on the algorithm, facial expressions can be classified into basic emotions (e.g. anger, disgust, fear, joy, sadness, and surprise) or compound emotions (e.g. happily sad, happily surprised, happily disgusted, sadly fearful, sadly angry, sadly surprised) (Du, Tao, and Martinez 2014). In other cases, facial expressions can be linked to a physiological or mental state (e.g. tiredness or boredom).
Figure 1: Steps of Facial Emotion Recognition
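The three steps can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not a description of any real FER product: the "image" is a dictionary that already carries landmark coordinates, the landmark names, the two geometric features and all thresholds are hypothetical, and a real system would use a trained face detector and classifier instead.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]  # (x, y) in pixels; y grows downwards


@dataclass
class Face:
    landmarks: Dict[str, Point]  # hypothetical landmark names, e.g. "left_eye"


def detect_face(image: dict) -> Optional[Face]:
    """Step a) Stand-in for a face detector: the toy image already holds landmarks."""
    landmarks = image.get("landmarks")
    return Face(landmarks) if landmarks else None


def expression_features(face: Face) -> Dict[str, float]:
    """Step b) Turn landmark positions into scale-independent geometric features."""
    lm = face.landmarks
    eye_distance = abs(lm["right_eye"][0] - lm["left_eye"][0])  # normalisation factor
    corner_y = (lm["mouth_left"][1] + lm["mouth_right"][1]) / 2
    eye_y = (lm["left_eye"][1] + lm["right_eye"][1]) / 2
    brow_y = (lm["left_brow"][1] + lm["right_brow"][1]) / 2
    return {
        # Positive when the mouth corners sit above the mouth centre.
        "mouth_corner_lift": (lm["mouth_centre"][1] - corner_y) / eye_distance,
        # Larger when the eyebrows sit further above the eyes.
        "brow_raise": (eye_y - brow_y) / eye_distance,
    }


def classify(features: Dict[str, float]) -> str:
    """Step c) Map features to an emotion label with illustrative thresholds."""
    if features["mouth_corner_lift"] > 0.05:
        return "joy"
    if features["mouth_corner_lift"] < -0.05:
        return "sadness"
    if features["brow_raise"] > 0.45:
        return "surprise"
    return "neutral"


def recognise(image: dict) -> Optional[str]:
    """Run the three steps in order; return None when no face is found."""
    face = detect_face(image)
    if face is None:
        return None
    return classify(expression_features(face))


# Made-up landmark coordinates for a face with raised mouth corners.
smiling = {"landmarks": {
    "left_eye": (30, 40), "right_eye": (70, 40),
    "left_brow": (30, 28), "right_brow": (70, 28),
    "mouth_left": (35, 78), "mouth_right": (65, 78), "mouth_centre": (50, 82),
}}
print(recognise(smiling))
```

The hard-coded rules in step c) stand in for a learned classifier; they also show how little of the person's context such a pipeline sees.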
The sources of the images or videos serving as input to FER algorithms vary from surveillance cameras to cameras placed close to advertising screens in stores, as well as social media, streaming services and one’s own personal devices.
FER can also be combined with biometric identification. Its accuracy can be improved with technology analysing different types of sources such as voice, text, health data from sensors or blood flow patterns inferred from the image.
Potential uses of FER cover a wide range of applications. Examples are listed below, grouped by application field.
Provision of personalised services
- analyse emotions to display personalised messages in smart environments
- provide personalised recommendations e.g. on music selection or cultural material
- analyse facial expressions to predict individual reaction to movies
Customer behaviour analysis and advertising
- analyse customers’ emotions while shopping focused on either goods or their arrangement within the shop
- advertising signage at a railway station using a system of recognition and facial tracking for marketing purposes
Healthcare
- detect autism or neurodegenerative diseases
- predict psychotic disorders or depression to identify users in need of assistance
- suicide prevention
- detect depression in elderly people
- observe patients’ conditions during treatment
Employment
- help decision-making of recruiters
- identify uninterested candidates in a job interview
- monitor moods and attention of employees
Education
- monitor students’ attention
- detect emotional reaction of users to an educative program and adapt the learning path
- design affective tutoring system
- detect engagement in online learning
Public safety
- lie detectors and smart border control
- predictive screening of public spaces to identify emotions triggering potential terrorism threat
- analysing footage from crime scenes to indicate potential motives in a crime
2. What are the data protection issues?
Because it uses biometric data and Artificial Intelligence technologies, FER shares some of the risks of facial recognition and artificial intelligence. Nevertheless, the technology also carries its own specific risks. As it is a biometric technology in which identification does not appear to be the primary goal, risks related to the accuracy of emotion interpretation and to its application are prominent.
2.1 Necessity and proportionality
Turning human expressions into a data source to infer emotions clearly touches on some of people’s most private data. Being a disruptive technology, FER raises important issues regarding necessity and proportionality.
It has to be carefully assessed whether deploying FER is indeed necessary for achieving the pursued objectives or whether there is a less intrusive alternative. There is a risk of applying FER without performing a necessity and proportionality evaluation for each individual case, misled by an earlier decision to use the technology in a different context. However, proportionality depends on many factors, such as the type of collected data, the type of inferences, the data retention period, or potential further processing.
2.2 Data accuracy
Analysis of emotions based on facial expressions may not be accurate, as facial expressions can vary slightly among individuals, may mix different emotional states experienced at the same time (e.g. fear and anger, happiness and sadness) or may not express an emotion at all. Conversely, some emotions may not be expressed on someone’s face, so inference based solely on facial expression may lead to wrong impressions. Additional factors can add to the ambiguity of facial expressions, such as contextual clues (e.g. sarcasm) and socio-cultural context. In addition, technical aspects (different camera angles, lighting conditions and masking of parts of the face) can affect the quality of a captured facial expression.
Furthermore, even in the case of accurate recognition of emotions, the use of the results may lead to wrong inferences about a person, as FER does not explain the trigger of emotions, which may be a thought of a recent or past event. However, the results of FER, regardless of accuracy limitations, are usually treated as facts and are input to processes affecting a data subject’s life, instead of triggering an evaluation to discover more about their situation in the specific context.
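One way to avoid treating an uncertain result as a fact is to let the system abstain and request a human evaluation. The Python sketch below is illustrative only: it assumes a hypothetical classifier that returns a score between 0 and 1 per emotion, and the function name and both thresholds are assumptions chosen for the example.

```python
from typing import Dict


def interpret(scores: Dict[str, float],
              min_confidence: float = 0.6,
              min_margin: float = 0.2) -> dict:
    """Return a label only when the top score is high and clearly ahead.

    `scores` must contain at least two emotions. Thresholds are illustrative.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (top_label, top_score), (_, runner_up_score) = ranked[0], ranked[1]
    ambiguous = (top_score < min_confidence
                 or top_score - runner_up_score < min_margin)
    if ambiguous:
        # Mixed or weak signal: report the candidates and flag for review.
        return {"label": None, "needs_review": True,
                "candidates": [label for label, _ in ranked[:2]]}
    return {"label": top_label, "needs_review": False,
            "candidates": [top_label]}


# Made-up scores: a mixed expression, then a clear one.
mixed = interpret({"fear": 0.45, "anger": 0.40, "joy": 0.15})
clear = interpret({"joy": 0.90, "neutral": 0.07, "sadness": 0.03})
print(mixed)
print(clear)
```

Even a confident label says nothing about what triggered the emotion, so abstention reduces one source of error without removing the need to look at the person's actual situation.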
2.3 Fairness
Inaccurate results from facial emotion algorithms can lead to discrimination on grounds of skin colour or ethnic origin. Societal norms and cultural differences have been found to influence the level of expression of some emotions, while some algorithms have been found to be biased against several groups based on skin colour. For instance, a study testing facial emotion recognition algorithms revealed that they assigned more negative emotions (anger) to faces of persons of African descent than to other faces. Furthermore, whenever there was ambiguity, the former were scored as angrier (Rhue 2018).
Choosing a representative dataset is crucial for avoiding discrimination. If the training data is not diverse enough, the technology might be biased against underrepresented populations. Discrimination triggered by a faulty database or by errors in detecting the correct emotional state may have serious effects, e.g. the inability to use certain services.
A related problem arises with medical conditions or physical impairments that cause temporary or permanent paralysis of facial muscles: data subjects’ emotions may be misunderstood by algorithms. This may result in a wide range of misclassifications, with impacts ranging from receiving unwanted services to being misdiagnosed with a psychological disorder.
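A simple way to look for the group-level disparities discussed above is to compare accuracy and the rate of a given predicted emotion across groups on a labelled evaluation set. The Python sketch below assumes such a set exists. The group labels "A" and "B" and all records are invented for illustration and are not results from any study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[str, str, str]  # (group, true emotion, predicted emotion)


def per_group_report(records: Iterable[Record],
                     emotion: str = "anger") -> Dict[str, Dict[str, float]]:
    """Compute, per group, overall accuracy and how often `emotion` is predicted."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    flagged = defaultdict(int)
    for group, true_label, predicted in records:
        totals[group] += 1
        correct[group] += predicted == true_label
        flagged[group] += predicted == emotion
    return {
        group: {
            "accuracy": correct[group] / totals[group],
            "predicted_rate": flagged[group] / totals[group],
        }
        for group in totals
    }


# Invented evaluation records: both groups show the same true emotions.
records = [
    ("A", "joy", "joy"), ("A", "neutral", "neutral"),
    ("A", "anger", "anger"), ("A", "sadness", "sadness"),
    ("B", "joy", "joy"), ("B", "neutral", "anger"),
    ("B", "anger", "anger"), ("B", "sadness", "anger"),
]
report = per_group_report(records)
for group, metrics in report.items():
    print(group, metrics)
```

In this toy data the true emotions are identical in both groups, so a higher predicted anger rate for one group can only come from the classifier. Such a check can reveal a disparity but does not by itself explain or correct it.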
2.4 Transparency and control
Facial images and video can be captured anywhere, thanks to the ubiquity and small size of cameras. Surveillance cameras in public spaces or stores are not the only cameras remotely capturing facial images as one’s own mobile devices can capture expressions during their use. In these situations, transparency issues arise concerning both the collection and the further processing of personal data.
Where the data subjects’ facial expressions are captured remotely, it may not be clear to them which system or application will process their data, for which purposes, and who the controllers are. As a result, they would not be in a position to freely give consent or exercise control over the processing of their personal data, including sharing with third parties. Where data subjects are not provided with accurate information, access and control over the use of FER, they are deprived of their freedom to select which aspects of their life can be used to affect other contexts (e.g. emotions in social interactions could be used in the context of recruitment). Moreover, data subjects need to control for which periods of time their captured data will be processed and aggregated into historical records of their emotional state, as emotion inferences may no longer be valid for them after a period of time.
Another consequence of the remote capture of facial expressions and the obscurity of their processing is that data subjects might not be informed about which other sources of data these will be aggregated with. Also, advanced AI algorithms add to the complexity of transparency needs, as they may detect slight movements of facial muscles of which even the individuals themselves are unaware. This would contribute to an unpleasant feeling of vulnerability due to unwanted exposure.
2.5 Processing of special categories of personal data
FER technology can detect the existence, changes or total lack of facial expressions, and link this to an emotional state. As a result, in some contexts, algorithms may infer special categories of personal data, such as political opinions or health data. For instance, when FER technology is applied at political events, political attitudes can be inferred from the facial expressions and reactions of the audience. Also, from the lack of facial expressions, algorithms are able to detect signs of alexithymia, a state in which one cannot understand the feelings one experiences or lacks the words to describe them. This finding can be linked to severe psychiatric and neurological disorders, such as psychosis. Furthermore, analysis of historical data on one’s emotional state may reveal other health conditions such as depression. Such data, if used in the context of healthcare, could assist in the prediction and timely treatment of a patient’s condition. However, where data subjects are not able to control the flow of derived information and its use in other contexts, they may face a situation in which such sensitive personal data is inferred and used by non-authorised entities, such as employers or insurance companies.
2.6 Profiling and automated decision-making
FER technology can further be used to create profiles of people in a number of situations. It could be used to derive one’s acceptance of a product, an advertisement or a proposed idea. It can also be used for classifying productivity and fatigue-resistance in workplaces. The risk lies in the fact that the data subject may not be aware of this type of targeting and might feel uncomfortable on finding out about it. Further implications can arise from erroneous profiling or from inferences based solely on the association with a certain group of people experiencing the same emotions.
In addition, knowledge of individuals’ emotions can make it easier to manipulate them. For instance, knowledge of emotions revealing a vulnerable emotional state can be used to pressure people psychologically into actions they would not otherwise take, e.g. buying goods they do not need.
FER technology could be used for purposes of safeguarding public security, for instance at concerts, sport events or airports, to quickly identify signs of aggression and stress and identify potential terrorists. However, if such an identification were based solely on FER and not combined with other actions or indications that the person is dangerous, this could introduce further risks for the data subjects. For instance, a person could be subject to unjustified delays for further security checks or investigations, causing them to miss an event or a flight, or could even face unjustified arrest.
Last but not least, FER can lead to behavioural changes when a person is aware of being exposed to this technology (known as reactivity in psychology). Individuals may alter their habits or avoid specific areas where the technology is applied in an attempt to self-censor and protect themselves. One can imagine the chilling effect this could have on a society and the feeling of insecurity among citizens if such a technology were to be used by non-democratic governments to infer the political attitudes of citizens.