VaBTFER: An Effective Variant Binary Transformer for Facial Expression Recognition

Author:

Shen Lei 1, Jin Xing 1

Affiliation:

1. College of Information Science and Technology, Nanjing Forestry University, Nanjing 100190, China

Abstract

Existing Transformer-based models have achieved impressive success in facial expression recognition (FER) by modeling the long-range relationships among facial muscle movements. However, pure Transformer-based models typically contain millions of parameters, which makes them difficult to deploy. Moreover, the lack of inductive bias in Transformers usually makes it hard to train them from scratch on limited FER datasets. To address these problems, we propose an effective and lightweight variant Transformer for FER called VaTFER. In VaTFER, we first construct action unit (AU) tokens from action unit-based regions and their histogram of oriented gradients (HOG) features. Then, we present a novel spatial-channel feature relevance Transformer (SCFRT) module, which incorporates multilayer channel reduction self-attention (MLCRSA) and a dynamic learnable information extraction (DLIE) mechanism. MLCRSA models long-range dependencies among all tokens while reducing the number of parameters. DLIE aims to alleviate the lack of inductive bias and improve the learning ability of the model. Furthermore, we replace the vanilla multilayer perceptron (MLP) with an excitation module for accurate prediction. To further reduce computing and memory requirements, we introduce a binary quantization mechanism, yielding a novel lightweight Transformer model called variant binary Transformer for FER (VaBTFER). We conduct extensive experiments on several commonly used facial expression datasets, and the results attest to the effectiveness of our methods.
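
The abstract does not say which binary quantization scheme VaBTFER uses. As a rough illustration of the general idea only, the sketch below uses a common approach that is assumed here and is not taken from the paper: each weight is replaced by its sign, and one scaling factor per output row (the mean absolute weight) preserves the overall magnitude. The function names `binarize_rows` and `binary_linear` are hypothetical.

```python
from typing import List, Tuple


def binarize_rows(weights: List[List[float]]) -> Tuple[List[List[int]], List[float]]:
    """Sign-binarize each row of a weight matrix.

    Returns the +1/-1 codes and one scale per row (mean absolute weight).
    This is an assumed, generic scheme, not the paper's exact method.
    """
    codes: List[List[int]] = []
    scales: List[float] = []
    for row in weights:
        scales.append(sum(abs(w) for w in row) / len(row))
        codes.append([1 if w >= 0 else -1 for w in row])
    return codes, scales


def binary_linear(x: List[float], codes: List[List[int]], scales: List[float]) -> List[float]:
    """Approximate W @ x using only the sign codes and the per-row scales."""
    return [s * sum(c * xi for c, xi in zip(row, x)) for row, s in zip(codes, scales)]


# Toy 2x3 weight matrix; each weight is stored as a single sign
# plus one shared float per row.
W = [[0.5, -0.25, 0.75], [-1.0, 1.0, -1.0]]
codes, scales = binarize_rows(W)
y = binary_linear([1.0, 2.0, 3.0], codes, scales)
print(codes, scales, y)
```

Storing one bit per weight plus one float per row is where the memory saving comes from. The result only approximates the full-precision product, so the accuracy cost has to be measured empirically.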

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry


Cited by 1 article.
