Attention-Guided Huber Loss for Head Pose Estimation Based on Improved Capsule Network

Author:

Zhong Runhao 1, He Li 1, Wang Hongwei 1, Yuan Liang 1,2, Li Kexin 1, Liu Zhening 1

Affiliation:

1. School of Mechanical Engineering, Xinjiang University, Urumqi 830046, China

2. School of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China

Abstract

Head pose estimation is an important technology for analyzing human behavior and has been widely studied and applied in areas such as human–computer interaction and fatigue detection. However, traditional head pose estimation networks tend to lose spatial structure information, particularly in complex scenarios where occlusions and multiple object detections are common, which lowers their accuracy. To address these issues, we propose a head pose estimation model based on a residual network and a capsule network. First, a deep residual network extracts features at three stages, capturing spatial structure information at different levels, and a global attention block enhances the spatial weighting of the extracted features. To avoid losing spatial structure information, the features are then encoded and passed to the output by an improved capsule network, whose generalization ability is strengthened by a self-attention routing mechanism. To improve the robustness of the model, we optimize the Huber loss, which is applied to head pose estimation here for the first time. Finally, experiments are conducted on three popular public datasets: 300W-LP, AFLW2000, and BIWI. The results show that the proposed method achieves state-of-the-art results, particularly in scenarios with occlusions.
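For readers unfamiliar with the Huber loss mentioned above, the following is a minimal sketch of its standard textbook form, applied to a yaw/pitch/roll triple. It is not the attention-guided variant proposed in the paper, whose details the abstract does not give. The function name `huber_loss`, the threshold parameter `delta`, and the sample angles are assumptions chosen for illustration.

```python
def huber_loss(pred, target, delta=1.0):
    """Mean of the standard Huber loss over paired values.

    Illustrative only: this is the textbook definition, not the
    optimized loss described in the paper. `delta` (assumed name)
    is the error size at which the loss switches from quadratic
    to linear.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if not pred:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for p, t in zip(pred, target):
        err = abs(p - t)
        if err <= delta:
            # Small errors: quadratic, like mean squared error.
            total += 0.5 * err * err
        else:
            # Large errors: linear, so outliers have bounded gradient.
            total += delta * (err - 0.5 * delta)
    return total / len(pred)


# Hypothetical Euler angles in degrees: (yaw, pitch, roll).
predicted = [10.0, -5.0, 2.0]
ground_truth = [10.5, -5.0, 5.0]
loss = huber_loss(predicted, ground_truth, delta=1.0)
```

The quadratic region keeps gradients smooth for small angular errors, while the linear region limits the influence of large errors, such as those that heavily occluded faces might produce. This is the usual motivation for choosing a Huber-style loss when robustness matters.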

Funder

National Natural Science Foundation of China

Key R&D Program of Xinjiang Uygur Autonomous Region

Natural Science Foundation of Xinjiang Uygur Autonomous Region

Publisher

MDPI AG

Subject

General Physics and Astronomy

