Improved PMGAT for Human-Object Interaction Detection through Graph Sampling-based Dynamic Edge Strategy (GraphSADES)

Author:

Zhang Jiali¹, Yunos Zuriahati Mohd¹, Haron Habibollah¹

Affiliation:

1. Universiti Teknologi Malaysia

Abstract

Training graph neural networks (GNNs) for human-object interaction (HOI) detection is computationally expensive, because the information of all connected nodes must be updated and aggregated in dense graph data; the result is long training time and poor convergence efficiency. The parallel multi-head graph attention network (PMGAT), a GNN model, has achieved promising results in HOI detection by capturing the interactive associations between keypoints through local feature modules and multi-head graph attention mechanisms. To address its computational complexity, this study proposes a graph sampling-based dynamic edge strategy, GraphSADES, to improve PMGAT. GraphSADES reduces computational complexity by dynamically sampling a subset of edges during training while maintaining the precision of the original model. First, an object-centered complete graph is constructed, node updates are performed to obtain the initial attention coefficients, and importance coefficients are computed. Then, a dynamic edge sampling strategy randomly selects a subset of edges for updating and aggregating information in each training step. In comparative experiments, the models were trained with ResNet-50 and ViT-B/16 as backbone networks, and GraphSADES-PMGAT maintained the precision of the PMGAT model. On the HICO-DET dataset, floating point operations (FLOPs) decreased by 40.12% and 39.89% and training time decreased by 14.20% and 12.02%, respectively, and the model was the earliest to converge, after 180 epochs. On the V-COCO dataset, with the same backbone networks, FLOPs decreased by 39.81% and 39.56% and training time decreased by 10.26% and 16.91%, respectively, and the model was the earliest to converge, after 165 epochs. Overall, GraphSADES-PMGAT maintains comparable precision while reducing FLOPs, resulting in a shorter training time and improved convergence efficiency compared to the PMGAT model. This work opens up new possibilities for efficient human-object interaction detection.
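
The per-step edge sampling described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 6-node graph (node 0 standing in for the object, nodes 1 to 5 for keypoints), the `keep_ratio` value, and the random placeholder importance scores are all assumptions made for illustration. The abstract only says that edges are selected randomly; the importance-weighted variant shown here (weighted sampling without replacement) is one plausible reading, not a confirmed detail of GraphSADES.

```python
import itertools
import random


def complete_graph_edges(num_nodes):
    """All undirected edges of a complete graph on nodes 0..num_nodes-1."""
    return list(itertools.combinations(range(num_nodes), 2))


def sample_edges(edges, importance, keep_ratio, rng):
    """Keep a random subset of edges, favoring edges with high importance.

    Weighted sampling without replacement: each edge draws a key
    u ** (1 / w) with u uniform in [0, 1) and w its importance (> 0),
    and the k edges with the largest keys are kept.
    """
    k = max(1, round(keep_ratio * len(edges)))
    keyed = [(rng.random() ** (1.0 / importance[e]), e) for e in edges]
    keyed.sort(reverse=True)
    return [e for _, e in keyed[:k]]


rng = random.Random(0)

# Hypothetical graph: node 0 = object, nodes 1..5 = human keypoints.
edges = complete_graph_edges(6)

# Placeholder importance scores. In the paper these are computed from the
# initial attention coefficients; random values are used here only so the
# sketch runs on its own.
importance = {e: rng.uniform(0.1, 1.0) for e in edges}

for step in range(3):
    # A fresh subset is drawn at every training step, so only these edges
    # would take part in message passing for that step.
    active = sample_edges(edges, importance, keep_ratio=0.6, rng=rng)
    print(step, len(active), "of", len(edges), "edges used")
```

Because aggregation cost grows with the number of edges processed, drawing a smaller edge set at each step is what would lower the per-step FLOPs; the actual savings reported in the abstract depend on the authors' sampling ratio and model, which are not given here.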

Publisher

Research Square Platform LLC
