TIG-DETR: Enhancing Texture Preservation and Information Interaction for Target Detection
Published: 2023-07-10
Issue: 14
Volume: 13
Page: 8037
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences
Author:
Liu Zhiyong (1), Wang Kehan (1), Li Changming (2), Wang Yixuan (1), Luo Guoqian (1)
Affiliation:
1. College of Information Science and Technology, Northeast Normal University, Changchun 130024, China
2. Engineering Technology Development Center, Changchun Guanghua University, Changchun 130033, China
Abstract
FPN (Feature Pyramid Network) and transformer-based target detectors are widely used in target detection tasks, but both have design flaws that limit their performance. To overcome these limitations, we propose TIG-DETR (Texturized Instance Guidance DETR), a novel target detection model. TIG-DETR comprises a backbone network, TE-FPN (Texture-Enhanced FPN), and an enhanced DETR detector. TE-FPN addresses the loss of texture information in FPN with three components: a bottom-up architecture, Lightweight Feature-wise Attention, and Feature-wise Attention. Together, these components compensate for texture information loss, mitigate the confounding effect of cross-scale fusion, and enhance the final output features. Additionally, we introduce the Instance-Based Advanced Guidance Module in the DETR-based detector to address the weak detection of larger objects caused by the limited window interactions in Shifted Window-based Self-Attention. Replacing FPN with TE-FPN in Faster R-CNN, with ResNet-50 as the backbone network, improves average precision by 1.9 AP. Introducing the Instance-Based Advanced Guidance Module improves the average precision of the DETR-based target detector by 0.4 AP. TIG-DETR achieves an average precision of 44.1% with ResNet-50 as the backbone network.
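The abstract does not specify how the bottom-up architecture in TE-FPN is built, so the following is only a generic sketch of the idea it names: a standard FPN top-down pass followed by a bottom-up pass that carries fine-level detail back into coarser levels. The max-pooling downsample, the element-wise sum, and every function name here are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative sketch only: a generic FPN top-down pass followed by a
# bottom-up pass. This is NOT the TE-FPN implementation from the paper;
# all function names and the pooling/sum choices are hypothetical.

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D list."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(fmap):
    """2x2 max pooling with stride 2 on a 2D list with even height and width."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]), 2)]
        for i in range(0, len(fmap), 2)
    ]


def add(a, b):
    """Element-wise sum of two equally sized 2D lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down(levels):
    """FPN-style pass. levels are ordered fine -> coarse, each half the previous size."""
    outs = [levels[-1]]
    for fmap in reversed(levels[:-1]):
        outs.insert(0, add(fmap, upsample2x(outs[0])))
    return outs


def bottom_up(levels):
    """Second pass that pushes fine-level detail into the coarser levels."""
    outs = [levels[0]]
    for fmap in levels[1:]:
        outs.append(add(fmap, downsample2x(outs[-1])))
    return outs


# Toy input: the fine map has two isolated activations, the coarse map has none.
c3 = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]]  # 4x4, fine
c4 = [[0, 0], [0, 0]]                                          # 2x2, coarse
p3, p4 = top_down([c3, c4])
n3, n4 = bottom_up([p3, p4])
```

In this toy case the top-down pass leaves the coarse map `p4` empty, while after the bottom-up pass `n4` holds the pooled fine-level activations. That is the kind of detail recovery a bottom-up path is meant to provide, shown here under the stated assumptions.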
Funder
Jilin Provincial Science and Technology Department
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science