Video Face Tracking for IoT Big Data using Improved Swin Transformer based CSA Model

Published:2024-04-05 Issue:2 Volume:4 Page:308-316
ISSN:2788-7669
Container-title:Journal of Machine and Computing
language:en
Short-container-title:JMC

Author:

K Anbumani¹, Anitha Cuddapah², S V Achuta Rao³, K Praveen Kumar⁴, Ramasamy Meganathan⁵, R Mahaveerakannan⁶

Affiliation:

1. Department of Electronics and Instrumentation Engineering, Sri Sairam Engineering College, Chennai, India.

2. Department of Computer Science and Engineering, School of Computing, Mohan Babu University (erstwhile Sree Vidyanikethan Engineering College), Andhra Pradesh, India.

3. Data Science Research Laboratories, Sree Dattha Institute of Engineering & Science, Sheriguda, Hyderabad, Telangana, India.

4. Department of Information Technology, Kakatiya Institute of Technology and Science, Warangal, India.

5. Department of Computing, De Montfort University Kazakhstan, Al-Farabi Ave, Republic of Kazakhstan.

6. Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India.

Abstract

Although Convolutional Neural Networks (CNNs) have greatly improved face-related algorithms, maintaining both accuracy and efficiency in real-world applications remains difficult. State-of-the-art approaches use deeper networks to improve performance, but the added computational complexity and parameter count make them impractical for mobile applications. To address these issues, this article presents an object detection model that combines DeepLabv3+ with a Swin Transformer and incorporates GLTB and Swin-Conv-Dspp (SCD) components. First, to lessen the impact of the hole phenomenon and the loss of fine-grained information, the model employs the SCD component, which efficiently extracts feature information from objects of various sizes. Second, to address the difficulty of recognizing occluded objects, the study builds a GLTB with a spatial pyramid pooling shuffle module, which extracts important detail from the few visible pixels of the occluded objects. The Crocodile Search Algorithm (CSA) improves classification accuracy by selecting the settings used to fine-tune the model. The proposed model is validated experimentally on the WFLW benchmark dataset. Compared with other lightweight models, it delivers higher performance with significantly fewer parameters and lower computational complexity.
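
The abstract names a spatial pyramid pooling module without describing it. As a rough illustration of plain spatial pyramid pooling only (not the paper's shuffle variant, its GLTB, or its implementation), the following pure-Python sketch pools a single-channel feature map into grids of several sizes and concatenates the results. The function names and the pyramid levels (1, 2, 4) are assumptions chosen for illustration.

```python
from typing import List, Sequence


def adaptive_max_pool(fmap: List[List[float]], bins: int) -> List[float]:
    """Max-pool a 2D map into a bins x bins grid, returned in row-major order."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(bins):
        # Floor for the start and ceil for the end, so every row is covered
        # and no bin is empty, even when h is not divisible by bins.
        r0, r1 = (i * h) // bins, -(-(i + 1) * h // bins)
        for j in range(bins):
            c0, c1 = (j * w) // bins, -(-(j + 1) * w // bins)
            out.append(max(fmap[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
    return out


def spatial_pyramid_pool(fmap: List[List[float]],
                         levels: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Concatenate pooled grids from each pyramid level (illustrative levels)."""
    feats: List[float] = []
    for bins in levels:
        feats.extend(adaptive_max_pool(fmap, bins))
    return feats


# A 4x4 map holding the values 0..15 in row-major order.
small = [[float(4 * r + c) for c in range(4)] for r in range(4)]
# A 5x7 map of a different size, to show the output length does not change.
large = [[float(r * c) for c in range(7)] for r in range(5)]

print(len(spatial_pyramid_pool(small)), len(spatial_pyramid_pool(large)))
```

The output length depends only on the chosen levels (1 + 4 + 16 values here), not on the input size, which is why this kind of pooling is commonly used to summarize regions at several scales.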

Publisher

Anapub Publications

Link

https://anapub.co.ke/journals/jmc/jmc_pdf/2024/jmc_volume_4-issue_2/JMC202404029.pdf
