Sample Less, Learn More: Efficient Action Recognition via Frame Feature Restoration

Published: 2023-10-26
Container-title: Proceedings of the 31st ACM International Conference on Multimedia

Author:

Cheng Harry¹, Guo Yangyang², Nie Liqiang³, Cheng Zhiyong⁴, Kankanhalli Mohan²

Affiliation:

1. Shandong University, Qingdao, China

2. National University of Singapore, Singapore, Singapore

3. Harbin Institute of Technology, Shenzhen, China

4. Qilu University of Technology (Shandong Academy of Sciences), Jinan, China

Funder

Shandong Project towards the Integration of Education and Industry

National Natural Science Foundation of China

Key R&D Program of Shandong (Major scientific and technological innovation projects)

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3581783.3611696

References: 57 articles.

1. Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lucic, and Cordelia Schmid. 2021. ViViT: A Video Vision Transformer. In International Conference on Computer Vision. IEEE, 6816–6826.

2. Gedas Bertasius, Heng Wang, and Lorenzo Torresani. 2021. Is Space-Time Attention All You Need for Video Understanding? In International Conference on Machine Learning. PMLR, 813–824.

3. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

4. Audio-driven Talking Video Frame Restoration

5. Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey A. Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetic, Dustin Tran, Thomas Kipf, Mario Lucic, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, and Neil Houlsby. 2023. Scaling Vision Transformers to 22 Billion Parameters. CoRR Vol. abs/2302.05442 (2023), 1–21.

Cited by 4 articles.

1. HADE: Exploiting Human Action Recognition Through Fine-Tuned Deep Learning Methods;IEEE Access;2024

2. Intelligent Assistance Method for Preschool Education Dance Curriculum Based on Deep Learning;Applied Mathematics and Nonlinear Sciences;2024-01-01

3. Perceptual Feature Integration for Sports Dancing Action Scenery Detection and Optimization;IEEE Access;2024

4. Voice-Face Homogeneity Tells Deepfake;ACM Transactions on Multimedia Computing, Communications, and Applications;2023-11-10