Affiliation:
1. State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing, China
2. Tencent, Shenzhen, China
Abstract
Dance challenges now go viral in video communities such as TikTok. Once a challenge becomes popular, thousands of short-form videos are uploaded within a couple of days. Virality prediction for dance challenges therefore has great commercial value and a wide range of applications, such as smart recommendation and popularity promotion. In this article, we propose a novel multi-modal framework that integrates skeletal, holistic appearance, facial, and scenic cues for comprehensive dance virality prediction. To model body movements, we propose a pyramidal skeleton graph convolutional network (PSGCN) that hierarchically refines spatio-temporal skeleton graphs. In addition, we introduce a relational temporal convolutional network (RTCN) to exploit appearance dynamics with non-local temporal relations. Finally, an attentive fusion approach adaptively aggregates the predictions from the different modalities. To validate our method, we introduce a large-scale viral dance video (VDV) dataset, which contains over 4,000 dance clips from eight viral dance challenges. Extensive experiments on the VDV dataset demonstrate the effectiveness of our approach. Furthermore, we show that short-video applications such as multi-dimensional recommendation and action feedback can be derived from our model.
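The abstract does not specify how the attentive fusion is computed. As a rough illustration only, adaptive aggregation of per-modality predictions can be sketched as a softmax-weighted sum. The function names, the example scores, and the use of plain softmax attention are all assumptions made for this sketch, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_fusion(predictions, attention_logits):
    """Fuse per-modality scores with softmax attention weights.

    Hypothetical helper: `predictions` holds one virality score per
    modality, `attention_logits` holds one unnormalized relevance
    score per modality (in a real model these would be learned).
    """
    if len(predictions) != len(attention_logits):
        raise ValueError("one attention logit is required per modality")
    weights = softmax(attention_logits)
    fused = sum(w * p for w, p in zip(weights, predictions))
    return fused, weights


# The four cues named in the abstract; the scores are made-up values.
modalities = ["skeletal", "appearance", "facial", "scenic"]
predictions = [0.8, 0.6, 0.4, 0.2]
attention_logits = [0.0, 0.0, 0.0, 0.0]  # equal logits give uniform weights

fused, weights = attentive_fusion(predictions, attention_logits)
for name, weight in zip(modalities, weights):
    print(f"{name}: weight={weight:.2f}")
print(f"fused prediction: {fused:.2f}")
```

With equal logits the fusion reduces to a plain average of the modality scores. Raising one modality's logit shifts the fused prediction toward that modality's score, which is the adaptive behaviour the abstract alludes to.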
Funder
National Natural Science Foundation of China
Foundation for Innovative Research Groups through the National Natural Science Foundation of China
CCF-Tencent Rhino-Bird Research Fund
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications, Hardware and Architecture
Cited by
10 articles.