Abstract
This study proposes a method for predicting important scenes in baseball videos using a time-lag aware latent variable model (Tl-LVM). Tl-LVM adopts a multimodal variational autoencoder over tweets and videos as its latent variable model: it computes latent features from both modalities and predicts important scenes from those features. Because time lags exist between events and the tweets posted about them, Tl-LVM adds a time-lag-aware loss, based on feature correlation, to the loss function of the multimodal variational autoencoder. With this loss function, Tl-LVM trains the encoder, decoder, and important scene predictor simultaneously. This is the novelty of Tl-LVM; to the best of our knowledge, it is the first end-to-end important scene prediction model that considers time lags. Its contribution is high-quality prediction using latent features that account for the time lags between tweets and the multiple previous events they correspond to. Experimental results on actual tweets and baseball videos show the effectiveness of Tl-LVM.
Subject
Electrical and Electronic Engineering, Biochemistry, Instrumentation, Atomic and Molecular Physics, and Optics, Analytical Chemistry