1. UCF101: A dataset of 101 human actions classes from videos in the wild; Soomro, 2012
2. Two-stream convolutional networks for action recognition in videos; Simonyan; Advances in Neural Information Processing Systems, 2014
3. ZEST: Zero-shot Learning from Text Descriptions using Textual Similarity and Visual Summarization
4. Modeling Temporal Structure of Decomposable Motion Segments for Activity Classification
5. WordNet