1. Unsupervised Procedure Learning via Joint Dynamic Summarization
2. Human motion diffusion model;tevet;International Conference on Learning Representations,2023
3. Self-supervised Multi-task Procedure Learning from Instructional Videos
4. Comprehensive instructional video analysis: The COIN dataset and performance evaluation;tang;TPAMI,2020
5. VIOLET: End-to-end video-language transformers with masked visual-token modeling;fu;ArXiv Preprint,2021