1. Sanjeev Arora, Yingyu Liang, and Tengyu Ma. 2017. A simple but tough-to-beat baseline for sentence embeddings. In ICLR. 1–16.
2. Multimodal Machine Learning: A Survey and Taxonomy
3. Desheng Cai et al. 2021. Heterogeneous hierarchical feature aggregation network for personalized micro-video recommendation. TMM.
4. Jingyuan Chen, Hanwang Zhang, Xiangnan He, Liqiang Nie, Wei Liu, and Tat-Seng Chua. 2017. Attentive collaborative filtering: Multimedia recommendation with item-and component-level attention. In SIGIR. 335–344.
5. Roddy Cowie et al. 2001. Emotion recognition in human-computer interaction. SPM.