1. Efficient low-rank multimodal fusion with modality-specific factors;Liu,2018
2. Learning relationships between text, audio, and video via deep canonical correlation for multimodal language analysis;Sun,2020
3. MISA: Modality-invariant and -specific representations for multimodal sentiment analysis;Hazarika,2020
4. Gradient normalization for adaptive loss balancing in deep multitask networks;Chen;Int. Conf. Machine Learn.,2018
5. What makes training multi-modal classification networks hard?;Wang,2020