1. Lefter, I., Burghouts, G., Rothkrantz, L.: Learning the fusion of audio and video aggression assessment by meta-information from human annotations. In: International Conference on Information Fusion, FUSION (in press, 2012)
2. Lefter, I., Burghouts, G., Rothkrantz, L.: Automatic audio-visual fusion for aggression detection using meta-information. In: IEEE Conference on Advanced Video and Signal Based Surveillance, AVSS (in press, 2012)
3. Atrey, P.K., Hossain, M.A., Saddik, A.E., Kankanhalli, M.S.: Multimodal fusion for multimedia analysis: A survey. Springer Multimedia Systems Journal, 345–379 (2010)
4. Schuller, B., Rigoll, G., Lang, M.: Speech emotion recognition combining acoustic features and linguistic information in a hybrid support vector machine-belief network architecture. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2004, vol. 1, pp. I–577–I–580 (2004)
5. Eyben, F., Wöllmer, M., Valstar, M., Gunes, H., Schuller, B., Pantic, M.: String-based audiovisual fusion of behavioural events for the assessment of dimensional affect. In: 2011 IEEE International Conference on Automatic Face & Gesture Recognition and Workshops, FG 2011, pp. 322–329 (2011)