Affiliation:
1. International College for Chinese Studies, Nanjing Normal University, Nanjing, Jiangsu, China.
2. Jiangsu College of Finance & Accounting, Lianyungang, Jiangsu, China.
Abstract
This paper proposes an improved semantic learning model based on multimodal data fusion. A collaborative attention learning architecture models both image attention features and question attention features, which reduces interference from irrelevant features and extracts more distinguishable representations of the image and the question. To address the high dimensionality and computational complexity of multimodal data fusion, multimodal bilinear decomposition methods are used to fuse the visual features of images with the text features of questions more effectively and to capture more complex interactions between the modalities. The model's accuracy is 9.7% and 12.1% higher than that of TF-IDF and TextRank, respectively. For international students in Chinese language classes in which listening materials were played as a combination of audio, video, and Chinese characters, the comparison between the first and second groups of scores gave F = 16.15, p < .05, and the comparison between the second and third groups of scores gave F = 8.527, p < .05.
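The idea behind bilinear decomposition can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dimensions, and the random weights are all assumptions chosen for illustration. A full bilinear fusion of an image feature x (length m) and a question feature y (length n) needs an m × n weight matrix W per output; factorizing W as U Vᵀ with a small rank k gives the same value from two cheap projections.

```python
import random

def project(M, v):
    """Return M^T v, where M is a list of len(v) rows with k columns each."""
    k = len(M[0])
    return [sum(M[r][c] * v[r] for r in range(len(v))) for c in range(k)]

def factorized_bilinear(x, y, U, V):
    """Fused scalar: sum_k (U^T x)_k * (V^T y)_k, which equals x^T (U V^T) y."""
    px = project(U, x)  # image feature projected to rank-k space
    py = project(V, y)  # question feature projected to rank-k space
    return sum(a * b for a, b in zip(px, py))

def full_bilinear(x, y, U, V):
    """Reference computation: build W = U V^T explicitly, then x^T W y."""
    k = len(U[0])
    W = [[sum(U[i][t] * V[j][t] for t in range(k)) for j in range(len(V))]
         for i in range(len(U))]
    return sum(x[i] * W[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

# Hypothetical sizes: 6-dim image feature, 5-dim question feature, rank 2.
random.seed(0)
m, n, k = 6, 5, 2
x = [random.uniform(-1, 1) for _ in range(m)]
y = [random.uniform(-1, 1) for _ in range(n)]
U = [[random.uniform(-1, 1) for _ in range(k)] for _ in range(m)]
V = [[random.uniform(-1, 1) for _ in range(k)] for _ in range(n)]

z_fact = factorized_bilinear(x, y, U, V)
z_full = full_bilinear(x, y, U, V)

# Parameters per fused output: m*n for full W, k*(m+n) for the factorization.
params_full = m * n
params_fact = k * (m + n)
```

With these toy sizes the factorized form stores 22 weights per output instead of 30; the saving grows quickly when m and n are in the thousands and k stays small, which is the motivation for decomposition in high-dimensional fusion.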
Subject
Applied Mathematics, Engineering (miscellaneous), Modeling and Simulation, General Computer Science