Affiliation:
1. National Chi Nan University, Taiwan, R.O.C
Abstract
This article investigates the correlations between multimedia objects (particularly speech and text) involved in language lectures in order to design an effective presentation mechanism for web-based learning. The cross-media correlations are classified into implicit relations (retrieved by computation) and explicit relations (recorded during the preprocessing stage). The implicit temporal correlation between speech and text is used primarily to drive supplementary lecture navigation aids such as tele-pointer movement, lip-sync movement, and content scrolling. We propose a speech-text alignment framework, using an iterative algorithm based on local alignment, to identify many-to-one temporal correlations rather than one-to-one correlations only. The proposed framework is a more practical method for analyzing general language lectures, and the algorithm's time complexity matches the best-possible computation cost, O(nm), without introducing additional computation. In addition, we show the feasibility of creating vivid presentations by exploiting implicit relations and artificially simulating some explicit media. To facilitate the navigation of integrated multimedia documents, we develop several visualization techniques for describing media correlations, including guidelines for speech-text correlations, visible-automatic scrolling, and levels of detail of the timeline, to provide intuitive and easy-to-use random access mechanisms. We evaluated the performance of the analysis method and human perceptions of the synchronized presentation. Overall, about 99.5% of the analyzed words have a temporal error within 0.5 sec, and the subjective evaluation shows that the synchronized presentation is highly acceptable to human viewers.
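The abstract names local alignment and an O(nm) cost but gives no details of the iterative algorithm itself. As a rough illustration only, the following is a minimal sketch of a single pass of textbook Smith-Waterman local alignment between a sequence of recognized words and a transcript; it is not the authors' method. The function name `local_align`, the scoring values, and the sample word lists are all assumptions made for this example.

```python
def local_align(recognized, transcript, match=2, mismatch=-1, gap=-1):
    """Textbook Smith-Waterman local alignment over two word sequences.

    Illustrative sketch only; scoring parameters are assumed, not from the paper.
    Returns (best_score, pairs), where pairs lists (i, j) index pairs such that
    recognized[i] == transcript[j] along the best local alignment.
    """
    n, m = len(recognized), len(transcript)
    # score[i][j]: best score of a local alignment ending at
    # recognized[i-1] and transcript[j-1]; filling it costs O(nm).
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if recognized[i - 1] == transcript[j - 1] else mismatch
            score[i][j] = max(
                0,                          # start a new local alignment
                score[i - 1][j - 1] + sub,  # align the two words
                score[i - 1][j] + gap,      # skip a recognized word
                score[i][j - 1] + gap,      # skip a transcript word
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the local alignment ends (score 0).
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        same = recognized[i - 1] == transcript[j - 1]
        sub = match if same else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            if same:
                pairs.append((i - 1, j - 1))  # keep only exact word matches
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs


# Hypothetical data: recognizer output with start times, and the lecture text.
recognized = ["um", "the", "quick", "brown", "fox"]
start_times = [0.0, 0.4, 0.7, 1.1, 1.5]
transcript = ["the", "quick", "brown", "fox", "jumps"]

best_score, pairs = local_align(recognized, transcript)
# Transfer each matched recognized word's start time to its transcript word.
word_times = {transcript[j]: start_times[i] for i, j in pairs}
```

In this sketch, the matched pairs act as temporal anchors: each transcript word inherits the time of the recognized word it aligns with, which is the kind of speech-text correlation that could drive scrolling or pointer movement. Handling many-to-one correlations, as the abstract describes, would require repeating such passes over the remaining speech, which is not shown here.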
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications, Hardware and Architecture
Cited by
2 articles.
1. RCE-HIL;ACM Transactions on Multimedia Computing, Communications, and Applications;2020-04-02
2. CM-GANs;ACM Transactions on Multimedia Computing, Communications, and Applications;2019-02-25