A Survey Study on Automatic Subtitle Synchronization and Positioning System for Deaf and Hearing Impaired People
Published: 2022-11-17
Pages: 423-428
ISSN: 2581-9429
Journal: International Journal of Advanced Research in Science, Communication and Technology (IJARSCT)
Language: en
Authors: Santosh S Kale, Shruti Dhanak, Paras Chavan, Jay Kakade, Prasad Humbe
Affiliation: NBN Sinhgad School of Engineering, Pune, Maharashtra, India
Abstract
In this study, we present a subtitle synchronization and positioning system intended to improve access to multimedia content for deaf and hearing-impaired individuals. The paper's main contributions are a novel synchronization algorithm that reliably aligns closed captions with the audio transcript without any human involvement, and a timestamp refinement technique that adjusts the duration of subtitle segments in accordance with audiovisual guidelines. Experimental evaluation on a sizable dataset of 30 videos drawn from French national television yields average accuracy scores above 90% regardless of the type of video. A subjective assessment of the proposed subtitle synchronization and positioning system, carried out with hearing-impaired participants, further demonstrates its effectiveness.
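The abstract describes two steps: aligning caption text against a timestamped audio transcript, and refining the resulting timestamps to meet display guidelines. The following is a minimal sketch of that idea, not the paper's actual algorithm: it assumes a hypothetical data model in which the ASR transcript is a list of `(word, start, end)` tuples, matches each caption's words against the transcript with `difflib.SequenceMatcher`, and then enforces an illustrative minimum display duration and inter-caption gap (the thresholds are assumptions, not values from the paper).

```python
import difflib

def align_captions(asr_words, captions):
    """Assign (start, end) times to each caption segment by matching its
    words against a timestamped ASR transcript.

    asr_words: list of (word, start_sec, end_sec) tuples, in order.
    captions:  list of captions, each a list of words.
    Returns one (start, end) pair per caption, or None if no match.
    """
    hyp = [w.lower() for w, _, _ in asr_words]
    times = []
    cursor = 0  # search forward only, so captions stay in order
    for caption in captions:
        ref = [w.lower() for w in caption]
        sm = difflib.SequenceMatcher(a=hyp[cursor:], b=ref, autojunk=False)
        blocks = [b for b in sm.get_matching_blocks() if b.size > 0]
        if not blocks:
            times.append(None)
            continue
        first = cursor + blocks[0].a
        last = cursor + blocks[-1].a + blocks[-1].size - 1
        times.append((asr_words[first][1], asr_words[last][2]))
        cursor = last + 1
    return times

def refine(times, min_dur=1.0, gap=0.08):
    """Timestamp refinement: stretch captions shorter than min_dur seconds,
    then trim each end so a small gap remains before the next caption."""
    out = []
    for i, t in enumerate(times):
        if t is None:
            out.append(None)
            continue
        start, end = t
        if end - start < min_dur:
            end = start + min_dur
        nxt = times[i + 1] if i + 1 < len(times) else None
        if nxt is not None:
            end = min(end, nxt[0] - gap)
        out.append((start, max(end, start)))
    return out
```

A word-level matcher like this is only a toy stand-in for the forced-alignment methods surveyed in the references (e.g. SailAlign), but it illustrates the two-stage structure the abstract describes: coarse alignment first, then guideline-driven timing adjustment.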
Publisher
Naksh Solutions
References (20 articles)
1. A. Katsamanis, M. P. Black, P. G. Georgiou, L. Goldstein, and S. Narayanan, "SailAlign: Robust long speech-text alignment," in Proc. Workshop New Tools Methods Very-Large Scale Phonetics Res., Philadelphia, PA, USA, Jan. 2011, pp. 1-4.
2. X. Zhou, C. Yao, H. Wen, Y. Wang, S. Zhou, W. He, and J. Liang, "EAST: An efficient and accurate scene text detector," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Honolulu, HI, USA, Jul. 2017, pp. 2642-2651.
3. P. J. Moreno, C. Joerg, J.-M. Van Thong, and O. Glickman, "A recursive algorithm for the forced alignment of very long audio segments," in Proc. Int. Conf. Spoken Lang. Process., Dec. 1998, pp. 2711-2714.
4. M. H. Davel, C. V. Heerden, N. Kleynhans, and E. Barnard, "Efficient harvesting of Internet audio for resource-scarce ASR," in Proc. Interspeech, Aug. 2011, pp. 3154-3157.
5. N. Braunschweiler, M. J. F. Gales, and S. Buchholz, "Lightly supervised recognition for automatic alignment of large coherent speech recordings," in Proc. Interspeech, Sep. 2010, pp. 2222-2225.