Research on Chinese Audio and Text Alignment Algorithm Based on AIC-FCM and Doc2Vec

Author:

Chen Keliang¹, Huang Jianming¹, Cui Yansong¹, Ren Weizheng¹

Affiliation:

1. School of Electronic Engineering, Beijing University of Posts and Telecommunications, Haidian, Beijing, China

Abstract

The "audiobook" is a multimedia-based reading technology that has emerged in recent years. Aligning e-book text with the corresponding book audio is the most important step in producing one. This article describes an audio and text alignment algorithm that uses deep learning and neural network technology to improve the efficiency and quality of audiobook production. The algorithm first uses dual-threshold endpoint detection to segment long audio into sentence-level short audio clips, each of which is recognized as a short text. The thresholds are calculated by AIC-FCM optimized with a simulated annealing genetic algorithm. The algorithm then calculates text similarity using Doc2vec, optimized by a threshold prediction method based on the average length of the short texts. Finally, it proofreads and outputs the text sequence and audio segments aligned in the time dimension to meet the needs of audiobook production. Experiments show that, compared with traditional audio and text alignment algorithms, the proposed algorithm comes closer to the ideal segmentation result on long audio; its alignment quality is essentially the same as that of Doc2vec, while its time complexity is reduced by about 35%.
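The segmentation step mentioned in the abstract can be illustrated with a minimal sketch of generic dual-threshold endpoint detection on short-time energy. This is not the authors' implementation: the function names `frame_energies` and `dual_threshold_segments` are hypothetical, the frames are non-overlapping for simplicity, and the `high` and `low` thresholds are passed in by hand, whereas the paper derives them with AIC-FCM.

```python
from typing import List, Tuple


def frame_energies(samples: List[float], frame_len: int) -> List[float]:
    """Short-time energy of consecutive non-overlapping frames."""
    return [
        sum(x * x for x in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def dual_threshold_segments(
    energies: List[float], high: float, low: float
) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame_exclusive) speech segments.

    A segment is seeded by a frame whose energy exceeds `high`, then
    widened in both directions while neighbouring frames stay above `low`.
    Both thresholds are assumed inputs here (hand-picked, not AIC-FCM).
    """
    segments: List[Tuple[int, int]] = []
    i, n = 0, len(energies)
    while i < n:
        if energies[i] > high:
            start = i
            # Widen to the left, but never into the previous segment.
            floor = segments[-1][1] if segments else 0
            while start > floor and energies[start - 1] > low:
                start -= 1
            # Widen to the right until energy drops to `low` or below.
            end = i + 1
            while end < n and energies[end] > low:
                end += 1
            segments.append((start, end))
            i = end
        else:
            i += 1
    return segments
```

Multiplying a frame index by the frame length (and dividing by the sample rate) converts each boundary back to a time offset, which is what a later text-alignment step would need.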

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Cited by 2 articles.

1. CNN-based speech segments endpoints detection framework using short-time signal energy features;International Journal of Information Technology;2023-09-10

2. HKG: A Novel Approach for Low Resource Indic Languages to Automatic Knowledge Graph Construction;ACM Transactions on Asian and Low-Resource Language Information Processing;2023-08-02
