A method for automatic analysis Table of Contents in Chinese books

Author:

Chen Jing, Lu Quan

Abstract

Purpose – This paper proposes a novel method for automatically analyzing the Table of Contents (TOC) of Chinese books, based on hierarchy organization rules derived from an investigation.

Design/methodology/approach – The paper first reviews the main literature in the field, then derives hierarchy organization rules for Chinese book TOCs and proposes a method that parses TOCs automatically based on these rules. A prototype system implementing the method was also developed. The method was evaluated by processing a corpus with the prototype system, and the results were checked by calculating precision and recall.

Findings – The experimental results illustrate the advantages of the method: it is broadly applicable, with a recall of 95.34 percent and a precision of 94.44 percent.

Practical implications – The results can help Chinese libraries deal with electronic texts in four respects. First, the method can complement or enhance current digitization and optical character recognition methods and cut financial and labor costs for Chinese libraries. Second, it can help libraries keep information on indexing words as well as chapters, sections and subsections in Chinese book databases, so that users can easily retrieve and extract any intended portion on demand. Third, it helps enrich services and thereby enhance the user experience in Chinese libraries. Fourth, it improves the specifications and policies for digitizing Chinese books.

Originality/value – The paper provides insight into the hierarchy organization of TOCs in Chinese books; the rule-based method is more broadly applicable than other methods. The method can also serve as a reference for automatic TOC analysis of English books.
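As a rough illustration of the rule-based parsing and the precision/recall check mentioned in the abstract, the following minimal Python sketch may help. The abstract does not give the paper's actual hierarchy organization rules, so the three heading patterns (`第…章`, `第…节`, `…、`) and the names `LEVEL_PATTERNS`, `heading_level`, `parse_toc` and `precision_recall` are assumptions made only for this example, not the authors' method.

```python
import re

# Hypothetical heading rules, ordered from the top level down. The paper's
# real hierarchy organization rules are not stated in the abstract.
NUM = "一二三四五六七八九十百零0-9"
LEVEL_PATTERNS = [
    (1, re.compile(rf"^第[{NUM}]+章")),  # e.g. 第一章 (chapter)
    (2, re.compile(rf"^第[{NUM}]+节")),  # e.g. 第二节 (section)
    (3, re.compile(rf"^[{NUM}]+、")),    # e.g. 一、 (subsection)
]


def heading_level(line):
    """Return the assumed hierarchy level of a TOC line, or None if no rule matches."""
    text = line.strip()
    for level, pattern in LEVEL_PATTERNS:
        if pattern.match(text):
            return level
    return None


def parse_toc(lines):
    """Build a nested tree of {'title', 'level', 'children'} nodes from TOC lines."""
    root = {"title": "ROOT", "level": 0, "children": []}
    stack = [root]  # path from the root to the most recently added node
    for line in lines:
        level = heading_level(line)
        if level is None:
            continue  # lines that match no rule are skipped in this sketch
        node = {"title": line.strip(), "level": level, "children": []}
        # Climb back up until the top of the stack is a shallower heading.
        while stack[-1]["level"] >= level:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root


def precision_recall(predicted, gold):
    """Precision and recall over sets of (title, level) pairs."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    toc = ["第一章 绪论", "第一节 研究背景", "一、问题提出", "第二节 研究方法", "第二章 方法"]
    tree = parse_toc(toc)
    for chapter in tree["children"]:
        print(chapter["title"], [s["title"] for s in chapter["children"]])
```

In this sketch, precision is the share of predicted (title, level) pairs that also appear in a hand-labelled gold set, and recall is the share of gold pairs that the parser recovered. The corpus and counts behind the reported 95.34 and 94.44 percent figures are not available in the abstract, so they are not reproduced here.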

Publisher

Emerald

Subject

Library and Information Sciences, Information Systems

References: 23 articles.

Cited by 3 articles.

1. The Digital Content Formation Labor Costs for Electronic Libraries and Examples of the Formation of Virtual Exhibitions;E-Business and Telecommunications;2023

2. Some Estimates of Labor Contribution for Creating Digital Libraries;Proceedings of the 18th International Conference on e-Business;2021

3. Assessment of Efforts for Content Creation for the Common Digital Space of Scientific Knowledge;Proceedings of the 5th International Conference on Computer-Human Interaction Research and Applications;2021
