Text Analytics in Bulgarian: An Overview and Future Directions

Author:

Hristova, Gloria (1)

Affiliation:

1. Department of Statistics and Econometrics, Faculty of Economics and Business Administration, Sofia University “St. Kliment Ohridski”, 1113 Sofia, Bulgaria

Abstract

Text analytics is becoming an integral part of modern business and economic research and analysis. However, the extent to which it can be applied, and how accessible it is, varies across languages. The main goal of this paper is to outline fundamental research on text analytics applied to data in Bulgarian. A review of key research articles is provided in two main directions: the development of language resources for Bulgarian, and experiments with Bulgarian text data in practical applications. By summarizing the results of a large literature review, we draw conclusions about the degree of development of the field, the availability of language resources for the Bulgarian language, and the extent to which text analytics has been applied to practical problems. Future directions for research are outlined. To the best of the author’s knowledge, this is the first study providing a comprehensive overview of progress in the field of text analytics in Bulgarian.

Publisher

Walter de Gruyter GmbH

Subject

General Computer Science

Cited by 6 articles.

1. Geo-spatial crime density attribution using optimized machine learning algorithms;International Journal of Information Technology;2023-02

2. Media Coverage and Public Perception of Distance Learning During the COVID-19 Pandemic: A Topic Modeling Approach Based on BERTopic;2022 IEEE International Conference on Big Data (Big Data);2022-12-17

3. Converting Numeral Text in Bulgarian into Digit Number Using GATE;Mathematics and Informatics;2022-06-30

4. Data mining of public opinion: An overview;“TOPICAL ISSUES OF THERMOPHYSICS, ENERGETICS AND HYDROGASDYNAMICS IN THE ARCTIC CONDITIONS”: Dedicated to the 85th Birthday Anniversary of Professor E. A. Bondarev;2022

5. Design of ML-based AI system for mining public opinion on e-government services in Bulgaria;“TOPICAL ISSUES OF THERMOPHYSICS, ENERGETICS AND HYDROGASDYNAMICS IN THE ARCTIC CONDITIONS”: Dedicated to the 85th Birthday Anniversary of Professor E. A. Bondarev;2022
