Text Summarization in the Shona Language using Natural Language Processing

Published: 2024-08-14 Page: 2870-2873
ISSN:2456-2165
Container-title:International Journal of Innovative Science and Research Technology (IJISRT)
language:en

Author:

Sithabisiwe Manokore Anita, Gondo Monica

Abstract

The rise of digital information in many languages, including Shona, highlights the importance of developing effective text summarization techniques to promote information accessibility and usability. This work addresses a major gap in natural language processing (NLP) for Shona, a language that is widely spoken in Zimbabwe and its surrounding areas but has received little attention. The main obstacles to Shona text summarization are the lack of pre-trained language models designed specifically for Shona, the complexity of Shona's morphology, and the scarcity of annotated datasets.[1] To address these issues, this research develops and adapts contemporary machine learning methods for efficient Shona text summarization. By gathering and analyzing texts from a variety of sources, such as news stories, scholarly papers, and social media, we produced large annotated corpora. These datasets were used to fine-tune existing NLP models, such as Transformer-based architectures, so that they account for Shona's specific linguistic traits. To handle the morphological and syntactic complexities of Shona, our solution combines statistical and rule-based techniques. Compared to baseline methods, the results show a significant improvement in the relevance and accuracy of Shona text summaries. In addition to serving as a starting point for further NLP research in underrepresented languages, the resulting models help Shona-speaking people in the areas of business, education, and media. This research contributes to the broader field of NLP by encouraging inclusivity and linguistic diversity, showcasing the potential for cross-lingual breakthroughs, and emphasizing the ethical development of technology.

Publisher

International Journal of Innovative Science and Research Technology

References (33 articles)

1. Vienna Li, Srinita Sridharan, Sandeep Sethuraman, Georgios Avdis, "Predicting Recidivism With Machine Learning: An Analysis of Risk Factors and Proposal of Preventions", Journal of Student Research, 2023.

2. Amy J. C. Trappey, Charles V. Trappey, Jheng-Long Wu, W. C. Wang, "Intelligent Compilation of Patent Summaries Using Machine Learning and Natural Language Processing Techniques", Advanced Engineering Informatics, 2020.

3. Liuqing Li, Jack Geissinger, William A. Ingram, Edward A. Fox, "Teaching Natural Language Processing Through Big Data Text Summarization with Problem-Based Learning", Data and Information Management, 2020.

4. Ovishake Sen, Mohtasim Fuad, Md. Nazrul Islam, Jakaria Rabbi, Mehedi Masud, Md. Kamrul Hasan, Md. Abdul Awal, Awal Ahmed Fime, Md. Tahmid Hasan Fuad, Delowar Sikder, Md. Akil Raihan Iftee, "Bangla Natural Language Processing: A Comprehensive Analysis of Classical, Machine Learning, and Deep Learning-Based Methods", IEEE Access, 2021.

5. Robert E. Wray III, James R. Kirk, John E. Laird, "Language Models As A Knowledge Source for Cognitive Agents", arXiv (cs.AI), 2021.

Cited by 2 articles.

1. Comparative Evaluation of Endo Ice® Refrigerant Spray, Endo Frost® Refrigerant Spray and Topical Anesthetic Agent Precaine B® on the Pain Perception in Children Prior to Administration of Local Anaesthesia – An Invivo Study; International Journal of Innovative Science and Research Technology (IJISRT); 2024-08-27

2. Resource Dependence: Evidence from an FMCG Industry; International Journal of Innovative Science and Research Technology (IJISRT); 2024-08-22