Marathi Text Analysis using Unsupervised Learning and Word Cloud

Published:2020-02-28 Issue:3 Volume:9 Page:338-343
ISSN:2249-8958
Container-title:International Journal of Engineering and Advanced Technology
Short-container-title:IJEAT

Author:

Bafna Prafulla B., Saini Jatinderkumar R.

Abstract

Managing large collections of text documents is a critical task that supports applications ranging from information retrieval to clustering search engine results. Marathi is one of the oldest regional languages in the Indo-Aryan language family, dating from about AD 1000. The abundance of Marathi literature has produced a large corpus and, with it, a need to summarize information. The objective of this study is to overcome the scalability problem in document management and to summarize the Marathi corpus by extracting tokens. The work improves scalability and maintains consistent cluster quality on incremental datasets. Most past and contemporary research has targeted document management for English corpora. Research on Marathi corpora has mostly explored stemming, single-document summarization, and classifier design. Applying unsupervised learning to a Marathi corpus to summarize multiple documents through a Word Cloud remains unexplored. Technically, the current work applies TF-IDF, cosine-based document similarity measures, and cluster dendrograms, along with various other Natural Language Processing (NLP) activities. Entropy and precision are used to evaluate experiments carried out on different datasets, and the results demonstrate the robustness of the proposed approach for Marathi corpora.
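The abstract names TF-IDF weighting, cosine-based document similarity, and entropy-based evaluation without giving formulas. The following is a minimal sketch of those three pieces, not the authors' implementation. The whitespace tokenizer, the smoothed IDF formula, the per-cluster Shannon entropy definition, the function names, and the three toy documents are all assumptions made for illustration. Hierarchical clustering and the dendrogram are omitted.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain whitespace tokenization. A real Marathi pipeline
    # would also remove punctuation and stop words and apply stemming.
    return text.split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (token -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0
           for t, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t]
                        for t, c in counts.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_entropy(true_labels):
    """Shannon entropy (bits) of the true labels inside one cluster.

    Assumption: 0 means the cluster is pure; higher means more mixed.
    """
    total = len(true_labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(true_labels).values())


# Hypothetical toy corpus: two sports documents and one about farming.
docs = [
    "क्रिकेट सामना भारत",
    "भारत क्रिकेट संघ",
    "पाऊस शेती पाणी",
]
vectors = tfidf_vectors(docs)
sim_related = cosine(vectors[0], vectors[1])    # documents share two tokens
sim_unrelated = cosine(vectors[0], vectors[2])  # documents share no tokens
```

A pairwise matrix of `1 - cosine(...)` values could serve as the distance input to an agglomerative clustering routine to draw a dendrogram, and the highest-weighted tokens in each cluster could feed a word cloud. Both steps are left out here because the abstract does not specify how they were configured.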

Publisher

Blue Eyes Intelligence Engineering and Sciences Publication - BEIESP

Subject

Computer Science Applications, General Engineering, Environmental Engineering

Link

https://www.ijeat.org/wp-content/uploads/papers/v9i3/C4727029320.pdf

Cited by 7 articles.

1. Unveiling the Familiar: Exploring Makoto Shinkai’s Anime Art;Indian Journal of Social Science and Literature;2023-12-30

2. Topic Identification and Prediction Using Sanskrit Hysynset;Pervasive Computing and Social Networking;2022-09-02

3. Experimental Evaluation and Approach of Enhancement in Generation of Automatic Unsupervised Extractive Text Summarization of Marathi Text By Using Machine Learning Algorithm;Journal of Machine and Computing;2022-01-05

4. RoMaPla: Using t-Test for Evaluating Robustness of Marathi Plagiarism;Evolution in Computational Intelligence;2022

5. MaTop: An Evaluative Topic Model for Marathi;Advances in Intelligent Systems and Computing;2022