Author:
Rosid Mochamad Alfan, Fitrani Arif Senja, Astutik Ika Ratna Indra, Mulloh Nasrudin Iqrok, Gozali Haris Ahmad
Abstract
Text preprocessing is a required stage in text mining. It selects and cleans the data in each document through case folding, tokenizing, filtering, and stemming. The results of preprocessing can affect the accuracy of document classification. In Indonesian-language (Bahasa Indonesia) documents, over-stemming and under-stemming still occur often, so the stemming process needs improvement. This study proposes using the Sastrawi library to improve on previous studies whose preprocessing results were not yet optimal, especially in the filtering and stemming steps. The results show that the Sastrawi library reduces over-stemming and under-stemming and has a faster processing time than the Tala stemmer.
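The four preprocessing steps named above can be sketched as a small Python pipeline. This is only an illustration under stated assumptions, not the authors' implementation: `preprocess`, `load_sastrawi_stemmer`, and `DEMO_STOPWORDS` are hypothetical names, the stop word list is a tiny hand-picked sample, and the stemmer is loaded from the PySastrawi package only if it happens to be installed (otherwise tokens pass through unstemmed).

```python
import re
from typing import Callable, FrozenSet, List

# Tiny illustrative stop word list (an assumption, not Sastrawi's own list).
DEMO_STOPWORDS: FrozenSet[str] = frozenset(
    {"yang", "dan", "di", "ke", "dari", "untuk"}
)


def load_sastrawi_stemmer() -> Callable[[str], str]:
    """Return Sastrawi's stem function if PySastrawi is installed.

    Falls back to an identity function so the sketch still runs without it.
    """
    try:
        from Sastrawi.Stemmer.StemmerFactory import StemmerFactory
    except ImportError:
        return lambda word: word
    return StemmerFactory().create_stemmer().stem


def preprocess(
    text: str,
    stem: Callable[[str], str],
    stopwords: FrozenSet[str] = DEMO_STOPWORDS,
) -> List[str]:
    """Apply case folding, tokenizing, filtering, and stemming in order."""
    folded = text.lower()                              # case folding
    tokens = re.findall(r"[a-z]+", folded)             # tokenizing
    kept = [t for t in tokens if t not in stopwords]   # filtering
    return [stem(t) for t in kept]                     # stemming


if __name__ == "__main__":
    stemmer = load_sastrawi_stemmer()
    print(preprocess("Buku yang Dibaca DI perpustakaan", stemmer))
```

Passing the stemmer in as a plain function keeps the pipeline independent of any one library, so a different stemmer can be swapped in with the same call to compare stemming quality and processing time.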