Improving Kurdish Web Mining through Tree Data Structure and Porter’s Stemmer Algorithms

Published:2018-06-30 Issue:1 Volume:2 Page:48-54
ISSN:2520-7792
Container-title:UKH Journal of Science and Engineering
Short-container-title:UKH J SCI ENG

Author:

Saeed Ari M., Rashid Tarik A., Mustafa Arazo M., Fattah Polla, Ismael Birzo

Abstract

Stemming is one of the most important preprocessing techniques for improving the accuracy of text classification. Its key purpose is to merge words that share the same stem, which reduces the high dimensionality of the feature space. A smaller feature space shortens the time needed to construct a model and lowers memory use. In this paper, a new stemming approach is explored for enhancing Kurdish text classification performance. A tree data structure and Porter’s stemmer algorithm are combined to build the proposed approach. The system is assessed using Support Vector Machine (SVM) and Decision Tree (C4.5) classifiers to compare performance before and after applying the suggested stemmer. Furthermore, the usefulness of stop words is considered before and after applying the suggested approach.
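
The dimensionality argument in the abstract can be illustrated with a minimal sketch. It is not the authors' tree-based Kurdish stemmer: the suffix list, the function names `toy_stem` and `vocabulary`, and the English tokens are all assumptions chosen only to show how merging words with a shared stem shrinks the set of distinct features.

```python
# Hypothetical suffix list for illustration only; these are not the
# paper's Kurdish rules and not the full Porter algorithm.
SUFFIXES = ("ing", "ed", "s")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def vocabulary(tokens, stem=False):
    """Return the set of distinct features, optionally after stemming."""
    return {toy_stem(t) if stem else t for t in tokens}


tokens = ["walk", "walks", "walked", "walking", "talk", "talks", "talked"]
raw = vocabulary(tokens)                 # one feature per surface form
stemmed = vocabulary(tokens, stem=True)  # one feature per stem
print(len(raw), len(stemmed))            # prints "7 2"
```

Each distinct feature becomes one dimension of the vector passed to a classifier, so in this toy case the feature space drops from seven dimensions to two.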

Publisher

University of Kurdistan Hewler

Cited by 5 articles.

1. Gigant-KTTS dataset: Towards building an extensive gigant dataset for Kurdish text-to-speech systems;Data in Brief;2024-08

2. An abstractive text summarization technique using transformer model with self-attention mechanism;Neural Computing and Applications;2023-06-01

3. CKMorph: a comprehensive morphological analyzer for Central Kurdish;International Journal of Digital Humanities;2023-01-30

4. Medical dataset classification for Kurdish short text over social media;Data in Brief;2022-06

5. Empirical evaluation and study of text stemming algorithms;Artificial Intelligence Review;2020-04-15