Affiliation:
1. College of Computer Science and Information Systems, Najran University, Najran, Saudi Arabia
Abstract
With the evaluation of the software industry, a huge number of software applications are designing, developing, and uploading to multiple online repositories. To find out the same type of category and resource utilization of applications, researchers must adopt manual working. To reduce their efforts, a solution has been proposed that works in two phases. In first phase, a semantic analysis-based keywords and variables identification process has been proposed. Based on the semantics, designed a dataset having two classes: one represents application type and the other corresponds to application keywords. Afterward, in second phase, input preprocessed dataset to manifold machine learning techniques (Decision Table, Random Forest, OneR, Randomizable Filtered Classifier, Logistic model tree) and compute their performance based on TP Rate, FP Rate, Precision, Recall, F1-Score, MCC, ROC Area, PRC Area, and Accuracy (%). For evaluation purposes, We have used an R language library called latent semantic analysis for creating semantics, and the Weka tool is used for measuring the performance of algorithms. Results show that the random forest depicts the highest accuracy which is 99.3% due to its parametric function evaluation and less misclassification error.
Subject
Artificial Intelligence,General Engineering,Statistics and Probability
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献