Data Quality Assessment and Recommendation of Feature Selection Algorithms: An Ontological Approach
-
Published:2023-04-20
Issue:
Volume:
Page:
-
ISSN:1544-5976
-
Container-title:Journal of Web Engineering
-
language:
-
Short-container-title:JWE
Author:
Nayak Aparna,Božić Bojan,Longo Luca
Abstract
Feature selection plays an important role in machine learning and data mining problems. Identifying the best feature selection algorithm that helps to remove irrelevant and redundant features is a complex task. This research tries to address it by recommending a feature selection algorithm based on dataset meta-features. The main contribution of the work is the use of Semantic Web principles to develop a recommendation model for the feature selection algorithm. As a result, dataset meta-features are modeled in a domain ontology, and a set of Semantic Web rule language (SWRL) predictive rules have been proposed to recommend a feature selection algorithm. The result of this research is a feature selection algorithm recommendation based on the data characteristics and quality (FSDCQ) ontology, which not only helps with recommendations but also finds the data points with data quality violations. An experiment is conducted on the classification datasets from the UCI repository to evaluate the proposed ontology. The usefulness and effectiveness of the proposed method is evaluated by comparing it with the widely used method in the literature for the recommendation. Results show that the ontology-based recommendations are equally good as the widely used recommendation model, which is k-NN, with added benefits.
Publisher
River Publishers
Subject
Computer Networks and Communications,Information Systems,Software
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
1. AI-Powered Data Governance: A Cutting-Edge Method for Ensuring Data Quality for Machine Learning Applications;2024 Second International Conference on Emerging Trends in Information Technology and Engineering (ICETITE);2024-02-22