Abstract
In this article, we offer suggestions for detecting anomalies in data as it moves from the source to the data warehouse. The aim is to prevent dirty and noisy data from entering the data warehouse. We argue that keeping the data in the data warehouse clean and healthy makes the processed data used for data science more resistant to anomalies. To this end, we carried out studies on data from the retail sector. We tested our theoretical ideas on cases such as erroneous user login data in the retail and energy industries, abnormal sales by employees during campaign periods, product stock anomalies, and incorrect pricing. Many of the studies we reviewed perform anomaly detection after estimation. We propose instead that detecting anomalies before the data is taken from the source into the data warehouse is more efficient and healthier. The analysis and results were evaluated on data obtained from the wiseboard retail project of the company Gtech.
Publisher
Orclever Science and Research Group