Predictive intelligence of reliable analytics in distributed computing environments

Published: 2020-05-14 Issue: 10 Volume: 50 Pages: 3219-3238
ISSN: 0924-669X
Container-title: Applied Intelligence
Language: en
Short-container-title: Appl Intell

Authors:

Kathidjiotis Yiannis, Kolomvatsos Kostas, Anagnostopoulos Christos

Abstract

Lack of knowledge of the underlying data distribution in distributed large-scale data can be an obstacle when issuing analytics and predictive modelling queries. Analysts often struggle to find analytics/exploration queries that satisfy their needs. In this paper, we study how exploration query results can be predicted in order to avoid executing ‘bad’/non-informative queries that waste network, storage, and financial resources, as well as time, in a distributed computing environment. The proposed methodology clusters a training set of exploration queries together with the cardinality of the results (score) they retrieved, and then uses query-centroid representatives to make predictions. After the training phase, we propose a novel refinement process that increases the reliability of predicting the score of new, unseen queries based on the refined query representatives. Comprehensive experimentation with real datasets shows that predictions are more reliable after the proposed refinement method, which increases the reliability of the closest centroid and improves predictability under the right circumstances.
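
The clustering-then-prediction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each query is a small numeric vector, uses plain k-means (Lloyd's algorithm) over joint (query, score) vectors, and predicts the score stored with the closest query centroid. The function names `kmeans`, `train`, and `predict` are hypothetical, and the refinement process is left out because the abstract does not describe how it works.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; an assumption, the paper may cluster differently."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Mean of each group; an empty group keeps its previous centroid.
        new = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new == centroids:
            break
        centroids = new
    return centroids


def train(queries, scores, k):
    """Cluster queries together with their scores.

    Returns a list of (query_centroid, score) representatives.
    Assumes query coordinates and scores are on comparable scales;
    otherwise the score dominates the distance and should be rescaled.
    """
    joint = [list(q) + [s] for q, s in zip(queries, scores)]
    return [(c[:-1], c[-1]) for c in kmeans(joint, k)]


def predict(query, representatives):
    """Score of the representative whose query centroid is closest."""
    i = nearest(query, [q for q, _ in representatives])
    return representatives[i][1]


# Toy training set: two groups of 2-D queries with low and high cardinality.
queries = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
scores = [10, 12, 8, 100, 102, 98]
representatives = train(queries, scores, k=2)
print(predict((0.5, 0.5), representatives))    # low-score group
print(predict((10.5, 10.5), representatives))  # high-score group
```

A new query is never executed here: its score is read off the nearest representative, which is what allows a likely non-informative query to be skipped before it consumes resources.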

Funder

H2020 Marie Skłodowska-Curie Actions

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence

Link

http://link.springer.com/content/pdf/10.1007/s10489-020-01712-5.pdf

References (22 articles)

1. Aboulnaga A, Chaudhuri S (1999) Self-tuning histograms: building histograms without looking at data, pp 181–192

2. Anagnostopoulos C, Savva F, Triantafillou P (2018) Scalable aggregation predictive analytics. Appl Intell 48(9):2546–2567

3. Anagnostopoulos C, Triantafillou P (2015) Learning to accurately count with query-driven predictive analytics. In: 2015 IEEE international conference on big data (big data), pp 14–23