Abstract
Access plan recommendation is a query optimization approach that executes new queries using previously created query execution plans (QEPs). In this method, the query optimizer divides the query space into clusters. However, traditional clustering algorithms need considerable execution time to cluster large query datasets. The MapReduce distributed computing model provides efficient solutions for storing and processing vast quantities of data. In the present investigation, the Apache Spark and Apache Hadoop frameworks are used to cluster query datasets of different sizes in the MapReduce-based access plan recommendation method. Performance is evaluated based on execution time. The experimental results demonstrate the effectiveness of parallel query clustering in achieving high scalability. Furthermore, Apache Spark achieved better performance than Apache Hadoop, reaching an average speedup of 2x.
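To make the idea of MapReduce-style query clustering concrete, here is a minimal single-machine sketch in plain Python. The abstract does not name the clustering algorithm or the query encoding, so the choice of k-means, the 2-D feature vectors, and every function name below are assumptions made for illustration only, not the authors' method. In Spark or Hadoop the map and reduce phases would run in parallel over partitions of the query dataset.

```python
from math import dist


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, (point, 1)) for every point."""
    for p in points:
        idx = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield idx, (p, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum vectors and counts per cluster key, then average them."""
    sums = {}
    for idx, (p, n) in pairs:
        acc, count = sums.get(idx, ((0.0,) * len(p), 0))
        sums[idx] = (tuple(a + x for a, x in zip(acc, p)), count + n)
    # A centroid that received no points keeps its previous position.
    return [
        tuple(s / sums[i][1] for s in sums[i][0]) if i in sums else c
        for i, c in enumerate(centroids)
    ]


def kmeans_step(points, centroids):
    """One k-means iteration expressed as a map phase and a reduce phase."""
    return reduce_phase(map_phase(points, centroids), centroids)


# Hypothetical 2-D feature vectors standing in for encoded queries.
queries = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centroids = kmeans_step(queries, [(1.0, 1.0), (9.0, 9.0)])
```

Repeating `kmeans_step` until the centroids stop moving gives the full algorithm. Only the per-cluster sums and counts cross the map/reduce boundary, which is what lets this kind of computation be distributed across workers.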
Subject
General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)
Cited by
5 articles.