1. J. Dean, S. Ghemawat, MapReduce: simplified data processing on large clusters, in: Proc. USENIX Conf. on Operating Systems Design and Implementation (OSDI), 2004, pp. 137–150.
2. M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M.J. Franklin, S. Shenker, I. Stoica, Resilient Distributed Datasets: a fault-tolerant abstraction for in-memory cluster computing, in: Proc. USENIX Conf. on Networked Systems Design and Implementation (NSDI), 2012.
3. M. Armbrust, R.S. Xin, C. Lian, Y. Huai, D. Liu, J.K. Bradley, X. Meng, T. Kaftan, M.J. Franklin, A. Ghodsi, et al., Spark SQL: Relational data processing in Spark, in: Proc. ACM Int. Conf. on Management of Data (SIGMOD), 2015, pp. 1383–1394.
4. X. Meng, et al., MLlib: machine learning in Apache Spark, J. Machine Learning Res., 2016.
5. S. Chaudhuri, V. Narasayya, AutoAdmin “what-if” index analysis utility, SIGMOD Rec., 1998.