1. MapReduce: simplified data processing on large clusters;Dean;Commun. ACM,2008
2. Apache Spark website, http://spark.apache.org/.
3. Apache Hadoop website, http://hadoop.apache.org/.
4. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing;Zaharia,2012
5. MRapid: An efficient short job optimizer on Hadoop;Zhang,2017