1. https://hadoop.apache.org/
2. http://www.cs.rpi.edu/~zaki/Workshops/FIMI/data/
3. Babu, S.: Towards automatic optimization of MapReduce programs. In: Proceedings of the 1st ACM Symposium on Cloud Computing, pp. 137–142. ACM (2010)
4. Bansal, G., Gupta, A., Pyne, U., Singhal, M., Banerjee, S.: A framework for performance analysis and tuning in Hadoop-based clusters. In: Smarter Planet and Big Data Analytics Workshop (SPBDA 2014), held in conjunction with International Conference on Distributed Computing and Networking (ICDCN 2014), Coimbatore, India (2014)
5. Barry, D., Tinetti, F.G., Real, I., Jaramillo, R.: Hadoop scalability and performance testing in heterogeneous clusters, July 2015