1. Shvachko K, Kuang H, Radia S, Chansler R. The hadoop distributed file system. In: Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST). MSST’10. Washington, DC, USA: IEEE Computer Society; 2010. p. 1-10.
2. Zaharia M, Chowdhury M, Das T, Dave A, Ma J, McCauley M, et al. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation. NSDI’12. Berkeley, CA, USA: USENIX Association; 2012. p. 2-2.
3. Dean J, Ghemawat S. Mapreduce: Simplified data processing on large clusters. Commun ACM 2008; 51(1):107-113.
4. Low Y, Bickson D, Gonzalez J, Guestrin C, Kyrola A, Hellerstein JM. Distributed graphlab: A framework for machine learning and data mining in the cloud. Proc VLDB Endow 2012; 5(8):716-727.
5. Malewicz G, Austern MH, Bik AJ, Dehnert JC, Horn I, Leiser N, et al. Pregel: A system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data. SIGMOD ‘10. New York, NY, USA: ACM. 2010. p. 135-146.