1. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. In: OSDI 2004, p. 10 (2004)
2. Tan, P.-N., Steinbach, M., Kumar, V.: Introduction to Data Mining. Addison-Wesley (2006)
3. Borthakur, D.: The Hadoop distributed file system: architecture and design. Hadoop Project 11, 21 (2007)
4. Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M.J., Shenker, S., Stoica, I.: Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: NSDI 2012, p. 2 (2012)
5. The Apache Mahout machine learning library (2013). http://mahout.apache.org/