1. Apache hadoop. http://hadoop.apache.org/. Apache hadoop. http://hadoop.apache.org/.
2. The Stratosphere platform for big data analytics
3. Avoid groupbykey. http://databricks.gitbooks.io/databricks-spark-knowledge-base/content/best_practices/prefer_reducebykey_over_groupbykey.html. Avoid groupbykey. http://databricks.gitbooks.io/databricks-spark-knowledge-base/content/best_practices/prefer_reducebykey_over_groupbykey.html.
4. Compiler transformations for high-performance computing