1. [n.d.]. AggregateByKey. https://spark.apache.org/docs/2.1.1/api/java/org/apache/spark/rdd/PairRDDFunctions.html. [n.d.]. AggregateByKey. https://spark.apache.org/docs/2.1.1/api/java/org/apache/spark/rdd/PairRDDFunctions.html.
2. [n.d.]. Hadoop. http://hadoop.apache.org/. [n.d.]. Hadoop. http://hadoop.apache.org/.
3. [n.d.]. Spark. https://spark.apache.org/. [n.d.]. Spark. https://spark.apache.org/.
4. Dynamic program slicing
5. Techniques for efficiently querying scientific workflow provenance graphs