1. Optimizing parallel algorithms for all pairs similarity search
2. Arvind Arasu, Venkatesh Ganti, and Raghav Kaushik. Efficient exact set-similarity joins. In VLDB'06.
3. Language Technologies Institute at Carnegie Mellon University. The clueweb09 dataset. http://boston.lti.cs.cmu.edu/data/clueweb09.