1. Baroni, M., Chantree, F., Kilgarriff, A., Sharoff, S.: CleanEval: a competition for cleaning web pages. In: LREC (2008)
2. Bauer, D., Degen, J., Deng, X., Herger, P., Gasthaus, J., Giesbrecht, E., Jansen, L., Kalina, C., Krüger, T., Märtin, R., Schmidt, M., Scholler, S., Steger, J., Stemle, E., Evert, S.: FIASCO: filtering the internet by automatic subtree classification, Osnabrück. In: Building and Exploring Web Corpora: Proceedings of the 3rd Web as Corpus Workshop, Incorporating CleanEval, vol. 4, pp. 111–121 (2007)
3. Chakrabarti, D., Kumar, R., Punera, K.: Page-level template detection via isotonic smoothing. In: Proceedings of the 16th International Conference on World Wide Web, pp. 61–70. ACM (2007)
4. Chakrabarti, D., Kumar, R., Punera, K.: A graph-theoretic approach to webpage segmentation. In: Proceedings of the 17th International Conference on World Wide Web, pp. 377–386. ACM (2008)
5. Collins-Thompson, K., Bennett, P., Diaz, F., Clarke, C., Voorhees, E.: Overview of the TREC 2013 web track. In: Proceedings of the 22nd Text Retrieval Conference (TREC 2013) (2013)