1. Hobbs, J., Appelt, D., Bear, J.: A Cascaded Finite-State Transducer for Extracting Information from Natural Language Text. In: Finite State Devices for Natural Language Processing. MIT Press, Cambridge (2002)
2. Gupta, S., Kaiser, G., Neistadt, D.: DOM-based content extraction of HTML documents. In: Proc. of 12th Int. World Wide Web Conference, pp. 207–214. ACM Press, New York (2003)
3. McKeown, K.R., Barzilay, R., Evans, D., Hatzivassiloglou, V., et al.: Columbia Multi-document Summarization: Approach and Evaluation. In: Document Understanding Conf., pp. 156–172 (2006)
4. Kaasinen, E., Aaltonen, M., Kolari, J., Melakoski, S., et al.: Two Approaches to Bringing Internet Services to WAP Devices. In: Proc. of 9th Int. World Wide Web Conf., pp. 342–348 (2004)
5. Li, X., Shi, Z.: Innovating web page classification through reducing noise. Journal of Computer Science & Technology 17(1), 9–17 (2007)