1. URL-based web page classification – a new method for URL-based web page classification using n-gram language models;Abdallah,2014
2. Alexa, 2012. Alexa top 500 sites ranking. http://www.alexa.com/topsites (accessed 1.03.12).
3. Do not crawl in the dust: different URLs with similar text;Bar-Yossef;Trans. Web,2009
4. Template detection via data mining and its applications;Bar-Yossef,2002
5. A comprehensive study of features and algorithms for URL-based topic classification;Baykan;Trans. Web,2011