1. Hu, G., & Zhao, Q. (2010). Study to eliminating noisy information in web pages based on data mining. Sixth International Conference on Natural Computation (ICNC 2010) (pp. 660–663).
2. Nithya, P., & Sumathi, P. (2012). Novel pre-processing technique for web log mining by removing global noise and web robots. National Conference on Computing and Communication Systems (NCCCS) (pp. 1–5). doi:10.1109/NCCCS.2012.6412976.
3. Lan, Y., Bing, L., & Xiaoli, L. X. (2003). Eliminating noisy information from web pages for data mining. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 296–305).
4. Android Development Tools. http://www.mkyong.com/java/jsoup-html-parser-hello-world-examples.
5. Suhit, G., Gail, K., David, N., & Peter, G. (2003). DOM-based content extraction of HTML documents. In Proceedings of the 12th International Conference on World Wide Web (WWW '03) (pp. 207–214).