Author:
Alarte Julian, Insa David, Silva Josep, Tamarit Salvador
Publisher
Springer International Publishing
References
23 articles.
1. Adam, G., Bouras, C., Poulopoulos, V.: CUTER: an efficient useful text extraction mechanism. In: 2009 International Conference on Advanced Information Networking and Applications Workshops, pp. 703–708, May 2009
2. Alarte, J., Insa, D., Silva, J., Tamarit, S.: Automatic detection of webpages that share the same web template. In: ter Beek, M.H., Ravara, A. (eds.) Proceedings of the 10th International Workshop on Automated Specification and Verification of Web Systems (WWV 2014). Electronic Proceedings in Theoretical Computer Science, vol. 163, pp. 2–15. Open Publishing Association, July 2014
3. Lecture Notes in Computer Science;J Alarte,2016
4. Bar-Yossef, Z., Rajagopalan, S.: Template detection via data mining and its applications. In: Proceedings of the 11th International Conference on World Wide Web (WWW 2002), pp. 580–591. ACM, New York (2002)
5. Baroni, M., Chantree, F., Kilgarriff, A., Sharoff, S.: Cleaneval: a competition for cleaning web pages. In: Proceedings of the International Conference on Language Resources and Evaluation (LREC 2008), pp. 638–643. European Language Resources Association, May 2008
Cited by
3 articles.
1. Scraping Relevant Images from Web Pages without Download;ACM Transactions on the Web;2023-10-11
2. An Empirical Comparison of Web Content Extraction Algorithms;Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval;2023-07-18
3. Page-Level Main Content Extraction From Heterogeneous Webpages;ACM Transactions on Knowledge Discovery from Data;2021-06-28