Author:
Jung Geunseong, Cha Jaehyuk
Funder
National Research Foundation of Korea
Subject
General Earth and Planetary Sciences, General Environmental Science
References: 22 articles.
1. H. Han and T. Tokuda, “A personal web information/knowledge retrieval system,” in Information Modelling and Knowledge Bases XIX, Amsterdam: IOS Press, pp. 338-345, 2008.
2. Y. Yesilada, “Web page segmentation: A review,” EMINE Technical Report. Middle East Tech. Univ. Northern Cyprus Campus. Deliverable 0 (D0), 1-39, March 2011.
3. Browserless Web Data Extraction
4. Web2Text: Deep Structured Boilerplate Removal
5. A. Barbaresi, “Trafilatura: A web scraping library and command-line tool for text discovery and extraction,” in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations, Virtual event, pp. 122-131, August 2021.