Building a Technology Recommender System Using Web Crawling and Natural Language Processing Technology

Published: 2022-08-03 Issue: 8 Volume: 15 Page: 272
ISSN: 1999-4893
Container-title: Algorithms
Language: en
Short-container-title: Algorithms

Author:

Campos Macias Nathalie, Düggelin Wilhelm, Ruf Yesim, Hanne Thomas

Abstract

Finding, retrieving, and processing information on technology from the Internet can be a tedious task. This article investigates whether technological concepts such as web crawling and natural language processing are suitable means for knowledge discovery from unstructured information and for the development of a technology recommender system, by developing a prototype of such a system. It also analyzes how well the resulting prototype performs with regard to effectiveness and efficiency. The research strategy, based on design science research, consists of four stages: (1) awareness generation; (2) suggestion of a solution considering the information retrieval process; (3) development of an artefact in the form of a Python computer program; and (4) evaluation of the prototype within the scope of a comparative experiment. The evaluation shows that the prototype is highly efficient in retrieving basic and rather random extractive text summaries from websites that include the desired search terms. However, its effectiveness, measured by the quality of results, is unsatisfactory due to the aforementioned random arrangement of extracted sentences within the resulting summaries. Natural language processing and web crawling are found to be suitable technologies for such a program, while the use of additional technologies and concepts would add significant value for a potential user. Several areas for incremental improvement of the prototype are identified.

Publisher

MDPI AG

Subject

Computational Mathematics, Computational Theory and Mathematics, Numerical Analysis, Theoretical Computer Science

Link

https://www.mdpi.com/1999-4893/15/8/272/pdf

References: 42 articles.

1. A content-based recommender system for computer science publications

2. Personalized Academic Research Paper Recommendation System http://arxiv.org/abs/1304.5457

3. Keyword weight optimization using gradient strategies in event focused web crawling

4. Sentiment-Focused Web Crawling

5. An intelligent system for focused crawling from Big Data sources

Cited by 6 articles.

1. Collection and Preprocessing of Data for LLM in the Kazakh Language in the Field of Legislation;Communications in Computer and Information Science;2024

2. An Artificial-Intelligence-Driven Spanish Poetry Classification Framework;Big Data and Cognitive Computing;2023-12-14

3. Utilizing Python for Web Scraping and Incremental Data Extraction;2023 2nd International Conference on Automation, Computing and Renewable Systems (ICACRS);2023-12-11

4. Multilingual Text Summarization for German Texts Using Transformer Models;Information;2023-05-25

5. An App-Based Recommender System Based on Contrasting Automobiles;Processes;2023-03-15