Web Page Classification Algorithm Based on Deep Learning

Author:

Yu Yuanhui 1

Affiliation:

1. School of Computer Engineering, JiMei University, Xiamen 361021, Fujian, China

Abstract

Deep learning (DL) transmits and processes information to establish a learning mechanism, and it has made the processing of image and sound data practical. However, current research on DL-based Web page classification algorithms (WPCA) is not in-depth, and this article therefore studies WPCA based on DL. First, a keyword weight calculation method reduces the impact of a small number of high-frequency words in a web page document on the weight calculation and lowers the weights of low-frequency words, so that the WPCA computes more accurately. Second, a Chinese web page classification method calculates the similarity between the text to be classified and every class template and then assigns each text a category according to that similarity and certain classification rules. Finally, to improve the learning rate of DL, an adaptive-parameter optimization algorithm automatically adjusts the size of the learning rate, making the DL-based WPCA more efficient. In a comparison with the traditional algorithm, the DL-based WPCA takes 354 s against 2436 s in time expenditure, and its memory overhead is reported as 6.35 s against 186.25 s. The experimental results show that the DL-based WPCA is faster and more efficient than the traditional algorithm and consumes less system memory.
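
The template-matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula or the classification rules, so everything specific here is an assumption: the damped weight 1 + log(tf), cosine similarity as the similarity measure, the "pick the most similar template" rule, the function names, and the toy English token lists (real Chinese pages would first need word segmentation).

```python
import math
from collections import Counter


def damped_weights(tokens):
    """Map each term to a unit-length weight of 1 + log(tf).

    Assumed weighting, not the paper's formula: the logarithm keeps a few
    very frequent words from dominating the vector.
    """
    counts = Counter(tokens)
    weights = {term: 1.0 + math.log(tf) for term, tf in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {term: w / norm for term, w in weights.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (a dot product)."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def classify(tokens, templates):
    """Score a document against every class template and pick the best one.

    The 'highest similarity wins' rule is a hypothetical stand-in for the
    classification rules mentioned in the abstract.
    """
    doc = damped_weights(tokens)
    scores = {label: cosine(doc, tmpl) for label, tmpl in templates.items()}
    return max(scores, key=scores.get), scores


# Toy class templates built from hand-picked keyword lists (illustrative only).
templates = {
    "sports": damped_weights("match team goal team score league".split()),
    "finance": damped_weights("stock market bank price market fund".split()),
}

label, scores = classify("team wins the match with a late goal".split(), templates)
print(label, scores)
```

In this sketch a document that shares no terms with a template scores zero against it, so the choice of keywords in each template matters as much as the weighting.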

Funder

Ministry of Education of the People's Republic of China

Publisher

Hindawi Limited

Subject

General Mathematics, General Medicine, General Neuroscience, General Computer Science

Cited by 3 articles.

1. Analyzing the likeness of a person based on DNS logs using machine learning;2023 International Conference on Signal Processing, Computation, Electronics, Power and Telecommunication (IConSCEPT);2023-05-25

2. Web Page Prediction Model using Machine Learning Approaches: A Review;2023 International Conference on Science, Engineering and Business for Sustainable Development Goals (SEB-SDG);2023-04-05

3. Contextual Embeddings-Based Web Page Categorization Using the Fine-Tune BERT Model;Symmetry;2023-02-02
