Integrated Classification Algorithm for Unbalanced Data Streams Based on Joint Nonnegative Matrix Factorization

Author:

Li Jin 1, Zhao Ruibo 1

Affiliation:

1. Tencent Technology Company Limited, Beijing 100086, China

Abstract

This paper studies an integrated classification algorithm for unbalanced data streams based on joint nonnegative matrix factorization (NMF). It addresses the problem that the base clustering results obtained from the original data set lose some information, which reduces the effective information available in the integration stage. Accuracy on unbalanced data and detection time consumption are selected as the objects of study. Six data sets with imbalanced proportions of minority and majority samples are used for the experiments. Mathematical statistical analysis is first used to observe text classification, disease diagnosis, and network intrusion detection, together with the classification accuracy for the majority class and the minority class; the algorithm commonly used for unbalanced data is a statistical analysis method. The univariate method for integrated classification of unbalanced data streams based on NMF is then compared with this commonly used unbalanced-data algorithm, and the changes in accuracy and detection time are observed. The integrated classification algorithm works from a classification of the data: it classifies the data, judges whether two data points belong to the same category, and determines their degree of balance. The research data show that the integrated classification algorithm for unbalanced data streams based on joint NMF can reasonably evaluate a classifier's performance on minority classes, and that its detection is faster and saves more time. The experiments show that the algorithm combines the relationship matrix and the information matrix from the original data set into a consensus function, uses NMF to obtain the membership matrix, and makes effective use of latent information; it improves the accuracy rate (69.73% reported) and shortens the time consumed by 71.65%.
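The abstract does not give the exact form of the consensus function, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the relationship matrix is a co-association matrix built from base clusterings (whether two points fall in the same cluster), that the information matrix is a nonnegative feature matrix, and that the two are tied together by a shared membership factor H. The names `co_association`, `joint_nmf`, and the weight `alpha` are hypothetical and do not come from the paper; the updates are the standard multiplicative rules for the Frobenius objective, not necessarily the authors' own.

```python
import numpy as np


def co_association(base_labels):
    """Relationship matrix (assumed form): entry (i, j) is the fraction of
    base clusterings that place points i and j in the same cluster."""
    base_labels = np.asarray(base_labels)            # shape (m clusterings, n points)
    same = base_labels[:, :, None] == base_labels[:, None, :]
    return same.mean(axis=0)                         # shape (n, n), values in [0, 1]


def joint_nmf(A, X, k, alpha=1.0, iters=200, seed=0, eps=1e-9):
    """Factor A ~ H @ B and X ~ H @ W with one shared nonnegative H.

    Minimising ||A - HB||^2 + alpha * ||X - HW||^2 is the same as running
    ordinary NMF on the column-wise concatenation [A, sqrt(alpha) * X].
    """
    rng = np.random.default_rng(seed)
    M = np.hstack([A, np.sqrt(alpha) * X])           # joint target matrix
    n, m = M.shape
    H = rng.random((n, k)) + eps                     # membership matrix
    V = rng.random((k, m)) + eps                     # stacked [B, sqrt(alpha) * W]
    losses = []
    for _ in range(iters):
        # Multiplicative updates keep both factors nonnegative.
        H *= (M @ V.T) / (H @ (V @ V.T) + eps)
        V *= (H.T @ M) / ((H.T @ H) @ V + eps)
        losses.append(float(np.linalg.norm(M - H @ V) ** 2))
    return H, losses


# Toy data: three base clusterings of six points, plus nonnegative features.
base = [[0, 0, 0, 1, 1, 1],
        [0, 0, 1, 1, 1, 1],
        [1, 1, 1, 0, 0, 0]]
X = np.array([[1.0, 0.1], [0.9, 0.0], [0.8, 0.2],
              [0.1, 1.0], [0.0, 0.9], [0.2, 0.8]])

A = co_association(base)
H, losses = joint_nmf(A, X, k=2)
labels = H.argmax(axis=1)                            # hard assignment from memberships
```

Reading a hard label off the largest entry in each row of H is one simple choice; because NMF factors are only determined up to column scaling, rows of H may need normalising before they are interpreted as membership degrees.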

Publisher

Hindawi Limited

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Information Systems

