Affiliation:
1. Tencent Technology Company Limited, Beijing 100086, China
Abstract
This paper studies an integrated classification algorithm for unbalanced data flows based on joint nonnegative matrix factorization (NMF). It addresses the problem that basic clustering results obtained from the original data set lose some information, which reduces the effective information available in the integration stage. Classification accuracy on unbalanced data and detection time consumption are the quantities studied, and six data sets with imbalanced proportions of minority and majority samples are selected for the experiments. Mathematical statistical analysis is first used to examine text classification, disease diagnosis, and network intrusion detection, together with the classification accuracy on the majority and minority classes; statistical analysis is the method commonly used for unbalanced data. The proposed NMF-based integrated classification method is then compared with this commonly used unbalanced data algorithm, and the changes in accuracy and detection time are observed. The proposed algorithm works from a classification of the data: it judges whether two data points belong to the same category and determines their degree of balance. The results show that the unbalanced data flow integrated classification algorithm based on joint nonnegative matrix factorization can reasonably evaluate a classifier's performance on minority classes, and that it detects faster and saves time. Experimentally, the algorithm combines the relationship matrix and the information matrix from the original data set into a consensus function and uses NMF to obtain the membership matrix, making effective use of latent information; it improves the accuracy rate by 69.73% and shortens the time consumed by 71.65%.
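The consensus step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that the "relationship matrix" is a co-association matrix (the fraction of base clusterings that place two points in the same cluster), that the "information matrix" is a nonnegative feature matrix, and that the joint factorization shares one membership matrix across both. Under those assumptions, the joint objective equals an ordinary NMF of the two matrices placed side by side. All function names, the `weight` parameter, and the toy data are hypothetical.

```python
import numpy as np

def co_association(labelings):
    """Fraction of base clusterings that put each pair of points together."""
    labelings = np.asarray(labelings)  # shape (n_clusterings, n_points)
    same = labelings[:, :, None] == labelings[:, None, :]
    return same.mean(axis=0)           # shape (n_points, n_points)

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Lee-Seung multiplicative updates for V ~ W @ H (Frobenius loss)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def joint_nmf_membership(M, X, k, weight=1.0, **kwargs):
    """Shared-W factorization of M and X (assumed formulation).

    ||M - W H1||^2 + weight * ||X - W H2||^2 is the Frobenius loss of
    a single NMF on the concatenation [M, sqrt(weight) * X].
    """
    V = np.hstack([M, np.sqrt(weight) * X])
    W, H = nmf(V, k, **kwargs)
    # Move each factor's scale into W so that columns are comparable.
    return W * np.linalg.norm(H, axis=1)

# Toy data (hypothetical): three base clusterings of six points,
# the third of which misplaces point 2.
base = [[0, 0, 0, 1, 1, 1],
        [0, 0, 0, 1, 1, 1],
        [0, 0, 1, 1, 1, 1]]
X = np.array([[1.0, 0.1], [0.9, 0.0], [0.8, 0.2],
              [0.1, 1.0], [0.0, 0.9], [0.2, 0.8]])

M = co_association(base)
W = joint_nmf_membership(M, X, k=2)
labels = W.argmax(axis=1)  # hard assignment from the membership matrix
print(labels)
```

Each row of `W` acts as a soft membership vector, and `argmax` turns it into a hard cluster label. Handling class imbalance (for example, reweighting minority samples) is outside the scope of this sketch.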
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Information Systems