Application of non parametric Bayesian methods in high dimensional data

Author:

Xia Yunqing

Abstract

With advances in technology and the widespread collection of data, high-dimensional data analysis has become a research hotspot in many fields. Traditional parametric methods often run into problems such as the curse of dimensionality when applied to high-dimensional data. Nonparametric methods are promising for such data because they do not rely on assumptions about a specific parametric distribution. Bayesian methods are well suited to handling noise and outliers in high-dimensional data because they take uncertainty into account. Combining nonparametric and Bayesian methods is therefore of considerable value for high-dimensional data analysis. In this paper, nonparametric Bayesian methods are applied to the analysis of high-dimensional data: a Dirichlet process mixture model is used for clustering, a nonparametric Bayesian regression prediction model is used for regression analysis, and a nonparametric Bayesian method based on a Bayesian sparse linear model is used for feature selection. To determine whether nonparametric Bayesian methods outperform traditional parametric methods in high-dimensional data analysis, simulation experiments were conducted on five aspects (cluster analysis, classification analysis, regression analysis, feature selection, and anomaly detection) and evaluated using multiple indicators.
The experimental results show that the clustering accuracy of the nonparametric Bayesian clustering algorithm was 0.93, and the accuracy of the nonparametric Bayesian classification algorithm was between 0.93 and 0.99; the coefficient of determination of the nonparametric Bayesian regression algorithm was 0.98; and the F1 values of the nonparametric Bayesian methods in anomaly detection ranged from 0.86 to 0.91, which was superior to traditional methods. Nonparametric Bayesian methods have broad application prospects in high-dimensional data analysis and can be applied to tasks such as clustering, classification, and regression.
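
As an illustration only, and not the implementation used in the paper, the following minimal sketch shows why a Dirichlet process mixture model does not need the number of clusters fixed in advance. It samples cluster labels from the Chinese restaurant process, the prior over partitions induced by a Dirichlet process. The function name `crp_assignments` and the concentration parameter name `alpha` are assumptions introduced here; a full clustering model would also attach a likelihood to each cluster.

```python
import random
from collections import Counter


def crp_assignments(n_points, alpha, seed=0):
    """Sample cluster labels for n_points from a Chinese restaurant process.

    alpha > 0 is the concentration parameter: larger values tend to
    open more clusters. (Hypothetical helper, for illustration only.)
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of points currently in cluster k
    for i in range(n_points):
        # An existing cluster k is chosen with probability counts[k] / (i + alpha);
        # a new cluster is opened with probability alpha / (i + alpha).
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels


labels = crp_assignments(n_points=200, alpha=1.0)
print("clusters used:", len(set(labels)))
print("largest clusters:", Counter(labels).most_common(3))
```

The number of occupied clusters is itself random and grows slowly with the number of points, which is the property that lets such models adapt their complexity to the data.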

Publisher

IOS Press

