Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training

Published: 2021-10-21
Issue: 43
Volume: 118
Page: e2103091118
ISSN: 0027-8424
Container-title: Proceedings of the National Academy of Sciences
Language: en
Short-container-title: Proc Natl Acad Sci USA

Authors:

Fang Cong, He Hangfeng, Long Qi, Su Weijie J.

Abstract

In this paper, we introduce the Layer-Peeled Model, a nonconvex, yet analytically tractable, optimization program, in a quest to better understand deep neural networks that are trained for a sufficiently long time. As the name suggests, this model is derived by isolating the topmost layer from the remainder of the neural network, followed by imposing certain constraints separately on the two parts of the network. We demonstrate that the Layer-Peeled Model, albeit simple, inherits many characteristics of well-trained neural networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep-learning training. First, when working on class-balanced datasets, we prove that any solution to this model forms a simplex equiangular tight frame, which, in part, explains the recently discovered phenomenon of neural collapse [V. Papyan, X. Y. Han, D. L. Donoho, Proc. Natl. Acad. Sci. U.S.A. 117, 24652–24663 (2020)]. More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto-unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep-learning models on the minority classes. In addition, we use the Layer-Peeled Model to gain insights into how to mitigate Minority Collapse. Interestingly, this phenomenon is first predicted by the Layer-Peeled Model before being confirmed by our computational experiments.

Publisher

Proceedings of the National Academy of Sciences

Subject

Multidisciplinary

References: 67 articles.

1. ImageNet classification with deep convolutional neural networks;Krizhevsky;Commun. ACM,2017

2. Deep learning

3. Mastering the game of Go with deep neural networks and tree search

4. Prevalence of neural collapse during the terminal phase of deep learning training

5. The optimised internal representation of multilayer classifier networks performs nonlinear discriminant analysis;Webb;Neural Netw.,1990

Cited by 28 articles.

1. Probabilistic Contrastive Learning for Long-Tailed Visual Recognition;IEEE Transactions on Pattern Analysis and Machine Intelligence;2024-09

2. Leveraging local data sampling strategies to improve federated learning;International Journal of Data Science and Analytics;2024-08-29

3. FedNLR: Federated Learning with Neuron-wise Learning Rates;Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2024-08-24

4. Neural Collapse Inspired Debiased Representation Learning for Min-max Fairness;Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2024-08-24

5. Neural Collapse Anchored Prompt Tuning for Generalizable Vision-Language Models;Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2024-08-24