Affiliation:
1. National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
Abstract
Current deep-learning models are mostly built upon neural networks, i.e. multiple layers of parameterized differentiable non-linear modules that can be trained by backpropagation. In this paper, we explore the possibility of building deep models based on non-differentiable modules such as decision trees. After a discussion about the mystery behind deep neural networks, particularly by contrasting them with shallow neural networks and traditional machine-learning techniques such as decision trees and boosting machines, we conjecture that the success of deep neural networks owes much to three characteristics, i.e. layer-by-layer processing, in-model feature transformation and sufficient model complexity. On the one hand, our conjecture may offer inspiration for theoretical understanding of deep learning; on the other hand, to verify the conjecture, we propose an approach that generates a deep forest with these characteristics. This is a decision-tree ensemble approach, with fewer hyper-parameters than deep neural networks, and its model complexity can be automatically determined in a data-dependent way. Experiments show that its performance is quite robust to hyper-parameter settings, such that in most cases, even across different data from different domains, it is able to achieve excellent performance by using the same default setting. This study opens the door to deep learning based on non-differentiable modules without gradient-based adjustment, and exhibits the possibility of constructing deep models without backpropagation.
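The three characteristics named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes binary 0/1 labels, uses bagged one-split trees (stumps) as a stand-in for full forests, and decides the depth with a held-out validation set, a procedure the abstract does not specify. All function and parameter names (`fit_cascade`, `forests_per_layer`, and so on) are hypothetical.

```python
import random

def fit_stump(X, y):
    """Fit a one-split tree: returns (feature, threshold, p_left, p_right)."""
    mean = sum(y) / len(y)
    best = (float("inf"), 0, 0.0, mean, mean)  # fallback: constant prediction
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            p_l, p_r = sum(left) / len(left), sum(right) / len(right)
            err = (sum((lab - p_l) ** 2 for lab in left)
                   + sum((lab - p_r) ** 2 for lab in right))
            if err < best[0]:
                best = (err, f, t, p_l, p_r)
    return best[1:]

def fit_forest(X, y, n_trees, rng):
    """Bagging: each stump is fitted on a bootstrap sample of the data."""
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest

def forest_proba(forest, row):
    """Estimated probability of class 1, averaged over the stumps."""
    return sum(pl if row[f] <= t else pr for f, t, pl, pr in forest) / len(forest)

def layer_transform(layer, X_orig, X_in):
    """In-model feature transformation: original features + forest outputs."""
    return [list(orig) + [forest_proba(fo, row) for fo in layer]
            for orig, row in zip(X_orig, X_in)]

def layer_predict(layer, row):
    avg = sum(forest_proba(fo, row) for fo in layer) / len(layer)
    return 1 if avg >= 0.5 else 0

def fit_cascade(X_tr, y_tr, X_val, y_val,
                forests_per_layer=2, n_trees=10, max_layers=5, seed=0):
    """Grow layers one by one; stop when validation accuracy stops improving,
    so the depth (model complexity) is chosen from the data."""
    rng = random.Random(seed)
    layers, best_acc = [], -1.0
    cur_tr, cur_val = X_tr, X_val
    while len(layers) < max_layers:
        layer = [fit_forest(cur_tr, y_tr, n_trees, rng)
                 for _ in range(forests_per_layer)]
        hits = sum(layer_predict(layer, r) == lab for r, lab in zip(cur_val, y_val))
        acc = hits / len(y_val)
        if acc <= best_acc:
            break
        layers.append(layer)
        best_acc = acc
        cur_tr = layer_transform(layer, X_tr, cur_tr)
        cur_val = layer_transform(layer, X_val, cur_val)
    return layers

def predict(layers, X):
    """Pass inputs through the cascade layer by layer; the last layer votes."""
    cur = X
    for layer in layers[:-1]:
        cur = layer_transform(layer, X, cur)
    return [layer_predict(layers[-1], row) for row in cur]
```

In this sketch no gradient is computed anywhere: each layer is fitted greedily on the output of the previous one, and the only knobs are the number of forests per layer, the number of trees per forest and an upper bound on depth.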
Funder
National Natural Science Foundation of China
Collaborative Innovation Center of Novel Software Technology and Industrialization
Publisher
Oxford University Press (OUP)
Cited by
386 articles.