Abstract
Deep learning methods have achieved state-of-the-art performance in many domains of artificial intelligence, but are typically hard to interpret. Network interpretation is important for multiple reasons, including knowledge discovery, hypothesis generation, fairness, and establishing trust. Model transformations provide a general approach to interpreting a trained network post hoc: the network is approximated by a (typically compressed) model whose structure can be more easily interpreted in some way (we call such approaches interpretability schemes). However, the relationship between compression and interpretation has not been fully explored: How much should a network be compressed for optimal extraction of interpretable information? Should compression be combined with other criteria when selecting model transformations? We investigate these issues using two different compression-based schemes, which aim to extract orthogonal kinds of information, pertaining to feature-based and data-instance-based groupings respectively. The first (rank projection trees) uses a structured sparsification method such that nested groups of features with potential joint interactions can be extracted. The second (cascaded network decomposition) splits a network into a cascade of simpler networks, allowing groups of training instances with similar characteristics to be extracted at each stage of the cascade. We use predictive tasks in cancer and psychiatric genomics to assess the ability of these approaches to extract informative feature and data-point groupings from trained networks. We show that the generalization error of a network provides an indicator of the quality of the information extracted; further, we derive PAC-Bayes generalization bounds for both schemes, which we show can be used as proxy indicators, and can thus provide a criterion for selecting the optimal compression. Finally, we show that the PAC-Bayes framework can be naturally modified to incorporate additional criteria alongside compression, such as prior knowledge based on previous models, which can enhance interpretable model selection.
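To make the idea of using a compression-based generalization bound as a selection criterion concrete, the following is a minimal sketch, not the paper's exact procedure: it scores each candidate compression level of a trained network with an Occam/PAC-Bayes-style proxy (empirical error plus a complexity penalty that shrinks as more parameters are removed) and picks the level that minimizes the proxy. The bound's form, the bit-budget per parameter, and all names here are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's exact bound): trade empirical
# error against a complexity term based on the description length of the
# compressed model, and select the compression level minimizing the proxy.

import numpy as np

def pac_bayes_proxy(train_error, n_retained_params, n_samples,
                    bits_per_param=8, delta=0.05):
    """Occam/PAC-Bayes-style bound proxy: empirical error plus a penalty
    growing with the description length (in bits) of the compressed model."""
    complexity_bits = n_retained_params * bits_per_param
    penalty = np.sqrt((complexity_bits * np.log(2) + np.log(1.0 / delta))
                      / (2.0 * n_samples))
    return train_error + penalty

def select_compression(candidates, n_samples):
    """candidates: list of (compression_level, train_error, n_retained_params).
    Returns the compression level with the smallest bound proxy."""
    scored = [(pac_bayes_proxy(err, k, n_samples), level)
              for level, err, k in candidates]
    return min(scored)[1]

# Hypothetical usage: three pruning levels of the same trained network.
candidates = [(0.0, 0.02, 1_000_000),   # uncompressed
              (0.9, 0.04, 100_000),     # 90% of weights pruned
              (0.99, 0.15, 10_000)]     # 99% pruned
print(select_compression(candidates, n_samples=50_000))
```

Additional criteria (e.g., agreement with a previous model used as a prior) could, under the same framework, be folded into the penalty term rather than treated as a separate objective.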
Publisher
Cold Spring Harbor Laboratory