DDNAS: Discretized Differentiable Neural Architecture Search for Text Classification

Published:2023-10-03 Issue:5 Volume:14 Page:1-22
ISSN:2157-6904
Container-title:ACM Transactions on Intelligent Systems and Technology
language:en
Short-container-title:ACM Trans. Intell. Syst. Technol.

Author:

Chen Kuan-Chun¹, Li Cheng-Te¹, Lee Kuo-Jung¹

Affiliation:

1. National Cheng Kung University, Taiwan

Abstract

Neural Architecture Search (NAS) has shown promising capability in learning text representation. However, existing text-based NAS neither performs a learnable fusion of neural operations to optimize the architecture nor encodes the latent hierarchical categorization behind text input. This article presents a novel NAS method, Discretized Differentiable Neural Architecture Search (DDNAS), for text representation learning and classification. With the continuous relaxation of architecture representation, DDNAS can use gradient descent to optimize the search. We also propose a novel discretization layer via mutual information maximization, which is imposed on every search node to model the latent hierarchical categorization in text representation. Extensive experiments conducted on eight diverse real datasets exhibit that DDNAS can consistently outperform the state-of-the-art NAS methods. While DDNAS relies on only three basic operations, i.e., convolution, pooling, and none, to be the candidates of NAS building blocks, its promising performance is noticeable and extensible to obtain further improvement by adding more different operations.
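The continuous relaxation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a DARTS-style formulation in which each search node outputs a softmax-weighted sum of its candidate operations; the abstract does not spell out DDNAS's exact formulation, so this is an illustration only. The function names, the fixed convolution kernel, and the pooling width are all hypothetical, and the mutual-information discretization layer is not modeled here.

```python
import math

def softmax(logits):
    """Turn architecture logits into mixing weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def conv1d(x, kernel=(0.25, 0.5, 0.25)):
    """Same-padded 1D convolution with a fixed kernel (hypothetical values)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(x))]

def max_pool1d(x, width=3):
    """Max pooling with stride 1; the window is clipped at the borders."""
    pad = width // 2
    n = len(x)
    return [max(x[max(0, i - pad):min(n, i + pad + 1)]) for i in range(n)]

def none_op(x):
    """The 'none' candidate: contributes nothing to the node output."""
    return [0.0] * len(x)

# The three candidate operation types named in the abstract.
CANDIDATE_OPS = [conv1d, max_pool1d, none_op]

def mixed_op(x, alpha):
    """Continuous relaxation: a softmax-weighted sum of all candidates.

    Because the output is a smooth function of alpha, alpha can be
    optimized by gradient descent together with the network weights.
    """
    weights = softmax(alpha)
    outputs = [op(x) for op in CANDIDATE_OPS]
    return [sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(len(x))]

def select_op(alpha):
    """After the search, keep the candidate with the largest logit."""
    best = max(range(len(alpha)), key=lambda i: alpha[i])
    return CANDIDATE_OPS[best].__name__
```

With equal logits, `mixed_op` averages the three candidates. As one logit grows, the output approaches that candidate alone, which is why picking the largest logit afterward is a natural way to read off a final architecture.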

Funder

National Science and Technology Council (NSTC) of Taiwan

Center for Data Science

NCKU Miin Wu School of Computing

Institute of Information Science (IIS), Academia Sinica

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Theoretical Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3610299

Reference: 55 articles.

1. Devansh Arpit, Huan Wang, Caiming Xiong, Richard Socher, and Yoshua Bengio. 2020. Neural Bayes: A generic parameterization method for unsupervised representation learning. arXiv:2002.09046 [stat.ML].

2. Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In International Conference on Learning Representations (ICLR).

3. Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeshwar, Sherjil Ozair, Yoshua Bengio, Aaron Courville, and Devon Hjelm. 2018. Mutual information neural estimation. In Proceedings of the 35th International Conference on Machine Learning. 531–540.

4. Y-Lan Boureau, Jean Ponce, and Yann LeCun. 2010. A theoretical analysis of feature pooling in visual recognition. In Proceedings of the 27th International Conference on Machine Learning (ICML’10). 111–118.

5. Andrew Brock, Theo Lim, J. M. Ritchie, and Nick Weston. 2018. SMASH: One-shot model architecture search through hypernetworks. In International Conference on Learning Representations (ICLR).

Cited by 1 article.

1. Hierarchical Text Classification and Its Foundations: A Review of Current Research. Electronics. 2024-03-25.