Learning Reliable Neural Networks with Distributed Architecture Representations

Published:2023-03-25 Issue:4 Volume:22 Page:1-20
ISSN:2375-4699
Container-title:ACM Transactions on Asian and Low-Resource Language Information Processing
language:en
Short-container-title:ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Author:

Li Yinqiao¹, Cao Runzhe¹, He Qiaozhi², Xiao Tong³, Zhu Jingbo³

Affiliation:

1. School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China

2. Tencent, China

3. School of Computer Science and Engineering, Northeastern University and also NiuTrans Research, Shenyang, Liaoning, China

Abstract

Neural architecture search (NAS) has shown strong performance in learning neural models automatically in recent years. However, most NAS systems are unreliable because of the architecture gap introduced by discrete representations of atomic architectures. In this article, we improve the performance and robustness of NAS by narrowing the gap between architecture representations. More specifically, we apply a general contraction mapping to model neural networks with distributed representations, an approach we call Neural Architecture Search with Distributed Architecture Representations (ArchDAR). Moreover, for a better search result, we present a joint learning approach that integrates distributed representations with advanced architecture search methods. We implement ArchDAR in a differentiable architecture search model and test the learned architectures on the language modeling task. On the Penn Treebank data, it significantly outperforms a strong baseline, by 1.8 perplexity points. Also, the search process with distributed representations is more stable, which yields faster structural convergence when it works with the differentiable architecture search model.

Funder

National Science Foundation of China

National Key R&D Project of China

China HTRD Center

Yunnan Provincial Major Science and Technology Special Plan Projects

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3578709

Reference: 46 articles.

1. Stability Issues in RNN Architectures

2. An evolutionary algorithm that constructs recurrent neural networks

3. Training Deeper Neural Machine Translation Models with Transparent Attention

4. Gabriel Bender, Pieter-Jan Kindermans, Barret Zoph, Vijay Vasudevan, and Quoc V. Le. 2018. Understanding and simplifying one-shot architecture search. In Proceedings of the 35th International Conference on Machine Learning (ICML’18), Proceedings of Machine Learning Research, Jennifer G. Dy and Andreas Krause (Eds.), Vol. 80. PMLR, 549–558.

5. Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language models are few-shot learners. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems (NeurIPS’20), Hugo Larochelle, Marc’Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin (Eds.).

Cited by 3 articles.

1. Technology of NiuTrans Open Source Statistical Machine Translation System;2024 International Conference on Integrated Circuits and Communication Systems (ICICACS);2024-02-23

2. Multimodal Social Data Analytics on the Design and Implementation of an EEG-Mechatronic System Interface;Journal of Data and Information Quality;2023-09-28

3. Modeling Extractive Question Answering Using Encoder-Decoder Models with Constrained Decoding and Evaluation-Based Reinforcement Learning;Mathematics;2023-03-27