Language Model Using Neural Turing Machine Based on Localized Content-Based Addressing
Published: 2020-10-15
Issue: 20
Volume: 10
Page: 7181
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences
Author:
Lee Donghyun,
Park Jeong-Sik,
Koo Myoung-Wan,
Kim Ji-Hwan
Abstract
The performance of long short-term memory (LSTM) recurrent neural network (RNN)-based language models has improved on language modeling benchmarks. Although recurrent layers are widely used, previous studies showed that an LSTM RNN-based language model (LM) cannot overcome the limitation of context length. To train LMs on longer sequences, attention mechanism-based models have recently been used. In this paper, we propose an LM using a neural Turing machine (NTM) architecture based on localized content-based addressing (LCA). The NTM architecture is an attention-based model. However, the NTM encounters a problem with content-based addressing because every memory address must be accessed to calculate cosine similarities. To address this problem, we propose the LCA method. The LCA method first searches for the maximum of the cosine similarities computed over all memory addresses. Next, a specific memory area containing the selected memory address is normalized with the softmax function. The LCA method is applied to a pre-trained NTM-based LM during the test stage. The proposed architecture is evaluated on the Penn Treebank and enwik8 LM tasks. The experimental results indicate that the proposed approach outperforms the previous NTM architecture.
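As a rough illustration of the LCA read step described in the abstract, the following Python sketch computes cosine similarities over all memory addresses, locates the best-matching address, and applies the softmax only to a local window around it. The function name, the symmetric window parameter, and the key-strength factor beta are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def localized_content_addressing(key, memory, window=8, beta=1.0):
    """Sketch of localized content-based addressing (LCA).

    key:    (D,) query vector emitted by the controller (hypothetical name).
    memory: (N, D) matrix of N memory addresses.
    window: assumed size of the local area that receives non-zero weight;
            the paper defines the exact memory area.
    beta:   key strength, as in standard NTM content-based addressing.
    """
    # Cosine similarity between the key and every memory address.
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    sims = memory @ key / norms                      # shape (N,)

    # LCA step 1: find the address with the maximum cosine similarity.
    center = int(np.argmax(sims))

    # LCA step 2: normalize with softmax only over a local area around
    # that address, instead of over all N addresses.
    lo = max(0, center - window // 2)
    hi = min(len(sims), center + window // 2 + 1)
    local = np.exp(beta * sims[lo:hi] - np.max(beta * sims[lo:hi]))
    weights = np.zeros_like(sims)
    weights[lo:hi] = local / local.sum()
    return weights
```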
Funder
National Research Foundation of Korea
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by 1 article.