Random Projection‐Based Locality‐Sensitive Hashing in a Memristor Crossbar Array with Stochasticity for Sparse Self‐Attention‐Based Transformer

Author:

Wang Xinxin 1, Valov Ilia 2,3, Li Huanglong 1,4 (ORCID)

Affiliation:

1. Department of Precision Instrument, Center for Brain Inspired Computing Research, Tsinghua University, Beijing 100084, P. R. China

2. Forschungszentrum Jülich, Institute of Electrochemistry and Energy System, Wilhelm-Johnen-Straße, 52426 Jülich, Germany

3. “Acad. Evgeni Budevski” IEE-BAS, Bulgarian Academy of Sciences (BAS), Acad. G. Bonchev Str., Block 10, 1113 Sofia, Bulgaria

4. Chinese Institute for Brain Research, Beijing 102206, P. R. China

Abstract

The self-attention mechanism is central to state-of-the-art transformer models. Because standard full self-attention has quadratic complexity with respect to the input length L, its memory footprint becomes prohibitive for very long sequences; sparse self-attention enabled by random projection (RP)-based locality-sensitive hashing (LSH) has therefore been proposed to reduce the complexity to O(L log L). However, on current digital computing hardware with a von Neumann architecture, RP, which is essentially a matrix multiplication, incurs unavoidable time- and energy-consuming data shuttling between off-chip memory and processing units. In addition, digital computers cannot generate provably random numbers. Using emerging analog memristive technology, it is shown that the intrinsic device-to-device variability of a memristor crossbar array can be harnessed to implement the RP matrix and to perform RP-LSH computation in memory. On this basis, sequence prediction tasks are performed with a sparse self-attention-based Transformer in a hybrid software-hardware approach, achieving a testing accuracy of over 70% with much lower computational complexity. By further harnessing cycle-to-cycle variability for multi-round hashing, a 12% increase in testing accuracy is demonstrated. This work extends the range of applications of memristor crossbar arrays to state-of-the-art large language models (LLMs).
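
As a rough illustration only (not the authors' implementation), the RP-LSH step described in the abstract can be sketched in a few lines of NumPy: token embeddings are multiplied by a random projection matrix, and the signs of the projections are packed into bucket indices so that full attention can be restricted to within-bucket pairs. The function name rp_lsh_buckets, the parameter n_hyperplanes, and the Gaussian matrix R are illustrative assumptions; in the hardware scheme, the role of R is played by the stochastic conductances of the memristor crossbar, and the matrix multiplication is performed in memory.

import numpy as np

def rp_lsh_buckets(x, n_hyperplanes=8, rng=None):
    """Assign each row (token embedding) of x to an LSH bucket.

    x: (L, d) array of query/key vectors.
    n_hyperplanes: number of random hyperplanes, giving 2**n_hyperplanes buckets.
    """
    rng = np.random.default_rng(rng)
    d = x.shape[1]
    # Random projection matrix; in the paper's scheme this role is played by
    # the device-to-device variability of the memristor crossbar conductances.
    R = rng.standard_normal((d, n_hyperplanes))
    # RP is a single matrix multiplication, which a crossbar performs in memory.
    projections = x @ R                        # shape (L, n_hyperplanes)
    bits = (projections > 0).astype(np.int64)  # sign of each projection
    # Pack the sign bits into an integer bucket id per token.
    return bits @ (1 << np.arange(n_hyperplanes))

# Toy usage: similar vectors tend to land in the same bucket, so the L x L
# attention can be restricted to within-bucket pairs, giving roughly
# O(L log L) cost after sorting tokens by bucket.
tokens = np.random.randn(16, 32)
print(rp_lsh_buckets(tokens, n_hyperplanes=4, rng=0))

Repeating the hashing with a freshly drawn R (multi-round hashing, which the abstract maps onto cycle-to-cycle variability) reduces the chance that two similar tokens are separated by any single random projection.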

Funder

National Natural Science Foundation of China

Key Technologies Research and Development Program

CAST Innovation Foundation

Publisher

Wiley
