DianNao family

Published:2016-10-28 Issue:11 Volume:59 Page:105-112
ISSN:0001-0782
Container-title:Communications of the ACM
language:en
Short-container-title:Commun. ACM

Author:

Chen Yunji¹,Chen Tianshi¹,Xu Zhiwei¹,Sun Ninghui¹,Temam Olivier²

Affiliation:

1. ICT, CAS, China

2. Inria Saclay, France

Abstract

Machine Learning (ML) tasks are becoming pervasive in a broad range of applications, and in a broad range of systems (from embedded systems to data centers). As computer architectures evolve toward heterogeneous multi-cores composed of a mix of cores and hardware accelerators, designing hardware accelerators for ML techniques can simultaneously achieve high efficiency and broad application scope. While efficient computational primitives are important for a hardware accelerator, inefficient memory transfers can potentially void the throughput, energy, or cost advantages of accelerators, that is, an Amdahl's law effect, and thus, they should become a first-order concern, just like in processors, rather than an element factored in accelerator design on a second step. In this article, we introduce a series of hardware accelerators (i.e., the DianNao family) designed for ML (especially neural networks), with a special emphasis on the impact of memory on accelerator design, performance, and energy. We show that, on a number of representative neural network layers, it is possible to achieve a speedup of 450.65x over a GPU, and reduce the energy by 150.31x on average for a 64-chip DaDianNao system (a member of the DianNao family).

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/2996864

References: 47 articles.

1. A Massively Parallel FPGA-Based Coprocessor for Support Vector Machines

2. A dynamically configurable coprocessor for convolutional neural networks

3. BenchNN: On the broad potential application scope of hardware neural network accelerators

4. DianNao

Cited by 132 articles.

1. Edge intelligence: From deep learning's perspective;Digital Manufacturing;2024

2. A review of in-memory computing for machine learning: architectures, options;International Journal of Web Information Systems;2023-12-22

3. Improving Utilization of Dataflow Unit for Multi-Batch Processing;ACM Transactions on Architecture and Code Optimization;2023-12-18

4. Design of The Ultra-Low-Power Driven VMM Configurations for μW Scale IoT Devices;2023 IEEE 16th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC);2023-12-18

5. A Design Flow for Scheduling Spiking Deep Convolutional Neural Networks on Heterogeneous Neuromorphic System-on-Chip;ACM Transactions on Embedded Computing Systems;2023-12-02