Flydeling: Streamlined Performance Models for Hardware Acceleration of CNNs through System Identification

Author:

Carballo-Hernández Walther1ORCID,Pelcat Maxime2ORCID,Bhattacharyya Shuvra S.3ORCID,Galán Ricardo Carmona4ORCID,Berry François1ORCID

Affiliation:

1. Université Clermont Auvergne, CNRS, Clermont Auvergne INP, Institut Pascal, France

2. IETR, UMR CNRS 6164, Institut Pascal, UMR CNRS 6602, Univ Rennes, INSA Rennes, France

3. University of Maryland, College Park, USA and IETR, UMR CNRS 6164, Rennes, France

4. Instituto de Microelectrónica de Sevilla (IMSE-CNM), CSIC-Universidad de Sevilla, Spain

Abstract

The introduction of deep learning algorithms, such as Convolutional Neural Networks (CNNs) in many near-sensor embedded systems, opens new challenges in terms of energy efficiency and hardware performance. An emerging solution to address these challenges is to use tailored heterogeneous hardware accelerators combining processing elements of different architectural natures such as Central Processing Unit (CPU), Graphics Processing Unit (GPU), Field Programmable Gate Array (FPGA), or Application Specific Integrated Circuit (ASIC). To progress towards heterogeneity, a great asset would be an automated design space exploration tool that chooses, for each accelerated partition of a CNN, the most appropriate architecture considering available resources. To feed such a design space exploration process, models are required that provide very fast yet precise evaluations of alternative architectures or alternative forms of CNNs. Quick configuration estimation could be achieved with few parameters from representative input sequences. This article studies a solution called flydeling (as a contraction of fly weight mo deling ) for obtaining these models by inspiring from the black-box System Identification (SI) domain. We refer to models derived using the proposed approach as fly weight mo dels ( flydels ). A methodology is proposed to generate these flydels , using CNN properties as predictor features together with SI techniques with a stochastic excitation input at a feature map dimensions level. For an embedded CPU-FPGA-GPU heterogeneous platform, it is demonstrated that it is possible to learn these Key Performance Indicators (KPIs) flydels at an early design stage and from high-level application features. For latency, energy, and resource utilization, flydels obtain estimation errors varying between 5% and 10% with less model parameters compared to state-of-the-art solutions and are built automatically from platform measurements.

Funder

European Union’s Horizon 2020

Marie Skłodowska-Curie

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications,Hardware and Architecture,Safety, Risk, Reliability and Quality,Media Technology,Information Systems,Software,Computer Science (miscellaneous)

Cited by 1 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3