Dissecting Cyclops

Author:

Almási George (1), Caşcaval Cǎlin (1), Castaños José G. (1), Denneau Monty (1), Lieber Derek (1), Moreira José E. (1), Warren Henry S. (1)

Affiliation:

1. IBM Thomas J. Watson Research Center, Yorktown Heights, NY

Abstract

Multiprocessor systems-on-a-chip offer a structured approach to managing complexity in chip design. Cyclops is a new family of multithreaded architectures which integrates processing logic, main memory and communications hardware on a single chip. Its simple, hierarchical design allows the hardware architect to manage a large number of components to meet the design constraints in terms of performance, power or application domain. This paper evaluates several alternative Cyclops designs with different relative costs and trade-offs. We compare the performance of several scientific kernels running on different configurations of this architecture. We show that by increasing the number of threads sharing a floating point unit we can hide fairly high cache and memory latencies. We prove that we can reach the theoretical peak performance of the chip and we identify the optimal balance of components for each application. We demonstrate that the design is well adapted to solve problems that are difficult to optimize. For example, we show that sparse matrix vector multiplication obtains 16 GFlops out of 32 GFlops of peak performance.

Publisher

Association for Computing Machinery (ACM)

Cited by 21 articles.

1. A Survey on the Proposed Architectures for Efficient Execution of Irregular Applications Using Pipeline Parallelism; 2023 Congress in Computer Science, Computer Engineering, & Applied Computing (CSCE); 2023-07-24

2. Development of Collision Avoidance System in Slippery Road Conditions; IEEE Transactions on Intelligent Transportation Systems; 2022-10

3. Emerging Memory Structures for VLSI Circuits; Wiley Encyclopedia of Electrical and Electronics Engineering; 2022-05-12

4. GIRAF: General Purpose In-Storage Resistive Associative Framework; IEEE Transactions on Parallel and Distributed Systems; 2022-02-01

5. Toward a Microarchitecture for Efficient Execution of Irregular Applications; ACM Transactions on Parallel Computing; 2020-12
