Affiliation:
1. University of Hannover, Germany
Abstract
A scalable, distributed processor architecture is presented that emphasizes high-performance computing for digital signal processing applications by combining high-frequency design techniques with a very high degree of parallel processing on a chip. The architecture is based on a superscalar processor model with a modified Tomasulo scheme [1], which was extended to eliminate all central control structures for the data flow and to support simultaneous instruction issue from multiple independent threads (SMT). Consistent application of fine-grained clustering reduces the cycle time of wire-sensitive building blocks of the processor, such as the register file or the instruction scheduler, and leads to a distributed architecture model in which independent thread processing units, ALUs, register files, and memories are distributed across the chip and communicate with each other over special networks. The performance of the architecture scales with both the number of function units and the number of thread units without any impact on the processor's cycle time.
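As a rough illustration of the tag-based data flow that a Tomasulo-style scheme relies on, the sketch below models reservation stations that each watch result broadcasts themselves instead of consulting a central scoreboard. It is not the paper's design: the `Station` class, the `run` loop, and the tag names `t1` to `t3` are hypothetical, and clustering, SMT, and the on-chip networks are not modelled.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class Station:
    """Hypothetical reservation station: holds one pending operation."""
    tag: str                          # name under which the result is broadcast
    op: Callable[[int, int], int]     # operation to perform
    operands: List[Union[int, str]]   # int = value present, str = producer tag

    def ready(self) -> bool:
        # A station may fire once every operand is an actual value.
        return all(isinstance(x, int) for x in self.operands)

    def snoop(self, tag: str, value: int) -> None:
        # Capture a broadcast result locally; no central structure is consulted.
        self.operands = [value if x == tag else x for x in self.operands]

    def execute(self) -> int:
        return self.op(*self.operands)


def run(stations: List[Station]) -> Dict[str, int]:
    """Fire ready stations cycle by cycle and broadcast their results."""
    results: Dict[str, int] = {}
    pending = list(stations)
    while pending:
        ready = [s for s in pending if s.ready()]
        if not ready:
            raise RuntimeError("unresolved tags: no station can fire")
        for s in ready:
            value = s.execute()
            results[s.tag] = value
            pending.remove(s)
            for other in pending:     # broadcast (tag, value) to all waiters
                other.snoop(s.tag, value)
    return results


# Program order is irrelevant: t3 is listed first but waits on t1 and t2.
demo = run([
    Station("t3", operator.sub, ["t2", "t1"]),
    Station("t1", operator.add, [2, 3]),
    Station("t2", operator.mul, ["t1", 4]),
])
```

Each station resolves its own dependencies from the broadcasts it observes, which is the property that lets such a scheme be spread across a chip; a hardware implementation would of course bound the broadcast fan-out, which this sketch ignores.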
Publisher
Association for Computing Machinery (ACM)
References: 61 articles.
1. An Efficient Algorithm for Exploiting Multiple Arithmetic Units
2. 2001 technology roadmap for semiconductors
3. M. H. Lipasti and J. P. Shen, "Modern Processor Design", McGraw-Hill, 2002.
4. Complexity-effective superscalar processors
Cited by
2 articles.