The effectiveness of multiple hardware contexts

Published: 1994-12 Issue: 5 Volume: 28 Page: 328-337
ISSN: 0163-5980
Container-title: ACM SIGOPS Operating Systems Review
Language: en
Short-container-title: SIGOPS Oper. Syst. Rev.

Author:

Thekkath Radhika¹, Eggers Susan J.¹

Affiliation:

1. Department of Computer Science and Engineering, FR-35, University of Washington, Seattle, WA

Abstract

Multithreaded processors are used to tolerate long memory latencies. By executing threads loaded in multiple hardware contexts, an otherwise idle processor can keep busy, thus increasing its utilization. However, the larger size of a multi-thread working set can have a negative effect on cache conflict misses. In this paper we evaluate the two phenomena together, examining their combined effect on execution time. The usefulness of multiple hardware contexts depends on: program data locality, cache organization and degree of multiprocessing. Multiple hardware contexts are most effective on programs that have been optimized for data locality. For these programs, execution time dropped with increasing contexts, over widely varying architectures. With unoptimized applications, multiple contexts had limited value. The best performance was seen with only two contexts, and only on uniprocessors and small multiprocessors. The behavior of the unoptimized applications changed more noticeably with variations in cache associativity and cache hierarchy, unlike the optimized programs. As a mechanism for exploiting program parallelism, an additional processor is clearly better than another context. However, there were many configurations for which the addition of a few hardware contexts brought as much or greater performance than a larger multiprocessor with fewer than the optimal number of contexts.

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/381792.195583

References: 26 articles.

1. A. Agarwal. Limits on interconnection network performance. IEEE Transactions on Parallel and Distributed Systems, 2(4):398-412, October 1991. 10.1109/71.97897

2. Performance tradeoffs in multithreaded processors

3. APRIL

4. The Tera computer system

5. PRESTO: A system for object-oriented parallel programming

Cited by 3 articles.

1. Increasing resource utilization in mixed-criticality systems using a polymorphic VLIW processor;Journal of Systems Architecture;2018-03

2. A Capacity-Aware Thread Scheduling Method Combined with Cache Partitioning to Reduce Inter-Thread Cache Conflicts;IEICE Transactions on Information and Systems;2013

3. Multi-criteria Checkpointing Strategies: Response-Time versus Resource Utilization;Euro-Par 2013 Parallel Processing;2013