Affiliation:
1. Ghent University, Ghent, Belgium
2. Advanced Micro Devices, Sunnyvale, CA
3. University of Wisconsin -- Madison, Madison, WI
Abstract
A mechanistic model for out-of-order superscalar processors is developed and then applied to the study of microarchitecture resource scaling. The model divides execution time into intervals separated by disruptive miss events such as branch mispredictions and cache misses. Each type of miss event results in characterizable performance behavior for the execution time interval. By considering an interval's type and length (measured in instructions), execution time can be predicted for the interval. Overall execution time is then determined by aggregating the execution time over all intervals. The mechanistic model provides several advantages over prior modeling approaches, and, when estimating performance, it differs from detailed simulation of a 4-wide out-of-order processor by an average of 7%.
The mechanistic model is applied to the general problem of resource scaling in out-of-order superscalar processors. First, we use the model to determine size relationships among microarchitecture structures in a balanced processor design. Second, we use the mechanistic model to study scaling of both pipeline depth and width in balanced processor designs. We corroborate previous results in this area and provide new results. For example, we show that at optimal design points, the pipeline depth times the square root of the processor width is nearly constant. Finally, we consider the behavior of unbalanced, overprovisioned processor designs based on insight gained from the mechanistic model. We show that in certain situations an overprovisioned processor may lead to improved overall performance. Designs where a processor's dispatch width is wider than its issue width are of particular interest.
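The interval-based aggregation described in the abstract can be illustrated with a minimal sketch. It is not the paper's actual model: the abstract does not give the per-interval formulas, so the sketch assumes that each interval costs its length divided by the dispatch width plus a fixed penalty for the miss event that ends it. The names `Interval`, `PENALTY_CYCLES`, `interval_cycles`, and `total_cycles`, and all penalty values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import ceil

# Hypothetical penalties in cycles per miss-event type (illustrative values,
# not taken from the paper).
PENALTY_CYCLES = {
    "branch_mispredict": 15,
    "icache_miss": 10,
    "long_dcache_miss": 200,
    "none": 0,  # final interval that ends without a miss event
}


@dataclass
class Interval:
    instructions: int  # interval length, measured in instructions
    miss_event: str    # type of the miss event that terminates the interval


def interval_cycles(interval: Interval, dispatch_width: int) -> int:
    """Estimate cycles for one interval from its type and length.

    Assumption: steady-state dispatch at full width, then an additive
    penalty that depends only on the miss-event type.
    """
    base = ceil(interval.instructions / dispatch_width)
    return base + PENALTY_CYCLES[interval.miss_event]


def total_cycles(intervals: list[Interval], dispatch_width: int = 4) -> int:
    """Aggregate the per-interval estimates over the whole execution."""
    return sum(interval_cycles(iv, dispatch_width) for iv in intervals)


trace = [
    Interval(40, "branch_mispredict"),
    Interval(100, "long_dcache_miss"),
    Interval(8, "none"),
]
print(total_cycles(trace, dispatch_width=4))
```

A fixed additive penalty keeps the sketch short; a real model of this kind would presumably make the penalty depend on interval length and on microarchitecture parameters such as pipeline depth and window size.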
Publisher
Association for Computing Machinery (ACM)
Cited by
98 articles.