Authors:
Balaprakash Prasanna, Wild Stefan M., Hovland Paul D.
References: 22 articles.
1. Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms;Williams;Journal of Parallel and Distributed Computing,2009
2. Optimization and performance modeling of stencil computations on modern microprocessors;Datta;SIAM Review,2009
3. Optimization of sparse matrix-vector multiplication on emerging multicore platforms;Williams;Parallel Computing,2009
4. Autotuning and specialization: Speeding up matrix multiply for small matrices with compiler technology, in:;Shin,2009
5. Automatically tuned linear algebra software, in:;Whaley,1998
Cited by
22 articles.
1. Performance Tuning for GPU-Embedded Systems: Machine-Learning-Based and Analytical Model-Driven Tuning Methodologies;2023 IEEE 35th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD);2023-10-17
2. A taxonomy of constraints in black-box simulation-based optimization;Optimization and Engineering;2023-09-09
3. ML-based Performance Portability for Time-Dependent Density Functional Theory in HPC Environments;2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS);2022-11
4. AlphaSparse: Generating High Performance SpMV Codes Directly from Sparse Matrices;SC22: International Conference for High Performance Computing, Networking, Storage and Analysis;2022-11
5. Using hardware performance counters to speed up autotuning convergence on GPUs;Journal of Parallel and Distributed Computing;2022-02