1. Patrick Atkinson and Simon McIntosh-Smith. 2017. On the Performance of Parallel Tasking Runtimes for an Irregular Fast Multipole Method Application. In Scaling OpenMP for Exascale Performance and Portability, Bronis R. de Supinski, Stephen L. Olivier, Christian Terboven, Barbara M. Chapman, and Matthias S. Müller (Eds.). Springer International Publishing, Cham, 92–106.
2. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures
3. Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters
4. Alan Ayala, Stanimire Tomov, Miroslav Stoyanov, and Jack Dongarra. 2021. Scalability Issues in FFT Computation. In Parallel Computing Technologies, Victor Malyshkin (Ed.). Springer International Publishing, Cham, 279–287.
5. UPC++: A High-Performance Communication Framework for Asynchronous Computation