1. Adinets, A., Merrill, D.: Onesweep: a faster least significant digit radix sort for GPUs. arXiv preprint arXiv:2206.01784 (2022). https://doi.org/10.48550/arXiv.2206.01784
2. Blelloch, G.E.: Prefix sums and their applications. Tech. Rep. CMU-CS-90-190, School of Computer Science, Carnegie Mellon University (1990)
3. Burnus, T.: Offloading support in GCC (2023). https://gcc.gnu.org/wiki/Offloading. Accessed 17 May 2023
4. Center for Science: LUMI-G documentation, GPU nodes (2023). https://docs.lumi-supercomputer.eu/hardware/lumig/. Accessed 15 May 2023
5. Chapman, B.: Lecture Notes in Computer Science (2021)