1. Andy Adinets. 2014. CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics. NVIDIA. https://developer.nvidia.com/blog/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/
2. Tolu Alabi et al. 2012. Fast k-Selection Algorithms for Graphics Processing Units. ACM J. Exp. Algorithmics 17 (2012).
3. Billion-Scale Similarity Search with GPUs
4. Wouter Kool, Herke van Hoof, and Max Welling. 2019. Stochastic Beams and Where To Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement. In Proceedings of the 36th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 97), Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.). PMLR, Long Beach, California, USA, 3499–3508.
5. Efficient Top-K Query Processing on Massively Parallel Hardware