1. NVIDIA. CUDA Programming Guide 2017.
2. Yuxi Liu, Zhibin Yu, Lieven Eeckhout, Vijay Janapa Reddi, Yingwei Luo, Xiaolin Wang, Zhenlin Wang, and Chengzhong Xu. Barrier-Aware Warp Scheduling for Throughput Processors. In ICS-16. ACM. 10.1145/2925426.2926267
3. Gregory Diamos, Benjamin Ashbaugh, Subramaniam Maiyuran, Andrew Kerr, Haicheng Wu, and Sudhakar Yalamanchili. SIMD re-convergence at thread frontiers. In MICRO-11. ACM. 10.1145/2155620.2155676