1. Using shared memory in CUDA C/C++, April 2017. https://devblogs.nvidia.com/parallelforall/using-shared-memory-cuda-cc/
2. Edwards, H.C., Trott, C., Sunderland, D.: Kokkos, a manycore device performance portability library for C++ HPC applications, March 2014. http://on-demand.gputechconf.com/gtc/2014/presentations/S4213-kokkos-manycore-device-perf-portability-library-hpc-apps.pdf
3. Grinberg, L., Bertolli, C., Haque, R.: Hands on with openmp4.5 and unified memory: developing applications for IBM’S hybrid CPU + GPU systems (part I). Submitted for IWOMP 2017
4. CUDA C/C++ programming guide - shared memory section, April 2017. http://docs.nvidia.com/cuda/cuda-c-programming-guide/#shared-memory
5. OpenMP Language Committee: OpenMP Application Program Interface, version 4.5 edn., July 2013. http://www.openmp.org/mp-documents/openmp-4.5.pdf