1. TensorFlow: large-scale machine learning on heterogeneous distributed systems;Abadi,2016
2. Caffe: convolutional architecture for fast feature embedding;Jia,2014
3. Enabling POCL-based runtime frameworks on the HSA for OpenCL 2.0 support;Chang;J. Syst. Archit.,2017
4. ViennaCL++: enable TensorFlow/Eigen via ViennaCL with OpenCL C++ flow;Chen,2018
5. OpenCL vector swizzling optimization under LLVM global value numbering;Her,2018