Author:
Zhou Honghui, Qin Ruyi, Liu Zihan, Qian Ying, Ju Xiaoming
Abstract
Applying machine learning algorithms to the power grid improves the service level of power enterprises and promotes the development of the grid. NVIDIA Volta and Turing GPUs equipped with Tensor Cores can accelerate the training of these algorithms. With Tensor Cores enabled, mixed-precision (FP16 and FP32) matrix multiplication substantially increases throughput and reduces AI training times. To explore the cause of this speedup, we take a convolutional neural network (CNN), a model widely used in computer vision, as an example and benchmark general matrix multiplication and convolution to characterize Tensor Core performance. We build a CNN on cuDNN and TensorFlow, analyze its performance from several aspects, and optimize it by, among other changes, altering the shape of the convolution kernel and using texture memory. The experimental results demonstrate the effectiveness of our methods.
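To make the mixed-precision idea in the abstract concrete, the following is a minimal pure-Python sketch that emulates a matrix multiplication with FP16 inputs and FP32 accumulation, and compares it with accumulating in FP16 throughout. It is an illustration only: it does not run on a GPU, it is not the authors' benchmark, and the names `to_fp16`, `to_fp32`, `matmul_mixed`, and `matmul_fp16` are hypothetical helpers introduced here.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def matmul_mixed(a, b):
    """Emulate mixed precision: FP16 inputs, FP32 accumulator (assumed scheme)."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                # Inputs are rounded to FP16; their product is exact in FP32.
                prod = to_fp16(a[i][p]) * to_fp16(b[p][j])
                # The running sum is kept in FP32.
                acc = to_fp32(acc + prod)
            out[i][j] = acc
    return out


def matmul_fp16(a, b):
    """Same product, but with every intermediate value rounded to FP16."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                prod = to_fp16(to_fp16(a[i][p]) * to_fp16(b[p][j]))
                acc = to_fp16(acc + prod)
            out[i][j] = acc
    return out


if __name__ == "__main__":
    k = 4096
    row = [[1.0] * k]                 # 1 x k matrix of ones
    col = [[1.0] for _ in range(k)]   # k x 1 matrix of ones
    print("FP32 accumulation:", matmul_mixed(row, col)[0][0])
    print("FP16 accumulation:", matmul_fp16(row, col)[0][0])
```

In this toy case the exact dot product is 4096. The FP32 accumulator reaches it, while the FP16 accumulator stalls at 2048, because FP16 values at that magnitude are spaced 2 apart and adding 1 rounds back to 2048. This is one reason mixed-precision schemes keep the accumulator in FP32 while storing the inputs in FP16.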
Publisher
Springer Nature Singapore