Deep Learning Distributed Architecture Design Implementation for Computer Vision

Author:

Zhang Yizhong (1)

Affiliation:

1. Software Engineering Institute, East China Normal University, Shanghai 200062, China

Abstract

In the era of big data, an efficient deep learning and computer vision system must handle deep learning, computer vision, and large-scale data processing tasks simultaneously. Existing training datasets are heavily reused and cover limited scene information, so they cannot meet the needs of large-scale training; large-scale data and a distributed computing system are therefore needed to complete the training. A major challenge for distributed deep learning systems is to meet a model's training accuracy requirement while minimizing resource cost within a constrained time. Resource allocation and batch size hyperparameter configuration are the main levers for optimizing a model's training accuracy and resource cost. Existing work configures the two independently: resources with respect to computational efficiency, and batch size with respect to training accuracy. However, the two configurations affect training accuracy and resource cost through complex interdependencies, so independent configuration methods struggle to satisfy the accuracy requirement and minimize resource cost at the same time. To address these problems, this paper proposes a method that jointly optimizes the resource and batch size configuration of distributed deep learning systems.
The method first exploits the monotonic relationships that link resource allocation and batch size hyperparameter configuration to model training time and training accuracy: using order-preserving (isotonic) regression, we build prediction models for the single-round complete training time and for the final training accuracy in computer vision target classification and recognition. We then use these models together to solve for the resource and batch size allocation that meets the model's training accuracy requirement at minimum resource cost. Finally, we evaluate the performance of the proposed method on computer vision target recognition using the proposed distributed deep learning system.
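
The two-stage idea in the abstract (monotone predictors first, then a joint search over resources and batch size) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the measurements are made-up toy numbers, training time is assumed non-increasing in the number of workers, final accuracy is assumed non-increasing in batch size, cost is assumed to be workers x hours x a unit price, and the function names are hypothetical. The isotonic fit uses the standard pool-adjacent-violators algorithm, and the search is a plain exhaustive scan rather than the paper's solver.

```python
def pava_non_decreasing(ys):
    """Least-squares non-decreasing fit via pool-adjacent-violators."""
    blocks = []  # each block is [sum of values, number of values]
    for y in ys:
        blocks.append([y, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def pava_non_increasing(ys):
    """Non-increasing fit, obtained by negating, fitting, and negating back."""
    return [-v for v in pava_non_decreasing([-y for y in ys])]


# Hypothetical toy measurements (NOT from the paper).
WORKERS = [1, 2, 4, 8]
BATCH_SIZES = [64, 128, 256]
# Observed hours for one complete training run, per batch size, over WORKERS.
TIME_OBS = {
    64: [16.0, 9.0, 10.0, 6.0],
    128: [12.0, 7.0, 4.0, 3.5],
    256: [10.0, 5.5, 6.0, 2.0],
}
# Observed final accuracy over BATCH_SIZES.
ACC_OBS = [0.92, 0.93, 0.88]


def cheapest_config(target_acc, deadline_h, price_per_worker_hour=1.0):
    """Return (cost, workers, batch_size) of the cheapest feasible setting,
    or None if no setting meets both the accuracy target and the deadline."""
    acc_fit = dict(zip(BATCH_SIZES, pava_non_increasing(ACC_OBS)))
    best = None
    for b in BATCH_SIZES:
        if acc_fit[b] < target_acc:
            continue  # predicted accuracy too low for this batch size
        time_fit = pava_non_increasing(TIME_OBS[b])
        for w, t in zip(WORKERS, time_fit):
            if t > deadline_h:
                continue  # predicted training time misses the deadline
            cost = w * t * price_per_worker_hour
            if best is None or cost < best[0]:
                best = (cost, w, b)
    return best


if __name__ == "__main__":
    print(cheapest_config(target_acc=0.90, deadline_h=8.0))
```

In this toy setting the accuracy predictor rules out the largest batch size, and the time predictor then decides how few workers still meet the deadline, which is the kind of coupling between the two configurations that the abstract argues independent tuning misses.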

Publisher

Hindawi Limited

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Information Systems
