Affiliation:
1. Research Scholar, Department of Electronics & Telecommunication, Matoshri College of Engineering & Research Centre, Nashik, Savitribai Phule Pune University, Pune, India
Abstract
Imaging sensors with higher resolution and higher frame rates are becoming more popular for wide-area video surveillance (VS) and other applications as technology advances. We propose a deep-learning method for multiple-object detection and segmentation in high-resolution video based on Mask R-CNN. The proposed Mask R-CNN FPN model uses ResNet-50 or ResNet-101 as its backbone. The deep residual network's design overcomes the loss of learning efficiency that comes with deepening the network. To minimize the overall error, the deep residual network divides the training series into training blocks and minimizes the error of each block. The backbone is roughly divided into five convolutional stages, and the output scale is halved at each stage. We trained the model with mixed precision (FP16 and FP32), which substantially reduced both training time and inference time for object detection. The COCO 2014 data set is used to train and validate the proposed model with mixed precision, leading to faster performance. Experimental results show that the proposed model runs at 30–48 frames per second with 85% accuracy.
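The halving of the output scale across the five convolutional stages can be illustrated with a little arithmetic. The sketch below is not the authors' code: the function name `stage_output_sizes` and the 1024-pixel input side are assumptions chosen for illustration, and it assumes the input side is divisible by 32 so that each halving is exact.

```python
from typing import List


def stage_output_sizes(input_size: int, num_stages: int = 5) -> List[int]:
    """Return the feature-map side length after each stage.

    Assumes every stage halves the spatial resolution (hypothetical helper,
    not part of any library).
    """
    sizes = []
    size = input_size
    for _ in range(num_stages):
        size //= 2  # each stage cuts the output scale in half
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    side = 1024  # assumed input side length, for illustration only
    for stage, size in enumerate(stage_output_sizes(side), start=1):
        print(f"stage {stage}: {size} x {size} (stride {side // size})")
```

Under these assumptions the final stage has a cumulative stride of 32, which is why high-resolution inputs still yield compact top-level feature maps.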
Funder
National Natural Science Foundation of China
Publisher
World Scientific Pub Co Pte Ltd
Subject
Artificial Intelligence, Computer Vision and Pattern Recognition, Software
Cited by
12 articles.