Algorithm-hardware Co-optimization for Energy-efficient Drone Detection on Resource-constrained FPGA

Published:2023-05-10 Issue:2 Volume:16 Page:1-25
ISSN:1936-7406
Container-title:ACM Transactions on Reconfigurable Technology and Systems
language:en
Short-container-title:ACM Trans. Reconfigurable Technol. Syst.

Author:

Suh Han-Sok¹, Meng Jian¹, Nguyen Ty², Kumar Vijay², Cao Yu¹, Seo Jae-Sun¹

Affiliation:

1. Arizona State University, Tempe, AZ, USA

2. University of Pennsylvania, Philadelphia, PA, USA

Abstract

Convolutional neural network (CNN)-based object detection has achieved very high accuracy; e.g., single-shot multi-box detectors (SSDs) can efficiently detect and localize various objects in an input image. However, they require a large amount of computation and memory storage, which makes it difficult to perform efficient inference on resource-constrained hardware devices such as drones or unmanned aerial vehicles (UAVs). Drone/UAV detection is an important task for applications including surveillance, defense, and multi-drone self-localization and formation control. In this article, we design and co-optimize an algorithm and hardware for energy-efficient drone detection on resource-constrained FPGA devices. We train an SSD object detection algorithm with a custom drone dataset. For inference, we employ low-precision quantization and adapt the width of the SSD CNN model. To improve throughput, we use dual-data rate operations for DSPs to effectively double the throughput with limited DSP counts. For different SSD algorithm models, we analyze accuracy (mean average precision, mAP) and evaluate the corresponding FPGA hardware utilization, DRAM communication, and throughput optimization. We evaluate the FPGA hardware on a custom drone dataset, Pascal VOC, and COCO2017. Our proposed design achieves a high mAP of 88.42% on the multi-drone dataset, with a high energy efficiency of 79 GOPS/W and throughput of 158 GOPS using the Xilinx Zynq ZU3EG FPGA device on the Open Vision Computer version 3 (OVC3) platform. Our design achieves 1.1 to 8.7× higher energy efficiency than prior works that used the same Pascal VOC dataset, using the same FPGA device, but at a low power consumption of 2.54 W. For the COCO dataset, our MobileNet-V1 implementation achieves an mAP of 16.8 and an energy efficiency of 4.9 FPS/W, which is ∼1.9× higher than prior FPGA works or other commercial hardware platforms.
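The energy-efficiency figures in the abstract are throughput divided by power. The short sketch below shows that arithmetic for the drone-dataset numbers (158 GOPS and 79 GOPS/W). The function names are illustrative and not taken from the paper. The implied power of about 2 W is an inference from those two figures, not a value the abstract states; the 2.54 W figure in the abstract is given in the Pascal VOC comparison.

```python
def energy_efficiency(throughput_gops: float, power_w: float) -> float:
    """Energy efficiency in GOPS/W: throughput divided by power."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return throughput_gops / power_w


def implied_power(throughput_gops: float, efficiency_gops_per_w: float) -> float:
    """Power in watts implied by a throughput and an efficiency figure."""
    if efficiency_gops_per_w <= 0:
        raise ValueError("efficiency_gops_per_w must be positive")
    return throughput_gops / efficiency_gops_per_w


# Figures quoted in the abstract for the multi-drone dataset.
THROUGHPUT_GOPS = 158.0
EFFICIENCY_GOPS_PER_W = 79.0

# Inferred, not stated in the abstract: power implied by the two figures above.
drone_power_w = implied_power(THROUGHPUT_GOPS, EFFICIENCY_GOPS_PER_W)
print(f"Implied power: {drone_power_w:.2f} W")
```

The same relation applies to the FPS/W metric reported for COCO, with frames per second in place of GOPS.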

Funder

NSF

DARPA

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3583074

Cited by 3 articles.

1. Fusion flow-enhanced graph pooling residual networks for Unmanned Aerial Vehicles surveillance in day and night dual visions. Engineering Applications of Artificial Intelligence, 2024-10.

2. Energy-Efficient Computing Acceleration of Unmanned Aerial Vehicles Based on a CPU/FPGA/NPU Heterogeneous System. IEEE Internet of Things Journal, 2024-08-15.

3. An ASIC Accelerator for QNN With Variable Precision and Tunable Energy Efficiency. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2024-07.