Neural Architecture Search for Dense Prediction Tasks in Computer Vision

Published:2023-04-15 Issue:7 Volume:131 Page:1784-1807
ISSN:0920-5691
Container-title:International Journal of Computer Vision
language:en
Short-container-title:Int J Comput Vis

Author:

Mohan Rohit, Elsken Thomas, Zela Arber, Metzen Jan Hendrik, Staffler Benedikt, Brox Thomas, Valada Abhinav, Hutter Frank

Abstract

The success of deep learning in recent years has led to a rising demand for neural network architecture engineering. As a consequence, neural architecture search (NAS), which aims at automatically designing neural network architectures in a data-driven manner rather than manually, has evolved into a popular field of research. With the advent of weight-sharing strategies across architectures, NAS has become applicable to a much wider range of problems. In particular, there are now many publications for dense prediction tasks in computer vision that require pixel-level predictions, such as semantic segmentation or object detection. These tasks come with novel challenges, such as higher memory footprints due to high-resolution data, learning multi-scale representations, longer training times, and more complex and larger neural architectures. In this manuscript, we provide an overview of NAS for dense prediction tasks by elaborating on these novel challenges and surveying ways to address them, in order to ease future research and the application of existing methods to novel problems.

Funder

Albert-Ludwigs-Universität Freiburg im Breisgau

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence, Computer Vision and Pattern Recognition, Software

Link

https://link.springer.com/content/pdf/10.1007/s11263-023-01785-y.pdf

Cited by 6 articles.

1. Training-Free Transformer Architecture Search With Zero-Cost Proxy Guided Evolution;IEEE Transactions on Pattern Analysis and Machine Intelligence;2024-10

2. NAS-ASDet: An adaptive design method for surface defect detection network using neural architecture search;Advanced Engineering Informatics;2024-08

3. Syn-Mediverse: A Multimodal Synthetic Dataset for Intelligent Scene Understanding of Healthcare Facilities;IEEE Robotics and Automation Letters;2024-08

4. Colorizing Multi-Modal Medical Data: An Autoencoder-based Approach for Enhanced Anatomical Information in X-ray Images;EAI Endorsed Transactions on Pervasive Health and Technology;2024-03-25

5. Discretization of a mathematical model for image analysis based on the optics of spiral beams;COMPUT OPT;2024