Dirty Pixels: Towards End-to-end Image Processing and Perception

Author:

Diamond Steven1, Sitzmann Vincent2, Julca-Aguilar Frank3, Boyd Stephen1, Wetzstein Gordon1, Heide Felix4

Affiliation:

1. Stanford University

2. Stanford University, MIT

3. Algolux

4. Princeton University

Abstract

Real-world imaging systems acquire measurements that are degraded by noise, optical aberrations, and other imperfections that make image processing for human viewing and higher-level perception tasks challenging. Conventional cameras address this problem by compartmentalizing imaging from high-level task processing. As such, conventional imaging involves processing the RAW sensor measurements in a sequential pipeline of steps, such as demosaicking, denoising, deblurring, tone-mapping, and compression. This pipeline is optimized to obtain a visually pleasing image. High-level processing, however, involves steps such as feature extraction, classification, tracking, and fusion. While this siloed design approach allows for efficient development, it also dictates compartmentalized performance metrics without knowledge of the higher-level task of the camera system. For example, today's demosaicking and denoising algorithms are designed using perceptual image quality metrics but not with domain-specific tasks such as object detection in mind. We propose an end-to-end differentiable architecture that jointly performs demosaicking, denoising, deblurring, tone-mapping, and classification (see Figure 1). The architecture does not require any intermediate losses based on perceived image quality and learns processing pipelines whose outputs differ from those of existing ISPs optimized for perceptual quality, preserving fine detail at the cost of increased noise and artifacts. We show that state-of-the-art ISPs discard information that is essential in corner cases, such as extremely low-light conditions, where conventional imaging and perception stacks fail. We demonstrate on captured and simulated data that our model substantially improves perception in low light and other challenging conditions, which is imperative for real-world applications such as autonomous driving, robotics, and surveillance.
Finally, we found that the proposed model also achieves state-of-the-art accuracy when optimized for image reconstruction in low-light conditions, validating the architecture itself as a potentially useful drop-in network for reconstruction and analysis tasks beyond the applications demonstrated in this work. Our proposed models, datasets, and calibration data are available at https://github.com/princeton-computational-imaging/DirtyPixels.
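The central idea of the abstract, training the image-processing stage with nothing but the downstream task loss, can be illustrated with a deliberately tiny sketch. This is not the authors' architecture: the "ISP" here is a hypothetical two-parameter tone curve (`gain`, `offset`) applied to toy 1-D signals, the classifier is a single logistic unit, and all names, sizes, and noise levels are assumptions chosen for illustration. The only point it makes is that the classification gradient flows back into the processing parameters, with no intermediate image-quality loss.

```python
import math
import random


def make_low_light_data(n_per_class=20, n_pixels=8, level=0.05, sigma=0.01, seed=0):
    """Toy 1-D 'RAW' signals: a dim bump on the left (label 1) or right (label 0)."""
    rng = random.Random(seed)
    half = n_pixels // 2
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            x = []
            for i in range(n_pixels):
                lit = (i < half) if label == 1 else (i >= half)
                x.append((level if lit else 0.0) + rng.gauss(0.0, sigma))
            data.append((x, label))
    return data


def forward(x, gain, offset, w, b):
    """Processing stage (learnable tone curve) followed by a logistic classifier."""
    y = [math.tanh(gain * xi + offset) for xi in x]      # toy "ISP" output
    logit = sum(wi * yi for wi, yi in zip(w, y)) + b     # classifier
    p = 1.0 / (1.0 + math.exp(-logit))
    return y, p


def mean_loss(data, gain, offset, w, b):
    """Mean binary cross-entropy; the only loss used anywhere in this sketch."""
    total = 0.0
    for x, t in data:
        _, p = forward(x, gain, offset, w, b)
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(data)


def train(steps=300, lr=0.5):
    """Full-batch gradient descent on processing and classifier parameters jointly."""
    data = make_low_light_data()
    n_pixels = len(data[0][0])
    gain, offset = 1.0, 0.0            # processing-stage parameters
    w, b = [0.0] * n_pixels, 0.0       # classifier parameters
    initial = mean_loss(data, gain, offset, w, b)
    for _ in range(steps):
        g_gain = g_offset = g_b = 0.0
        g_w = [0.0] * n_pixels
        for x, t in data:
            y, p = forward(x, gain, offset, w, b)
            d_logit = (p - t) / len(data)                # d(loss)/d(logit)
            g_b += d_logit
            for i, (xi, yi) in enumerate(zip(x, y)):
                g_w[i] += d_logit * yi
                # Chain rule through the classifier weight and the tanh tone curve.
                d_pre = d_logit * w[i] * (1.0 - yi * yi)
                g_gain += d_pre * xi
                g_offset += d_pre
        gain -= lr * g_gain
        offset -= lr * g_offset
        b -= lr * g_b
        w = [wi - lr * gi for wi, gi in zip(w, g_w)]
    return initial, mean_loss(data, gain, offset, w, b), gain


if __name__ == "__main__":
    initial, final, gain = train()
    print(f"loss before: {initial:.3f}  loss after: {final:.3f}  learned gain: {gain:.2f}")
```

In this toy setting the learned gain moves away from its initial value purely because doing so lowers the classification loss, which is the same mechanism, at a vastly smaller scale, by which a jointly trained pipeline can end up producing outputs that differ from those tuned for perceptual quality.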

Funder

Stanford Graduate Fellowship in Science and Engineering

National Science Foundation (NSF) CAREER award

Sloan Fellowship

PECASE from the ARO

KAUST Office of Sponsored Research through the Visual Computing Center CCF

NSF CAREER Award

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Graphics and Computer-Aided Design

Cited by 19 articles.

1. Task-Friendly Underwater Image Enhancement for Machine Vision Applications;IEEE Transactions on Geoscience and Remote Sensing;2024

2. Joint denoising and classification network: Application to microseismic event detection in hydraulic fracturing distributed acoustic sensing monitoring;GEOPHYSICS;2023-07-01

3. Rawgment: Noise-Accounted RAW Augmentation Enables Recognition in a Wide Variety of Environments;2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR);2023-06

4. Towards Low-Cost Learning-based Camera ISP via Unrolled Optimization;2023 20th Conference on Robots and Vision (CRV);2023-06

5. Instance Segmentation in the Dark;International Journal of Computer Vision;2023-05-26
