Towards Workflows for the Use of AI Foundation Models in Visual Inspection Applications

Published:2023-09 Issue:5 Volume:6 Page:605-613
ISSN:2509-7075
Container-title:ce/papers
language:en
Short-container-title:ce papers

Author:

Rigotti Mattia¹, Antognini Diego¹, Assaf Roy¹, Bakirci Kagan¹, Frick Thomas¹, Giurgiu Ioana¹, Janoušková Klára¹, Janicki Filip¹, Jubran Husam¹, Malossi Cristiano¹, Meterez Alexandru¹, Scheidegger Florian

Affiliation:

1. IBM Research AI, Zurich, Switzerland

Abstract

The latest successes in AI have been largely driven by a paradigm known as Foundation Models (FMs): large neural networks pretrained on massive datasets that thereby acquire impressive transfer-learning capabilities for adapting to new tasks. The emerging properties of FMs have unlocked tantalizing new applications, for instance the generation of fluent text and of realistic images from text descriptions. The impact of FMs on technical domains like civil engineering is, however, still in its infancy, owing to a gap between research development and application use cases. This paper aims to help bridge this gap and promote adoption among technical practitioners, specifically in visual inspection applications for civil engineering. To that end, we analyze the data-availability requirements that make particular use cases amenable to the pretraining/fine-tuning paradigm of FMs, i.e. situations where labeled data is scarce or costly but unlabeled data is abundant. We then illustrate proof-of-concept workflows using FMs in visual inspection applications. We hope that our contribution will mark the start of conversations between AI researchers and civil engineers on the potential of FMs to accelerate workflows supporting vision tasks for maintenance inspections and decisions.

Funder

Horizon 2020 Framework Programme

Publisher

Wiley

Subject

General Earth and Planetary Sciences, General Environmental Science

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1002/cepa.2141

References (50 articles)

1. Deep learning

2. T. B. Brown et al. “Language Models are Few-Shot Learners.” arXiv, Jul. 2020. doi:10.48550/arXiv.2005.14165.

3. A. Ramesh et al. “Zero-Shot Text-to-Image Generation.” arXiv, Feb. 2021. doi:10.48550/arXiv.2102.12092.

4. C. Saharia et al. “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding.” arXiv, May 2022. doi:10.48550/arXiv.2205.11487.

5. R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer. “High-Resolution Image Synthesis with Latent Diffusion Models.” arXiv, Apr. 2022. doi:10.48550/arXiv.2112.10752.