Affiliation:
1. IBM Research AI, Zurich, Switzerland
Abstract
The latest successes in AI have been largely driven by a paradigm known as Foundation Models (FMs): large neural networks pretrained on massive datasets that thereby acquire impressive transfer learning capabilities for adapting to new tasks. The emergent properties of FMs have unlocked tantalizing new applications, for instance the generation of fluent text and of realistic images from text descriptions. The impact of FMs on technical domains like civil engineering is, however, still in its infancy, owing to a gap between research development and application use cases. This paper aims to help bridge this gap and promote adoption among technical practitioners, specifically in visual inspection applications for civil engineering. To that end, we analyze the data availability requirements that make particular use cases amenable to the pretraining/fine‐tuning paradigm of FMs, i.e., situations where labeled data is scarce or costly but unlabeled data is abundant. We then illustrate proof‐of‐concept workflows using FMs in visual inspection applications. We hope that our contribution will mark the start of conversations between AI researchers and civil engineers on the potential of FMs to accelerate workflows supporting vision tasks for maintenance inspections and decisions.
Funder
Horizon 2020 Framework Programme
Subject
General Earth and Planetary Sciences, General Environmental Science