A study of the evaluation metrics for generative images containing combinational creativity

Published: 2023, Volume: 37
ISSN: 0890-0604
Container-title: Artificial Intelligence for Engineering Design, Analysis and Manufacturing
Language: en
Short-container-title: AIEDAM

Author:

Wang Boheng, Zhu Yunhuai, Chen Liuqing, Liu Jingcheng, Sun Lingyun, Childs Peter

Abstract

In machine content generation, the state-of-the-art text-to-image model DALL⋅E has advanced and diverse capacities for generating combinational images from specific textual prompts. The images generated by DALL⋅E seem to exhibit a level of combinational creativity close to that of humans in visualizing a combinational idea. Several common metrics, such as IS, FID, GIQA, and CLIP, can be applied to assess the quality of images produced by generative models, but it is unclear whether they are equally applicable to images containing combinational creativity. In this study, we collected images generated by a machine (DALL⋅E) and by human designers. The group rankings obtained from the Consensual Assessment Technique (CAT) and the Turing Test (TT) were used as benchmarks for assessing combinational creativity. Because the metrics differ in their mathematical principles and in their starting points for evaluating image quality, we introduced two comparable spaces, coincident rate (CR) and average rank variation (ARV). We then conducted an experiment that calculated how consistent each metric's group ranking was with the benchmarks. By comparing the CR and ARV consistency results, we summarized the applicability of the existing evaluation metrics to generative images containing combinational creativity. Of the four metrics, GIQA showed the closest consistency with CAT and TT. It therefore shows potential as an automated assessment for images containing combinational creativity in design and engineering tasks such as conceptual sketches, digital design images, and prototyping images.
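
The abstract names CR and ARV but does not define them, so the following is only an illustrative sketch under assumed definitions: CR is taken to be the share of rank positions where a metric's group ranking matches the benchmark ranking, and ARV is taken to be the mean absolute shift in rank position per group. The function names, the group labels, and both definitions are assumptions made for illustration and may differ from the formulas in the paper.

```python
from typing import Sequence


def coincident_rate(metric_rank: Sequence[str], benchmark_rank: Sequence[str]) -> float:
    """Assumed CR: fraction of positions where both rankings list the same group."""
    if not benchmark_rank or len(metric_rank) != len(benchmark_rank):
        raise ValueError("rankings must be non-empty and of equal length")
    matches = sum(m == b for m, b in zip(metric_rank, benchmark_rank))
    return matches / len(benchmark_rank)


def average_rank_variation(metric_rank: Sequence[str], benchmark_rank: Sequence[str]) -> float:
    """Assumed ARV: mean absolute difference in rank position for each group."""
    if not benchmark_rank or sorted(metric_rank) != sorted(benchmark_rank):
        raise ValueError("rankings must be non-empty and contain the same groups")
    benchmark_pos = {group: i for i, group in enumerate(benchmark_rank)}
    shifts = [abs(i - benchmark_pos[group]) for i, group in enumerate(metric_rank)]
    return sum(shifts) / len(shifts)


# Hypothetical group labels; the benchmark stands in for a CAT or TT group ranking.
benchmark = ["A", "B", "C", "D"]
metric = ["A", "C", "B", "D"]  # ranking produced by some automated metric

cr = coincident_rate(metric, benchmark)
arv = average_rank_variation(metric, benchmark)
print(cr, arv)
```

Under these assumed definitions, a metric that agrees more closely with the human benchmark would have a higher CR and a lower ARV.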

Funder

National Natural Science Foundation of China

Publisher

Cambridge University Press (CUP)

Subject

Artificial Intelligence, Industrial and Manufacturing Engineering

Cited by 7 articles.

1. An artificial intelligence approach for interpreting creative combinational designs; Journal of Engineering Design; 2024-07-11

2. Combinediff: a GenAI creative support tool for image combination exploration; Journal of Engineering Design; 2024-05-31

3. A foundation model enhanced approach for generative design in combinational creativity; Journal of Engineering Design; 2024-05-28

4. Research on interference compensation methods for color image sensors based on iterative learning; Sensor Review; 2024-03-14

5. A knowledge graph-based bio-inspired design approach for knowledge retrieval and reasoning; Journal of Engineering Design; 2024-01-31