WYTIWYR: A User Intent‐Aware Framework with Multi‐modal Inputs for Visualization Retrieval

Published: 2023-06 Issue: 3 Volume: 42 Page: 311-322
ISSN:0167-7055
Container-title:Computer Graphics Forum
language:en
Short-container-title:Computer Graphics Forum

Author:

Xiao Shishi¹, Hou Yihan¹, Jin Cheng¹, Zeng Wei¹²

Affiliation:

1. The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China

2. The Hong Kong University of Science and Technology, Hong Kong SAR, China

Abstract

Retrieving charts from a large corpus is a fundamental task that can benefit numerous applications, such as visualization recommendation. The retrieved results are expected to conform to both explicit visual attributes (e.g., chart type, colormap) and implicit user intents (e.g., design style, context information) that vary across application scenarios. However, existing example‐based chart retrieval methods are built upon non‐decoupled, low‐level visual features that are hard to interpret, while definition‐based ones are constrained to pre‐defined attributes that are hard to extend. In this work, we propose a new framework, namely WYTIWYR (What‐You‐Think‐Is‐What‐You‐Retrieve), that integrates user intents into the chart retrieval process. The framework consists of two stages: first, the Annotation stage disentangles the visual attributes within the query chart; second, the Retrieval stage embeds the user's intent, given as a customized text prompt together with a bitmap query chart, to recall the targeted retrieval results. We develop a prototype WYTIWYR system that leverages a contrastive language‐image pre‐training (CLIP) model to achieve zero‐shot classification as well as multi‐modal input encoding, and test the prototype on a large corpus of charts crawled from the Internet. Quantitative experiments, case studies, and qualitative interviews are conducted. The results demonstrate the usability and effectiveness of the proposed framework.
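The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 4-dimensional vectors are hand-made stand-ins for CLIP embeddings, the function names (`zero_shot_classify`, `fuse`, `retrieve`) are hypothetical, and the weighted-sum fusion of image and text embeddings is an assumption chosen for simplicity, since the abstract does not state how the two modalities are combined.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def zero_shot_classify(image_emb, label_embs):
    """Annotation stage (sketch): pick the label whose text embedding
    is most similar to the image embedding."""
    return max(label_embs, key=lambda label: cosine(image_emb, label_embs[label]))

def fuse(image_emb, text_emb, alpha=0.5):
    """Assumed fusion rule: weighted sum of unit-length embeddings.
    The paper may combine modalities differently."""
    img, txt = normalize(image_emb), normalize(text_emb)
    return normalize([alpha * i + (1 - alpha) * t for i, t in zip(img, txt)])

def retrieve(query_emb, corpus, k=2):
    """Retrieval stage (sketch): rank corpus items by cosine similarity."""
    ranked = sorted(corpus, key=lambda name: cosine(query_emb, corpus[name]), reverse=True)
    return ranked[:k]

# Toy embeddings; dimensions loosely mean (bar, line, pie, dark style).
# Real CLIP embeddings are learned and have hundreds of dimensions.
label_embs = {
    "bar chart": [1, 0, 0, 0],
    "line chart": [0, 1, 0, 0],
    "pie chart": [0, 0, 1, 0],
}
corpus = {
    "bar_plain": [1, 0, 0, 0],
    "bar_dark": [0.7, 0, 0, 0.7],
    "line_dark": [0, 0.7, 0, 0.7],
    "pie_plain": [0, 0, 1, 0],
}
query_image = [0.9, 0.1, 0.0, 0.1]   # stand-in for an encoded query chart
intent_text = [0, 0, 0, 1]           # stand-in for an encoded prompt such as "dark theme"

chart_type = zero_shot_classify(query_image, label_embs)
image_only = retrieve(query_image, corpus, k=1)
with_intent = retrieve(fuse(query_image, intent_text), corpus, k=2)
print(chart_type, image_only, with_intent)
```

In this toy setup the image-only query ranks the plain bar chart first, while adding the text intent moves the dark-styled bar chart to the top, which is the kind of intent-driven shift the framework aims for.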

Funder

National Natural Science Foundation of China

Hong Kong University of Science and Technology

Publisher

Wiley

Subject

Computer Graphics and Computer-Aided Design

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1111/cgf.14832

Cited by 2 articles.

1. Generative AI for visualization: State of the art and future directions;Visual Informatics;2024-06

2. Let the Chart Spark: Embedding Semantic Context into Chart with Text-to-Image Generative Model;IEEE Transactions on Visualization and Computer Graphics;2023