Abstract
Human-generated captions for photographs, particularly snapshots, have been extensively collected in recent AI research. They play a crucial role in the development of systems capable of multimodal information processing that combines vision and language. Recognizing that diagrams may serve a distinct function in thinking and communication compared to photographs, we shifted our focus from snapshot photographs to diagrams. We provided humans with text-free diagrams and collected data on the captions they generated. The diagrams were sourced from AI2D-RST, a subset of AI2D. This subset annotates the AI2D image dataset of diagrams from elementary school science textbooks with types of diagrams. We mosaicked all textual elements within the diagram images to ensure that human annotators focused solely on the diagram’s visual content when writing a sentence about what the image expresses. For the 831 images in our dataset, we obtained caption data from at least three individuals per image. To the best of our knowledge, this dataset is the first collection of caption data specifically for diagrams.
Publisher
Springer Nature Switzerland
References (25 articles)
1. Alikhani, M., Stone, M.: Arrows are the verbs of diagrams. In: COLING 2018, pp. 3552–3563. ACL (2018)
2. Berkeley, G.: A Treatise Concerning the Principles of Human Knowledge. The Floating Press (1710/2014)
3. Bernardi, R., et al.: Automatic description generation from images: a survey of models, datasets, and evaluation measures. J. Artif. Intell. Res. 55, 409–442 (2016). https://doi.org/10.1613/jair.4900
4. Best, L.A., Smith, L.D., Stubbs, D.A.: Graph use in psychology and other sciences. Behav. Process. 54, 155–165 (2001). https://doi.org/10.1016/S0376-6357(01)00156-5
5. Daston, L., Galison, P.: Objectivity. Zone Books (2007)