Affiliation:
1. Shanghai Jiao Tong University, Shanghai, China
2. IBM Research, Cambridge, Massachusetts, USA
3. Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
Abstract
With the rapid development of big data science, the research paradigm in the geosciences has begun to shift toward big data‐driven scientific discovery. To build the scientific databases that support such discovery, researchers must read a huge amount of literature to locate, extract, and aggregate relevant results and data that are published and stored in PDF format. In this paper, based on the findings of a study of how geoscientists annotate literature and extract and aggregate data, we propose GeoDeepShovel, a publicly available AI‐assisted data extraction system that supports these needs. GeoDeepShovel leverages state‐of‐the‐art neural network models to help researchers easily and accurately annotate papers (in PDF format) and extract data from tables, figures, maps, etc., in a human–AI collaborative manner. As part of the Deep‐Time Digital Earth (DDE) program, GeoDeepShovel has been deployed for 8 months. It already has 400 users from 44 geoscience research teams within the DDE program, who use it daily to construct scientific databases, and more than 240 projects and 50,000 documents have been processed.
Funder
National Natural Science Foundation of China
Subject
General Earth and Planetary Sciences
Cited by
4 articles.