Automated Construction of a Photocatalysis Dataset for Water-Splitting Applications

Published:2023-09-22 Issue:1 Volume:10 Page:
ISSN:2052-4463
Container-title:Scientific Data
language:en
Short-container-title:Sci Data

Author:

Isazawa Taketomo, Cole Jacqueline M.

Abstract

We present an automatically generated dataset of 15,755 records that were extracted from 47,357 papers. These records contain water-splitting activity in the presence of certain photocatalysts, along with additional information about the chemical reaction conditions under which this activity was recorded. These conditions include any co-catalysts and additives that were present during water splitting, the length of time for which the photocatalytic experiment was conducted, and the type of light source used, including its wavelength. Despite the text extraction of such a wide range of chemical reaction attributes, the dataset afforded good precision (71.2%) and recall (36.3%). These figures-of-merit were calculated based on a random sample of open-access papers from the corpus. Mining such a complex set of attributes required the development of novel techniques in knowledge extraction and interdependency resolution, leveraging inter- and intra-sentence relations, which are also described in this paper. We present a new version (version 2.2) of the chemistry-aware text-mining toolkit ChemDataExtractor, in which these new techniques are included.

Funder

Royal Academy of Engineering

BASF

RCUK | Science and Technology Facilities Council

Publisher

Springer Science and Business Media LLC

Subject

Library and Information Sciences; Statistics, Probability and Uncertainty; Computer Science Applications; Education; Information Systems; Statistics and Probability

Link

https://www.nature.com/articles/s41597-023-02511-6.pdf

References: 26 articles.

1. Mai, H., Le, T. C., Chen, D., Winkler, D. A. & Caruso, R. A. Machine learning for electrocatalyst and photocatalyst design and discovery. Chemical Reviews 122, 13478–13515 (2022).

2. Jin, H. et al. Data-driven systematic search of promising photocatalysts for water splitting under visible light. Journal of Physical Chemistry Letters 10, 5211–5218 (2019).

3. Zhang, R., Liu, X., Wen, Z. & Jiang, Q. Prediction of silicon nanowires as photocatalysts for water splitting: band structures calculated using density functional theory. Journal of Physical Chemistry C 115, 3425–3428 (2011).

4. Jain, A. et al. Commentary: The Materials Project: A materials genome approach to accelerating materials innovation. APL Materials 1, 011002 (2013).

5. Cao, S., Piao, L. & Chen, X. Emerging photocatalysts for hydrogen evolution. Trends in Chemistry 2, 57–70 (2020).

Cited by 4 articles.

1. Versatile Deep Learning Pipeline for Transferable Chemical Data Extraction;Journal of Chemical Information and Modeling;2024-07-15

2. How Beneficial Is Pretraining on a Narrow Domain-Specific Corpus for Information Extraction about Photocatalytic Water Splitting?;Journal of Chemical Information and Modeling;2024-03-28

3. Materials science in the era of large language models: a perspective;Digital Discovery;2024

4. Artificial intelligence (AI) futures: India-UK collaborations emerging from the 4th Royal Society Yusuf Hamied workshop;International Journal of Information Management;2023-11