Authors:
Schubotz Moritz, Satpute Ankit, Greiner-Petter André, Aizawa Akiko, Gipp Bela
Abstract
Small to medium-scale data science experiments often rely on research software developed ad-hoc by individual scientists or small teams. Often there is no time to make the research software fast, reusable, and openly accessible. The consequence is twofold. First, subsequent researchers must spend significant work hours to build upon the proposed hypotheses or experimental framework. In the worst case, others cannot reproduce the experiment or reuse the findings for subsequent research. Second, if the ad-hoc research software fails during long-running, computationally expensive experiments, the effort to iteratively improve the software and rerun the experiments puts significant time pressure on the researchers. We suggest making caching an integral part of the research software development process, even before the first line of code is written. This article outlines caching recommendations for developing research software in data science projects. Our recommendations offer a way to circumvent common problems such as proprietary dependencies and slow execution. At the same time, caching contributes to the reproducibility of experiments in the open science workflow. With respect to the four guiding principles of Findability, Accessibility, Interoperability, and Reusability (FAIR), we foresee that including the proposed recommendations in research software development will make the data related to that software FAIRer for both machines and humans. We demonstrate the usefulness of some of the proposed recommendations on our recently completed research software project in mathematical information retrieval.
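As a rough illustration of the caching idea advocated in the abstract, the following Python sketch stores the result of an expensive step on disk so that a rerun after a crash can reuse it instead of recomputing it. This is not the authors' implementation: the names `disk_cached` and `expensive_step`, the JSON file layout, and the temporary cache directory are all assumptions made for this example.

```python
import functools
import hashlib
import json
import tempfile
from pathlib import Path


def disk_cached(cache_dir):
    """Hypothetical decorator: cache JSON-serialisable results on disk."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Derive a stable key from the function name and its arguments.
            key_source = json.dumps([func.__name__, args, kwargs], sort_keys=True)
            key = hashlib.sha256(key_source.encode("utf-8")).hexdigest()
            path = cache_dir / f"{key}.json"

            # Cache hit: reuse the stored result, e.g. after a failed run.
            if path.exists():
                return json.loads(path.read_text(encoding="utf-8"))

            # Cache miss: compute, then write via a temporary file so that
            # an interrupted write does not leave a corrupt cache entry.
            result = func(*args, **kwargs)
            tmp = path.with_suffix(".tmp")
            tmp.write_text(json.dumps(result), encoding="utf-8")
            tmp.replace(path)
            return result

        return wrapper

    return decorator


calls = []  # records how often the expensive body actually runs


@disk_cached(tempfile.mkdtemp())
def expensive_step(n):
    """Stand-in for a long-running computation (assumption for the demo)."""
    calls.append(n)
    return sum(i * i for i in range(n))


first = expensive_step(1000)   # computed and written to the cache
second = expensive_step(1000)  # served from the cache on disk
```

In a real project the cache directory would be a persistent, versioned location rather than a temporary one, and the key would likely also include a software or data version so that stale results are not reused.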
Cited by
1 article.