Understanding Software-2.0

Published:2021-07 Issue:4 Volume:30 Page:1-42
ISSN:1049-331X
Container-title:ACM Transactions on Software Engineering and Methodology
language:en
Short-container-title:ACM Trans. Softw. Eng. Methodol.

Author:

Dilhara Malinda¹, Ketkar Ameya¹, Dig Danny¹

Affiliation:

1. Oregon State University and University of Colorado

Abstract

Enabled by a rich ecosystem of Machine Learning (ML) libraries, programming using learned models, i.e., Software-2.0, has gained substantial adoption. However, we do not know what challenges developers encounter when they use ML libraries. With this knowledge gap, researchers miss opportunities to contribute to new research directions, tool builders do not invest resources where automation is most needed, library designers cannot make informed decisions when releasing ML library versions, and developers fail to use common practices when using ML libraries. We present the first large-scale quantitative and qualitative empirical study to shed light on how developers in Software-2.0 use ML libraries, and how this evolution affects their code. Particularly, using static analysis we perform a longitudinal study of 3,340 top-rated open-source projects with 46,110 contributors. To further understand the challenges of ML library evolution, we survey 109 developers who introduce and evolve ML libraries. Using this rich dataset we reveal several novel findings. Among others, we found an increasing trend of using ML libraries: The ratio of new Python projects that use ML libraries increased from 2% in 2013 to 50% in 2018. We identify several usage patterns including the following: (i) 36% of the projects use multiple ML libraries to implement various stages of the ML workflows, (ii) developers update ML libraries more often than the traditional libraries, (iii) strict upgrades are the most popular for ML libraries among other update kinds, (iv) ML library updates often result in cascading library updates, and (v) ML libraries are often downgraded (22.04% of cases). We also observed unique challenges when evolving and maintaining Software-2.0 such as (i) binary incompatibility of trained ML models and (ii) benchmarking ML models.
Finally, we present actionable implications of our findings for researchers, tool builders, developers, educators, library vendors, and hardware vendors.

Funder

NSF

Publisher

Association for Computing Machinery (ACM)

Subject

Software

Link

https://dl.acm.org/doi/pdf/10.1145/3453478

References: 147 articles.

1. Building Useful Program Analysis Tools Using an Extensible Java Compiler

2. Software Documentation Issues Unveiled

Cited by 45 articles.

1. Quality issues in machine learning software systems;Empirical Software Engineering;2024-09-11

2. Model-less Is the Best Model: Generating Pure Code Implementations to Replace On-Device DL Models;Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis;2024-09-11

3. An Exploratory Study on Machine Learning Model Management;ACM Transactions on Software Engineering and Methodology;2024-08-16

4. Unprecedented Code Change Automation: The Fusion of LLMs and Transformation by Example;Proceedings of the ACM on Software Engineering;2024-07-12

5. What causes exceptions in machine learning applications? Mining machine learning-related stack traces on Stack Overflow;Empirical Software Engineering;2024-07-03