AROMA: Automatic Reproduction of Maven Artifacts

Authors:

Keshani Mehdi¹, Velican Tudor-Gabriel², Bot Gideon¹, Proksch Sebastian¹

Affiliation:

1. Delft University of Technology, Delft, Netherlands

2. Delft University of Technology, Amsterdam, Netherlands

Abstract

Modern software engineering establishes software supply chains and relies on tools and libraries to improve productivity. However, reusing external software in a project presents a security risk when the source of a component is unknown or its consistency cannot be verified. The SolarWinds attack serves as a popular example in which the injection of malicious code into a library affected thousands of customers and caused a loss of billions of dollars. Reproducible builds present a mitigation strategy, as they can confirm the origin and consistency of reused components. A large reproducibility community has formed for Debian, but the reproducibility of the Maven ecosystem, the backbone of the Java supply chain, remains understudied in comparison. Reproducible Central is an initiative that curates a list of reproducible Maven libraries, but the list is limited and challenging to maintain because it relies on manual effort. Our research aims to support these efforts in the Maven ecosystem through automation. We investigate the feasibility of automatically finding the source code of a library from its Maven release and recovering information about the original release environment. Our tool, AROMA, obtains this critical information from the artifact and the source repository through several heuristics, and we use the results for reproduction attempts of Maven packages. Overall, our approach achieves an accuracy of up to 99.5% when compared field-by-field to the existing manual approach. In some instances, we even detected flaws in the manually maintained list, such as broken repository links. We reveal that automatic reproducibility is feasible for 23.4% of the Maven packages using AROMA, and 8% of these packages are fully reproducible. We demonstrate our ability to successfully reproduce new packages and have contributed some of them to the Reproducible Central repository. Additionally, we highlight actionable insights, outline future work in this area, and make our dataset and tools available to the public.

Funder

This study is funded by FASTEN, a European H2020 project.

Publisher

Association for Computing Machinery (ACM)

