DRIVE: Dockerfile Rule Mining and Violation Detection
Published: 2023-12-21
Issue: 2
Volume: 33
Page: 1-23
ISSN: 1049-331X
Container-title: ACM Transactions on Software Engineering and Methodology
Language: en
Short-container-title: ACM Trans. Softw. Eng. Methodol.
Author:
Zhou Yu (1),
Zhan Weilin (1),
Li Zi (1),
Han Tingting (2),
Chen Taolue (2),
Gall Harald (3)
Affiliation:
1. Nanjing University of Aeronautics and Astronautics, China
2. Birkbeck, University of London, UK
3. University of Zurich, Switzerland
Abstract
A Dockerfile defines a set of instructions to build Docker images, which can then be instantiated to support containerized applications. Recent studies have revealed a considerable amount of quality issues with Dockerfiles. In this article, we propose a novel approach, Dockerfiles Rule mIning and Violation dEtection (DRIVE), to mine implicit rules and detect potential violations of such rules in Dockerfiles. DRIVE first parses Dockerfiles and transforms them to an intermediate representation. It then leverages an efficient sequential pattern mining algorithm to extract potential patterns. With heuristic-based reduction and moderate human intervention, potential rules are identified, which can then be utilized to detect potential violations of Dockerfiles. DRIVE identifies 34 semantic rules and 19 syntactic rules including 9 new semantic rules that have not been reported elsewhere. Extensive experiments on real-world Dockerfiles demonstrate the efficacy of our approach.
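The pipeline outlined in the abstract (parse, mine sequential patterns, flag violations) can be illustrated with a toy sketch. This is not DRIVE's implementation: the abstract does not specify the intermediate representation or the mining algorithm, so the sketch below assumes a much simpler stand-in, namely reducing each Dockerfile to its sequence of instruction keywords and mining frequent ordered keyword pairs. All function names (parse_instructions, ordered_pairs, mine_rules, find_violations) and the min_support parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations


def parse_instructions(dockerfile_text):
    """Reduce a Dockerfile to its sequence of instruction keywords.

    Assumption: a keyword-only view stands in for the paper's richer
    intermediate representation, which the abstract does not describe.
    """
    instructions = []
    continued = False  # True while inside a backslash-continued instruction
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no instruction
        if not continued:
            instructions.append(line.split()[0].upper())
        continued = line.endswith("\\")
    return instructions


def ordered_pairs(sequence):
    """All pairs (a, b) of distinct keywords where a occurs before b."""
    return {(a, b) for a, b in combinations(sequence, 2) if a != b}


def mine_rules(corpus, min_support=0.9):
    """Keep ordered pairs that appear in at least min_support of the files.

    Assumption: frequent ordered pairs are a toy substitute for a real
    sequential pattern mining algorithm.
    """
    counts = Counter()
    for text in corpus:
        counts.update(ordered_pairs(parse_instructions(text)))
    threshold = min_support * len(corpus)
    return {pair for pair, n in counts.items() if n >= threshold}


def find_violations(dockerfile_text, rules):
    """Report rules (a, b) where both keywords occur but a never precedes b."""
    sequence = parse_instructions(dockerfile_text)
    present = set(sequence)
    seen = ordered_pairs(sequence)
    return sorted(
        (a, b)
        for a, b in rules
        if a in present and b in present and (a, b) not in seen
    )
```

Under these assumptions, a corpus in which RUN always precedes CMD would yield the pair ("RUN", "CMD") as a candidate rule, and a Dockerfile that places CMD before its only RUN would be reported as violating it. In the approach the abstract describes, mined candidates are additionally filtered by heuristic-based reduction and human review before being treated as rules; the sketch omits that step.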
Funder
National Natural Science Foundation of China
Natural Science Foundation of Jiangsu Province
Fundamental Research Funds for the Central Universities
Birkbeck BEI School Project (EFFECT), an oversea grant from the State Key Laboratory of Novel Software Technology, Nanjing University
Publisher
Association for Computing Machinery (ACM)