An Empirical Study on Focal Methods in Deep-Learning-Based Approaches for Assertion Generation

Authors:

He Yibo¹, Huang Jiaming¹, Yu Hao¹, Xie Tao¹

Affiliation:

1. Peking University, Beijing, China

Abstract

Unit testing is widely recognized as an essential aspect of the software development process. Generating high-quality assertions automatically is one of the most important and challenging problems in automatic unit test generation. To generate high-quality assertions, deep-learning-based approaches have been proposed in recent years. For state-of-the-art deep-learning-based approaches for assertion generation (DLAGs), the focal method (i.e., the main method under test) for a unit test case plays an important role as a required part of the input to these approaches. To use DLAGs in practice, there are two main ways to provide a focal method: (1) manually providing a developer-intended focal method or (2) identifying a likely focal method from the given test prefix (i.e., complete unit test code excluding assertions) with test-to-code traceability techniques. However, the state-of-the-art DLAGs are all evaluated on the ATLAS dataset, where the focal method for a test case is assumed to be the last non-JUnit-API method invoked in the complete unit test code (i.e., code from both the test prefix and the assertion portion). Existing empirical evaluations of DLAGs have two issues, which cause inaccurate assessment of DLAGs toward adoption in practice. First, it is unclear whether the last method call before assertions (LCBA) technique can accurately reflect developer-intended focal methods. Second, when DLAGs are applied in practice, the assertion portion of a unit test is not available as part of the input to DLAGs (it is in fact the output of DLAGs); thus, the assumption made by the ATLAS dataset does not hold in practical scenarios of applying DLAGs. To address the first issue, we conduct a study of seven test-to-code traceability techniques in the scenario of assertion generation. We find that the LCBA technique is not the best among the seven techniques and identifies focal methods with only 43.38% precision and 38.42% recall; thus, the LCBA technique cannot accurately reflect developer-intended focal methods, raising a concern about using the ATLAS dataset for evaluation. To address the second issue along with the concern raised by the preceding finding, we apply each of the seven test-to-code traceability techniques to identify focal methods automatically from only test prefixes, and we construct a new dataset named ATLAS+ by replacing the existing focal methods in the ATLAS dataset with the focal methods identified by each of the seven traceability techniques. On a test set from the new ATLAS+ dataset, we evaluate four state-of-the-art DLAGs trained on a training set from the ATLAS dataset. We find that all four DLAGs achieve lower accuracy on a test set in ATLAS+ than on the corresponding test set in the ATLAS dataset, indicating that DLAGs should be (re)evaluated with a test set in ATLAS+, which better reflects practical scenarios of providing focal methods than the ATLAS dataset does. In addition, we evaluate state-of-the-art DLAGs trained on training sets in ATLAS+. We find that using training sets in ATLAS+ effectively improves the accuracy of the ATLAS approach and the T5 approach over the same approaches trained on the corresponding training set from the ATLAS dataset.
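The difference between picking a focal method from the complete unit test code and picking it from the test prefix alone can be illustrated with a small sketch. The following is a hypothetical, regex-based approximation and not the implementation used in the study or in the ATLAS dataset: the helper names (`last_call`, `split_prefix`), the `assert`/`fail` name filter standing in for "JUnit API", and the example test body are all assumptions made for illustration.

```python
import re

# Assumption: JUnit API calls are approximated by names starting with these prefixes.
JUNIT_PREFIXES = ("assert", "fail")
# Java keywords that can be followed by "(" but are not method calls.
JAVA_KEYWORDS = {"if", "for", "while", "switch", "catch", "return", "synchronized", "try"}
# Rough pattern for "identifier(": a method invocation in simple test code.
CALL = re.compile(r"\b([A-Za-z_]\w*)\s*\(")


def is_junit_api(name):
    return name.startswith(JUNIT_PREFIXES)


def split_prefix(test_body):
    """Return the code before the first assertion call (the test prefix)."""
    for match in CALL.finditer(test_body):
        if is_junit_api(match.group(1)):
            return test_body[:match.start()]
    return test_body


def last_call(code):
    """Return the name of the last non-JUnit method invoked in `code`, or None."""
    names = [
        m.group(1)
        for m in CALL.finditer(code)
        if m.group(1) not in JAVA_KEYWORDS and not is_junit_api(m.group(1))
    ]
    return names[-1] if names else None


# Hypothetical test body, used only to show that the two inputs can disagree.
TEST_BODY = """
Stack<Integer> s = new Stack<>();
s.push(1);
int top = s.peek();
assertEquals(1, s.size());
"""

focal_from_full_test = last_call(TEST_BODY)              # sees the assertion portion
focal_from_prefix = last_call(split_prefix(TEST_BODY))   # sees only the test prefix
print(focal_from_full_test, focal_from_prefix)
```

In this toy case the two inputs yield different candidates (`size` versus `peek`), which is the kind of mismatch that arises when the assertion portion is unavailable at generation time. A real traceability technique would work on a parsed syntax tree rather than regular expressions.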

Publisher

Association for Computing Machinery (ACM)
