Pre-implementation Method Name Prediction for Object-oriented Programming

Published: 2023-09-29 Issue: 6 Volume: 32 Page: 1-35
ISSN: 1049-331X
Container-title: ACM Transactions on Software Engineering and Methodology
language: en
Short-container-title: ACM Trans. Softw. Eng. Methodol.

Author:

Wang Shangwen¹, Wen Ming², Lin Bo¹, Liu Yepang³, Bissyandé Tegawendé F.⁴, Mao Xiaoguang¹

Affiliation:

1. Key Laboratory of Software Engineering for Complex Systems, College of Computer Science, National University of Defense Technology, Changsha, China

2. School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan, China

3. Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China

4. University of Luxembourg, Luxembourg

Abstract

Method naming is a challenging development task in object-oriented programming. In recent years, several research efforts have been undertaken to provide automated tool support for assisting developers in this task. In general, existing approaches in the literature assume that a method's implementation is available to infer its name. Methods, however, are usually named before they are implemented. In this work, we fill this gap in the literature on method name prediction by developing an approach that predicts the names of all methods to be implemented within a class. Our work considers the class name as the input: the overall intuition is that classes with semantically similar names tend to provide similar functionalities, and hence similar method names. We first conduct a large-scale empirical analysis on 258K+ classes from real-world projects to validate our hypotheses. Then, we propose a hybrid big code-driven approach, Mario, to predict method names based on the class name: we combine a deep learning model with heuristics summarized from code analysis. Extensive experiments on 22K+ classes yielded promising results: compared to the state-of-the-art code2seq model (which leverages method implementation data), our approach achieves comparable results in terms of F-score at token-level prediction; our approach, additionally, outperforms code2seq in prediction at the name level. We further show that our approach significantly outperforms several other baselines.
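
The intuition stated in the abstract (classes with similar names tend to offer similar method names) can be illustrated with a small sketch. This is not Mario, which the abstract describes as a deep learning model combined with heuristics; it is only a toy nearest-neighbor lookup over class-name tokens. The corpus, the Jaccard similarity measure, and every function name below are assumptions made for illustration.

```python
import re
from collections import Counter


def split_identifier(name):
    """Split a camelCase or PascalCase identifier into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def jaccard(a, b):
    """Jaccard similarity between two token collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy corpus: class name -> method names observed in that class.
CORPUS = {
    "FileReader": ["read", "close", "ready"],
    "BufferedFileReader": ["read", "readLine", "close"],
    "UserService": ["getUser", "saveUser", "deleteUser"],
}


def suggest_method_names(class_name, corpus, top_k=2):
    """Suggest method names by voting over the most similarly named classes."""
    query = split_identifier(class_name)
    scored = [(jaccard(query, split_identifier(c)), c) for c in corpus]
    # Highest similarity first; classes sharing no token are ignored.
    scored = sorted((s for s in scored if s[0] > 0), reverse=True)
    votes = Counter()
    for _, neighbor in scored[:top_k]:
        votes.update(corpus[neighbor])
    return [name for name, _ in votes.most_common()]


if __name__ == "__main__":
    print(suggest_method_names("CsvFileReader", CORPUS))
```

In this toy setting, a new class named `CsvFileReader` shares tokens with the two reader classes but not with `UserService`, so only the reader classes contribute suggestions. A learned model would replace the token overlap with a semantic notion of name similarity.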

Funder

National Natural Science Foundation of China

Young Elite Scientists Sponsorship Program by CAST

European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme

Publisher

Association for Computing Machinery (ACM)

Subject

Software

Link

https://dl.acm.org/doi/pdf/10.1145/3597203
