HDL-ODPRs: A Hybrid Deep Learning Technique Based Optimal Duplication Detection for Pull-Requests in Open-Source Repositories-Reference-Cited by-同舟云学术

HDL-ODPRs: A Hybrid Deep Learning Technique Based Optimal Duplication Detection for Pull-Requests in Open-Source Repositories

Published:2022-12-08 Issue:24 Volume:12 Page:12594
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Alotaibi Saud S.^ORCID

Abstract

Recently, open-source repositories have grown rapidly due to volunteer contributions worldwide. Collaboration software platforms have gained popularity as thousands of external contributors have contributed to open-source repositories. Although data de-duplication decreases the size of backup workloads, this causes poor data locality (fragmentation) and redundant review time and effort. Deep learning and machine learning techniques have recently been applied to identify complex bugs and duplicate issue reports. It is difficult to use, but it increases the risk of developers submitting duplicate pull requests, resulting in additional maintenance costs. We propose a hybrid deep learning technique in this work on the basis of an optimal duplication detection is for pull requests (HDL-ODPRs) in open-source repositories. An algorithm used to extract textual data from pull requests is hybrid leader-based optimization (HLBO), which increases the accuracy of duplicate detection. Following that, we compute the similarities between pull requests by utilizing the multiobjective alpine skiing optimization (MASO) algorithm, which provides textual, file-change, and code-change similarities. For pull request duplicate detection, a hybrid deep learning technique (named GAN-GS) is introduced, in which the global search (GS) algorithm is used to optimize the design metrics of the generative adversarial network (GAN). The proposed HDL-ODPR model is validated against the public standard benchmark datasets, such as DupPR-basic and DupPR-complementary data. According to the simulation results, the proposed HDL-ODPR model can achieve promising results in comparison with existing state-of-the-art models.

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science

Link

https://www.mdpi.com/2076-3417/12/24/12594/pdf

Reference33 articles.

1. The FreeBSD project: A replication case study of open source development;Bieman;IEEE Trans. Softw. Eng.,2005

2. Automatic mining of source code repositories to improve bug finding techniques;Williams;IEEE Trans. Softw. Eng.,2005

3. A global view of standards for open image data formats and repositories;Swedlow;Nat. Methods,2021

4. An open source web application for distributed geospatial data exploration;Curry;Sci. Data,2019

5. Trustrace: Mining software repositories to improve the accuracy of requirement traceability links;Ali;IEEE Trans. Softw. Eng.,2012