Code Search: A Survey of Techniques for Finding Code

Authors:

Luca Di Grazia (1), Michael Pradel (1)

Affiliation:

1. Department of Computer Science, University of Stuttgart, Germany

Abstract

The immense amounts of source code provide ample challenges and opportunities during software development. To handle the size of code bases, developers commonly search for code, e.g., when trying to find where a particular feature is implemented or when looking for code examples to reuse. To support developers in finding relevant code, various code search engines have been proposed. This article surveys 30 years of research on code search, giving a comprehensive overview of challenges and techniques that address them. We discuss the kinds of queries that code search engines support, how to preprocess and expand queries, different techniques for indexing and retrieving code, and ways to rank and prune search results. Moreover, we describe empirical studies of code search in practice. Based on the discussion of prior work, we conclude the article with an outline of challenges and opportunities to be addressed in the future.
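The stages named in the abstract (query preprocessing, indexing, retrieval, and ranking and pruning of results) can be illustrated with a minimal sketch. This is not a technique taken from the survey: it assumes a plain keyword query, an inverted index over identifier terms, and a simple TF-IDF score. All names (`tokenize`, `build_index`, `search`, `SNIPPETS`) and the sample snippets are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Preprocess text into lowercase terms, splitting camelCase and snake_case."""
    terms = []
    for word in re.findall(r"[A-Za-z]+", text):
        # "readFile" -> ["read", "File"]; "parseJSON" -> ["parse", "JSON"]
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
        terms.extend(part.lower() for part in parts)
    return terms


def build_index(snippets):
    """Build an inverted index: term -> {snippet id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, code in snippets.items():
        for term, tf in Counter(tokenize(code)).items():
            index[term][doc_id] = tf
    return index


def search(query, index, num_docs, top_k=3):
    """Retrieve snippets sharing a term with the query, rank by TF-IDF, keep top_k."""
    scores = defaultdict(float)
    for term in set(tokenize(query)):
        postings = index.get(term, {})
        if not postings:
            continue
        # Rare terms weigh more than terms that occur in many snippets.
        idf = math.log(1 + num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Sort by descending score; break ties by id so the output is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked[:top_k]]


# Hypothetical toy corpus, for illustration only.
SNIPPETS = {
    "read_file": "def read_file(path):\n    with open(path) as handle:\n        return handle.read()",
    "write_file": "def writeFile(path, data):\n    with open(path, 'w') as handle:\n        handle.write(data)",
    "parse_json": "def parseJson(text):\n    return json.loads(text)",
}

index = build_index(SNIPPETS)
print(search("read file", index, len(SNIPPETS)))   # ['read_file', 'write_file']
print(search("parseJSON", index, len(SNIPPETS)))   # ['parse_json']
```

The sketch deliberately covers only keyword queries and lexical matching. Other query kinds and retrieval techniques are outside what this example shows.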

Funder

European Research Council

German Research Foundation within the ConcSys and DeMoCo projects

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science, Theoretical Computer Science
