Affiliation:
1. Department of Computer Science, University of Stuttgart, Germany
Abstract
The immense amounts of source code provide ample challenges and opportunities during software development. To handle the size of code bases, developers commonly search for code, e.g., when trying to find where a particular feature is implemented or when looking for code examples to reuse. To support developers in finding relevant code, various code search engines have been proposed. This article surveys 30 years of research on code search, giving a comprehensive overview of challenges and techniques that address them. We discuss the kinds of queries that code search engines support, how to preprocess and expand queries, different techniques for indexing and retrieving code, and ways to rank and prune search results. Moreover, we describe empirical studies of code search in practice. Based on the discussion of prior work, we conclude the article with an outline of challenges and opportunities to be addressed in the future.
Funder
European Research Council
German Research Foundation within the ConcSys and DeMoCo projects
Publisher
Association for Computing Machinery (ACM)
Subject
General Computer Science, Theoretical Computer Science
Cited by
22 articles.
1. Code search engines for the next generation;Journal of Systems and Software;2024-09
2. An Empirical Study on Code Search Pre-trained Models: Academic Progresses vs. Industry Requirements;Proceedings of the 15th Asia-Pacific Symposium on Internetware;2024-07-24
3. An Empirical Study of Code Search in Intelligent Coding Assistant: Perceptions, Expectations, and Directions;Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering;2024-07-10
4. CodeFuse: Multimodal Code Search Model with Fine-Grained Attention Alignment;2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC);2024-07-02
5. Fusing Code Searchers;IEEE Transactions on Software Engineering;2024-07