Optimized Tokenization Process for Open-Vocabulary Code Completion: An Empirical Study

Published: 2023-06-14
Container-title: Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering

Author:

Hussain Yasir¹, Huang Zhiqiu¹, Zhou Yu¹, Khan Izhar Ahmed¹, Khan Nasrullah², Abbas Muhammad Zahid²

Affiliation:

1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China

2. Comsats University, Pakistan

Funder

National Natural Science Foundation of China

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3593434.3594236

References: 37 articles.

1. Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, and Kai-Wei Chang. 2021. Unified Pre-training for Program Understanding and Generation. arXiv preprint arXiv:2103.06333 (2021).

2. Kyunghyun Cho, Bart Van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259 (2014).

3. An Empirical Study on the Usage of Transformer Models for Code Completion

4. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).

5. Yangruibo Ding, Luca Buratti, Saurabh Pujar, Alessandro Morari, Baishakhi Ray, and Saikat Chakraborty. 2021. Contrastive Learning for Source Code with Structural and Functional Properties. arXiv preprint arXiv:2110.03868 (2021).

Cited by 3 articles.

1. Data Pre-Processing Framework for Kannada Vachana Sahitya;2024 International Conference on Advances in Modern Age Technologies for Health and Engineering Science (AMATHE);2024-05-16

2. Greening Large Language Models of Code;Proceedings of the 46th International Conference on Software Engineering: Software Engineering in Society;2024-04-14

3. Exploring the Impact of Vocabulary Techniques on Code Completion: A Comparative Approach;International Journal of Software Engineering and Knowledge Engineering;2024-01-13