AlgoLabel: A Large Dataset for Multi-Label Classification of Algorithmic Challenges

Published: 2020-11-09 Issue: 11 Volume: 8 Page: 1995
ISSN: 2227-7390
Container-title: Mathematics
Language: en
Short-container-title: Mathematics

Author:

Iacob Radu Cristian Alexandru, Monea Vlad Cristian, Rădulescu Dan, Ceapă Andrei-Florin, Rebedea Traian, Trăușan-Matu Ștefan

Abstract

While semantic parsing has been an important problem in natural language processing for decades, recent years have seen wide interest in the automatic generation of code from text. We propose an alternative problem to code generation: labelling the algorithmic solution for programming challenges. While this may seem an easier task, we highlight that current deep learning techniques are still far from offering a reliable solution. The contributions of the paper are twofold. First, we propose a large multi-modal dataset of text and code pairs consisting of algorithmic challenges and their solutions, called AlgoLabel. Second, we show that vanilla deep learning solutions need to be greatly improved to solve this task, and we propose a dual text-code neural model for detecting the algorithmic solution type for a programming challenge. While the proposed text-code model improves on the performance obtained with the text or code alone, the improvement is rather small, highlighting that we require better methods to combine text and code features.

Publisher

MDPI AG

Subject

General Mathematics,Engineering (miscellaneous),Computer Science (miscellaneous)

Link

https://www.mdpi.com/2227-7390/8/11/1995/pdf

References: 47 articles.

1. TRANX: A Transition-based Neural Abstract Syntax Parser for Semantic Parsing and Code Generation

2. Learning to mine aligned code and natural language pairs from stack overflow

3. Learning to Generate Pseudo-Code from Source Code Using Statistical Machine Translation

4. Predicting Algorithm Classes for Programming Word Problems

Cited by 2 articles.

1. Complexity-Based Code Embeddings;Computational Collective Intelligence;2023

2. Unsupervised Detection of Solving Strategies for Competitive Programming;Intelligent Data Engineering and Automated Learning – IDEAL 2021;2021