Dataset of Program Source Codes Solving Unique Programming Exercises Generated by Digital Teaching Assistant

Published: 2023-06-14 Issue: 6 Volume: 8 Page: 109
ISSN: 2306-5729
Container-title: Data
Language: en
Short-container-title: Data

Author:

Demidova Liliya A.¹, Andrianova Elena G.¹, Sovietov Peter N.¹, Gorchakov Artyom V.¹

Affiliation:

1. Institute of Information Technologies, Federal State Budget Educational Institution of Higher Education, MIREA—Russian Technological University, 78, Vernadsky Avenue, 119454 Moscow, Russia

Abstract

This paper presents a dataset containing automatically collected source codes solving unique programming exercises of different types. The programming exercises were automatically generated by the Digital Teaching Assistant (DTA) system that automates a massive Python programming course at MIREA—Russian Technological University (RTU MIREA). Source codes of the small programs grouped by the type of the solved task can be used for benchmarking source code classification and clustering algorithms. Moreover, the data can be used for training intelligent program synthesizers or benchmarking mutation testing frameworks, and more applications are yet to be discovered. We describe the architecture of the DTA system, aiming to provide detailed insight regarding how and why the dataset was collected. In addition, we describe the algorithms responsible for source code analysis in the DTA system. These algorithms use vector representations of programs based on Markov chains, compute pairwise Jensen–Shannon divergences of programs, and apply hierarchical clustering algorithms in order to automatically discover high-level concepts used by students while solving unique tasks. The proposed approach can be incorporated into massive programming courses when there is a need to identify approaches implemented by students.
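The analysis pipeline outlined in the abstract (vector representations of programs, pairwise Jensen–Shannon divergences, hierarchical clustering) can be illustrated with a minimal Python sketch. It is not the DTA implementation, and it rests on several assumptions: each program is reduced to the relative frequencies of parent-to-child node-type transitions in its abstract syntax tree, a simplified stand-in for the Markov-chain vectors; clustering is plain average-linkage agglomeration stopped at a hypothetical threshold of 0.2; and all function names and the three sample programs are invented for illustration.

```python
import ast
import math
from collections import Counter


def transition_distribution(source):
    """Relative frequencies of (parent type, child type) edges in the AST."""
    counts = Counter(
        (type(parent).__name__, type(child).__name__)
        for parent in ast.walk(ast.parse(source))
        for child in ast.iter_child_nodes(parent)
    )
    total = sum(counts.values())
    return {edge: n / total for edge, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) of two sparse distributions."""
    result = 0.0
    for key in set(p) | set(q):
        pk, qk = p.get(key, 0.0), q.get(key, 0.0)
        mk = (pk + qk) / 2  # mixture distribution at this key
        if pk > 0:
            result += 0.5 * pk * math.log2(pk / mk)
        if qk > 0:
            result += 0.5 * qk * math.log2(qk / mk)
    return result


def agglomerate(dist, threshold):
    """Average-linkage clustering: merge the closest pair until it exceeds threshold."""
    clusters = [[i] for i in range(len(dist))]

    def average(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        d, i, j = min(
            (average(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[i] += clusters.pop(j)  # j > i, so index i stays valid
    return clusters


# Hypothetical student solutions: two loop-based, one generator-based.
programs = [
    "def f(x):\n    r = 0\n    for i in range(x):\n        r += i\n    return r\n",
    "def g(n):\n    s = 0\n    for k in range(n):\n        s += k\n    return s\n",
    "def h(n):\n    return sum(i for i in range(n))\n",
]
dists = [transition_distribution(src) for src in programs]
matrix = [[js_divergence(p, q) for q in dists] for p in dists]
clusters = agglomerate(matrix, threshold=0.2)  # 0.2 is an assumed cut-off
print(clusters)
```

Because the first two programs differ only in identifier names, their node-type distributions coincide and they end up in the same cluster, while the generator-based solution stays apart. On a real dataset the threshold and the linkage criterion would need tuning, and a library routine for hierarchical clustering would likely replace the hand-written loop.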

Publisher

MDPI AG

Subject

Information Systems and Management, Computer Science Applications, Information Systems

Link

https://www.mdpi.com/2306-5729/8/6/109/pdf

References: 45 articles.

1. A Comparative Study of Industrial Static Analysis Tools; Emanuelsson; Electron. Notes Theor. Comput. Sci., 2008

2. Using Static Analysis to Find Bugs; Ayewah; IEEE Softw., 2008

3. Jiang, H., Yang, H., Qin, S., Su, Z., Zhang, J., and Yan, J. (2017, January 13–17). Detecting Energy Bugs in Android Apps Using Static Analysis. Proceedings of the Formal Methods and Software Engineering: 19th International Conference on Formal Engineering Methods, ICFEM 2017, Xi’an, China.

4. McPeak, S., Gros, C.H., and Ramanathan, M.K. (2013, January 18–26). Scalable and Incremental Software Bug Detection. Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, Saint Petersburg, Russia.

5. Cyclomatic complexity; Ebert; IEEE Softw., 2016

Cited by 6 articles.

1. A Rule-Based Algorithm and Its Specializations for Measuring the Complexity of Software in Educational Digital Environments; Computers; 2024-03-11

2. The Robots Are Here: Navigating the Generative AI Revolution in Computing Education; Proceedings of the 2023 Working Group Reports on Innovation and Technology in Computer Science Education; 2023-12-22

3. An Approach to Identifying Suspicious Student Activities During Online Programming Training Based on One-Class Classifiers; 2023 5th International Conference on Control Systems, Mathematical Modeling, Automation and Energy Efficiency (SUMMA); 2023-11-08

4. Algorithm for Detecting Anomalous Student Activities in the Online Learning Process Based on Box Plots; 2023 5th International Conference on Control Systems, Mathematical Modeling, Automation and Energy Efficiency (SUMMA); 2023-11-08

5. Analysis of Program Representations Based on Abstract Syntax Trees and Higher-Order Markov Chains for Source Code Classification Task; Future Internet; 2023-09-18