A Support Vector Machine based approach for plagiarism detection in Python code submissions in undergraduate settings

Published: 2024-06-13
Volume: 6
ISSN: 2624-9898
Container-title: Frontiers in Computer Science
Short-container-title: Front. Comput. Sci.

Author:

Gandhi Nandini, Gopalan Kaushik, Prasad Prajish

Abstract

Mechanisms for plagiarism detection play a crucial role in maintaining academic integrity, both penalizing wrongdoing and serving as a preemptive deterrent against it. This manuscript proposes a customized plagiarism detection algorithm tailored to source code written in the Python programming language. Our approach combines textual and syntactic techniques, employing a support vector machine (SVM) to combine various indicators of similarity and calculate the resulting similarity scores. The algorithm was trained and tested using a sample of code submissions for 4 coding problems each from 45 volunteers; 15 of these were original submissions while the other 30 were plagiarized samples. The submissions for two of the questions were used for training and the other two for testing, using the leave-p-out cross-validation strategy to avoid overfitting. We compare the performance of the proposed method with two widely used tools, MOSS and JPlag, and find that the proposed method results in a small but significant improvement in accuracy compared to JPlag, while significantly outperforming MOSS in flagging plagiarized samples.
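
The abstract does not list the specific similarity indicators, so the following is only a minimal sketch of what one textual and one syntactic indicator might look like, assuming a character-level ratio from `difflib` for the textual side and a comparison of AST node-type sequences for the syntactic side. The function names, the choice of indicators, and the sample snippets are all hypothetical and are not taken from the paper. The sketch also enumerates train/test splits over four questions under the assumption that two questions are held out each time. The SVM step itself is not shown; the feature vectors produced here could be passed to any SVM implementation, for example scikit-learn's `SVC`.

```python
import ast
import difflib
from itertools import combinations


def textual_similarity(code_a, code_b):
    """Character-level similarity ratio of the two raw source strings."""
    return difflib.SequenceMatcher(None, code_a, code_b).ratio()


def node_types(code):
    """AST node type names in traversal order; names and literals are ignored."""
    return [type(node).__name__ for node in ast.walk(ast.parse(code))]


def syntactic_similarity(code_a, code_b):
    """Similarity ratio of the two AST node-type sequences."""
    matcher = difflib.SequenceMatcher(None, node_types(code_a), node_types(code_b))
    return matcher.ratio()


def feature_vector(code_a, code_b):
    """Hypothetical two-element feature vector for one pair of submissions."""
    return [textual_similarity(code_a, code_b), syntactic_similarity(code_a, code_b)]


def leave_two_out_splits(questions):
    """Yield (train, test) tuples for every way of holding out two questions."""
    for test in combinations(questions, 2):
        train = tuple(q for q in questions if q not in test)
        yield train, test


# Illustrative snippets (assumed, not from the study's data set).
original = (
    "def total(values):\n"
    "    result = 0\n"
    "    for v in values:\n"
    "        result += v\n"
    "    return result\n"
)
# Same structure with every identifier renamed, a common disguise.
renamed = (
    "def add_all(items):\n"
    "    acc = 0\n"
    "    for x in items:\n"
    "        acc += x\n"
    "    return acc\n"
)
unrelated = "print('hello')\n"

if __name__ == "__main__":
    print("renamed:  ", feature_vector(original, renamed))
    print("unrelated:", feature_vector(original, unrelated))
    for train, test in leave_two_out_splits(["Q1", "Q2", "Q3", "Q4"]):
        print("train on", train, "test on", test)
```

In this sketch, renaming identifiers lowers the textual indicator but leaves the syntactic one unchanged, which is one reason to combine several indicators in a single classifier rather than threshold any one of them.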

Publisher

Frontiers Media SA
