Three Approaches for Detecting Direct Output Cheating in Program Online Judge Systems

Published: 2023-03-30 Issue: 04 Volume: 33 Page: 461-486
ISSN: 0218-1940
Container-title: International Journal of Software Engineering and Knowledge Engineering
Language: en
Short-container-title: Int. J. Soft. Eng. Knowl. Eng.

Author:

Qiu Jing¹, Shi Chunmei¹, Lv Yuehua²

Affiliation:

1. College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou 311300, P. R. China

2. Institute of Scientific and Technical Information of Zhejiang Province, Hangzhou 311300, P. R. China

Abstract

Program online judge (POJ) systems allow students to view questions, submit solution code, and receive scores automatically via the web. Most POJs use test cases for scoring. When a POJ scores by test case pass rate, or when a problem has only one test case, students can usually earn points by printing the expected output of the test cases directly (direct output cheating). Only one prior work addresses the detection of such cheating, and its precision is very low. To solve this problem, three novel approaches are proposed to detect direct output cheating: (i) Line Statistics, which computes the proportion of output calls against other statements; (ii) control flow graph (CFG) Search, which computes the maximum similarity between the CFG of a program and that of known samples; (iii) abstract syntax tree (AST) Search, which identifies cheating by matching rules that are summarized from ASTs of previously detected cheating attempts. Line Statistics marks a program as cheating if the proportion exceeds a predefined threshold, and CFG Search marks a student's code as cheating if the similarity exceeds a predefined threshold. The proposed approaches and three well-known code plagiarism detection tools (JPlag, Sherlock, and SIM) were evaluated using 100,000 submissions for 1153 problems from a POJ based on the C programming language. The F1 scores were 0.9752 (AST Search), 0.9440 (CFG Search), 0.7405 (Line Statistics), 0.6446 (JPlag), 0.1587 (Sherlock), and 0.0076 (SIM). The results indicate that (i) AST Search is the most suitable approach for detecting direct output cheating; (ii) traditional code search or plagiarism detection methods based on similarity calculations are not effective for detecting complex cheats, because these cheats are highly similar to normal code.
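
The Line Statistics idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the list of C output functions, the semicolon-based statement splitting, the function names, and the 0.5 threshold are all assumptions made for illustration, since the abstract specifies none of them.

```python
import re

# Assumed set of C output functions; the abstract does not list them.
OUTPUT_CALLS = ("printf", "puts", "putchar", "fputs", "fprintf")
OUTPUT_RE = re.compile(r"\b(?:" + "|".join(OUTPUT_CALLS) + r")\s*\(")


def output_proportion(c_source: str) -> float:
    """Return the fraction of statements that contain an output call.

    Statements are approximated by splitting on ';' and discarding
    fragments that hold only braces or whitespace. This is a crude
    stand-in for real parsing: it miscounts for-loop headers and
    string literals that contain ';'.
    """
    fragments = (s.strip(" \t\r\n{}") for s in c_source.split(";"))
    statements = [s for s in fragments if s]
    if not statements:
        return 0.0
    output_count = sum(1 for s in statements if OUTPUT_RE.search(s))
    return output_count / len(statements)


def looks_like_direct_output(c_source: str, threshold: float = 0.5) -> bool:
    """Flag a submission whose output proportion exceeds the threshold.

    The default threshold is hypothetical, not a value from the paper.
    """
    return output_proportion(c_source) > threshold
```

Under these assumptions, a submission consisting of two `printf` calls and a `return` yields a proportion of 2/3 and is flagged, while a program that reads input, computes a sum, and prints it once yields 1/5 and is not. A heuristic this simple cannot separate cheating from legitimately output-heavy programs, which is consistent with Line Statistics having the lowest F1 score of the three proposed approaches.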

Funder

National Natural Science Foundation of China

Natural Science Foundation of Heilongjiang Province

Fundamental Research Foundation for Universities of Heilongjiang Province

Zhejiang A and F University Research Development Fund Talent Initiation Project

Publisher

World Scientific Pub Co Pte Ltd

Subject

Artificial Intelligence, Computer Graphics and Computer-Aided Design, Computer Networks and Communications, Software

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0218194023500043

Cited by 1 article.

1. Research on Automatic Scoring Method of Intelligent Translation System Based on TSO Optimized LSTM Networks;ICST Transactions on Scalable Information Systems;2024-02-20