Abstract
Few efficient similarity detectors are employed in practice to maintain academic integrity, perhaps because they lack intuitive reports for investigation, offer only a command-line interface, and/or are not publicly accessible. This paper presents SSTRANGE, an efficient similarity detector based on locality-sensitive hashing (MinHash and Super-Bit). The tool features intuitive reports for investigation and a graphical user interface, and it is publicly accessible on GitHub. SSTRANGE was evaluated on the SOCO dataset under two performance metrics: f-score and processing time. The evaluation shows that both MinHash and Super-Bit are more efficient than their predecessors (Cosine and Jaccard, with 60% less processing time) and than a common similarity measurement (running Karp-Rabin greedy string tiling, with 99% less processing time). Further, the effectiveness trade-off is still reasonable (no more than 24%), and higher effectiveness can be obtained by tuning the number of clusters and stages. To encourage the use of automated similarity detectors, we provide ten recommendations for instructors interested in employing such detectors for the first time. These include consideration of assessment design, irregular patterns of similarity, multiple similarity measurements, and the effectiveness–efficiency trade-off. The recommendations are based on our 2.5 years of experience employing similarity detectors (SSTRANGE’s predecessors) in 13 course offerings with various assessment designs.
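As a rough illustration of why MinHash can stand in for Jaccard at lower cost, the sketch below compares the exact Jaccard similarity of two token sequences with a MinHash estimate. It is not SSTRANGE's implementation: the use of token 3-grams, the 128 hash functions, the linear hash family, and all function names are assumptions made for this example only.

```python
import hashlib
import random


def shingles(tokens, k=3):
    """Return the set of k-token shingles (n-grams) of a token list."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def stable_hash(item):
    """Deterministic 64-bit hash (the built-in hash() is salted for strings)."""
    digest = hashlib.blake2b(repr(item).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_hashes=128, seed=0):
    """For each of num_hashes hash functions, keep the minimum hash over the set."""
    if not items:
        raise ValueError("cannot build a MinHash signature of an empty set")
    rng = random.Random(seed)
    prime = (1 << 61) - 1  # Mersenne prime used as the modulus (assumed choice)
    params = [(rng.randrange(1, prime), rng.randrange(0, prime))
              for _ in range(num_hashes)]
    hashed = [stable_hash(x) % prime for x in items]
    return [min((a * h + b) % prime for h in hashed) for a, b in params]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; this estimates Jaccard."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


# Two hypothetical submissions that differ only in identifier names.
prog_a = "int sum = 0 ; for ( int i = 0 ; i < n ; i ++ ) sum += a [ i ] ;".split()
prog_b = "int total = 0 ; for ( int j = 0 ; j < n ; j ++ ) total += a [ j ] ;".split()

set_a, set_b = shingles(prog_a), shingles(prog_b)
print("exact Jaccard:   ", jaccard(set_a, set_b))
print("MinHash estimate:", estimate_jaccard(minhash_signature(set_a),
                                            minhash_signature(set_b)))
```

The signatures have a fixed length regardless of program size, so comparing two submissions costs a constant number of equality checks. In a full locality-sensitive hashing scheme, signatures would additionally be grouped so that only likely-similar pairs are compared at all; that step is omitted here.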
Subject
Public Administration, Developmental and Educational Psychology, Education, Computer Science Applications, Computer Science (miscellaneous), Physical Therapy, Sports Therapy and Rehabilitation
Cited by
5 articles.