Affiliation:
1. Tahar Moulay University of Saida, Algeria
2. Department of Computer Science, Tahar Moulay University of Saida, Saida, Algeria
3. GeCoDe Laboratory, Department of Computer Sciences, Dr. Tahar Moulay University of Saida, Algeria
Abstract
Over the last decade, plagiarism cases have increased and become a topical problem in the modern scientific world, driven by the quantity of textual information available online and offline. The authors' work deals with the development of a new plagiarism detection system called BHA2, which takes as input the suspicious text (to be analysed) and the original texts (the learning basis). It detects different forms of plagiarism using: the Google API to detect plagiarism by translation; text summarization to detect plagiarism of ideas; conceptual transformation to detect plagiarism by synonymy; a bag of phrases to detect paraphrase plagiarism; and the social worker bees algorithm, inspired by the lifestyle of social worker bees (forager, guardian, and cleaner), to select the source documents of the plagiarism. The output of the authors' system is the set of plagiarised passages (the parts copied from the original texts) and the plagiarism percentage for each suspicious text. Their experiments were performed on the PAN 09 dataset using validation measures (recall, precision, accuracy, error, f-measure, entropy, FPR, FNR, W-accuracy, ROC, and TCR) to show the benefit of this approach compared with the results of classical systems in the literature. A comparative study in terms of services was carried out between their system and other commercial systems (such as check, Turnitin, and a machine learning system). Finally, a visualization step was implemented to present the outcome in graphical form (3D cube and cobweb) with more realism, using zooming and rotation functionalities.
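Several of the validation measures named in the abstract can be derived from the four counts of a confusion matrix. The abstract does not give the formulas the authors used, so the sketch below assumes the standard textbook definitions. The names `Confusion` and `detection_metrics` are hypothetical and are not part of BHA2. Entropy, W-accuracy, ROC, and TCR are left out because they depend on weights or score thresholds that the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts for a binary plagiarised / not-plagiarised decision (illustrative)."""
    tp: int  # plagiarised passages correctly flagged
    fp: int  # original passages wrongly flagged
    fn: int  # plagiarised passages that were missed
    tn: int  # original passages correctly left unflagged


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return numerator / denominator if denominator else 0.0


def detection_metrics(c: Confusion) -> dict:
    """Standard definitions, assumed here; the paper may define them differently."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": _ratio(2 * precision * recall, precision + recall),
        "accuracy": accuracy,
        "error": 1.0 - accuracy,
        "fpr": _ratio(c.fp, c.fp + c.tn),  # false positive rate
        "fnr": _ratio(c.fn, c.fn + c.tp),  # false negative rate
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only; not results from the paper.
    for name, value in detection_metrics(Confusion(tp=40, fp=10, fn=10, tn=40)).items():
        print(f"{name}: {value:.3f}")
```

The counts in the usage example are invented for illustration and do not reproduce any result reported for the PAN 09 dataset.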