Affiliation:
1. USTHB University, Bab Ezzouar, Algeria
Abstract
With the rapid growth of the volume of data available on the internet, visual analytics (VA) has emerged as a way to visualize multidimensional data in different forms, revealing interesting information about the data and making them clearer and more intelligible. In this investigation, the authors focus on VA-based authorship attribution (AA) applied to noisy text data. The article also proposes a 3D visual analytics technique based on a sphere implementation. The dataset contains several text documents written by 5 American philosophers, with an average length of 850 words per text, which were scanned and then corrupted with different noise levels. The results show that hierarchical clustering with a fully automated threshold achieves high authorship attribution accuracy, especially with character trigrams and ending bigrams, for which the clustering recognition rate (CRR) reaches 100% at noise levels from 0% to 7%. In addition, the proposed 3D sphere technique shows high clustering performance, mainly with words.
Subject
Artificial Intelligence, Management of Technology and Innovation, Information Systems and Management, Organizational Behavior and Human Resource Management, Strategy and Management, Information Systems
Cited by
2 articles.