Post-Authorship Attribution Using Regularized Deep Neural Network

Published: 2022-07-26
Volume: 12
Issue: 15
Page: 7518
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences

Author:

Modupe Abiodun, Celik Turgay, Marivate Vukosi, Olugbara Oludayo

Abstract

Post-authorship attribution is a scientific process of using stylometric features to identify the genuine writer of an online text snippet such as an email, blog, forum post, or chat log. It has applications in many domains, for instance, in verification processes that proactively detect misogynistic, misandrist, xenophobic, and abusive posts on the internet or social networks. The process assumes that texts can be characterized by sequences of words that combine the function and content words of a writer. However, defining a characterization of text that captures the unique writing style of an author is a complex endeavor in computational linguistics. Moreover, posts are typically short texts with obfuscating vocabularies that might reduce the accuracy of authorship attribution. These vocabularies include idioms, onomatopoeias, homophones, phonemes, synonyms, acronyms, anaphora, and polysemy. This paper introduces the regularized deep neural network (RDNN) method to address the intrinsic challenges of post-authorship attribution. It is based on a convolutional neural network, a bidirectional long short-term memory encoder, and a distributed highway network. The convolutional neural network extracts lexical stylometric features, which are fed into the bidirectional encoder to extract a syntactic feature-vector representation. The feature vector is then supplied as input to the distributed highway network for regularization to minimize the network generalization error. The regularized feature vector is finally passed to the bidirectional decoder to learn the writing style of an author. The feature-classification layer consists of a fully connected network and a softmax function that makes the prediction. The RDNN method was tested against thirteen state-of-the-art methods on four benchmark datasets to validate its performance. Experimental results demonstrate the effectiveness of the method compared with existing state-of-the-art methods on three datasets, with comparable results on the fourth.
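The gating and classification steps named in the abstract can be illustrated with a minimal pure-Python sketch. It uses the standard highway-layer formulation, y = T(x) * H(x) + (1 - T(x)) * x, followed by a fully connected layer and softmax. This is an assumption made for illustration only: the abstract does not specify the internals of the paper's "distributed highway network", and the feature vector, dimensions, random weights, and helper names below are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, x):
    # Computes W x + b, with W given as a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def highway_layer(x, w_h, b_h, w_t, b_t):
    # Standard highway layer (assumed form, not taken from the paper):
    # y = T(x) * H(x) + (1 - T(x)) * x
    h = [math.tanh(v) for v in affine(w_h, b_h, x)]  # candidate transform H(x)
    t = [sigmoid(v) for v in affine(w_t, b_t, x)]    # transform gate T(x)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


def softmax(logits):
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim, n_authors = 4, 3
features = [0.2, -0.1, 0.7, 0.05]  # hypothetical encoder output for one post

w_h, b_h = random_matrix(dim, dim, rng), [0.0] * dim
w_t, b_t = random_matrix(dim, dim, rng), [-2.0] * dim  # negative bias favors carrying x through
w_out, b_out = random_matrix(n_authors, dim, rng), [0.0] * n_authors

gated = highway_layer(features, w_h, b_h, w_t, b_t)
probs = softmax(affine(w_out, b_out, gated))  # fully connected layer + softmax
predicted_author = max(range(n_authors), key=probs.__getitem__)
print(probs, predicted_author)
```

With a strongly negative gate bias the layer passes its input through almost unchanged, which is the property that lets highway layers be stacked without degrading the signal. The weights here are untrained, so the predicted author index carries no meaning beyond showing the data flow.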

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/12/15/7518/pdf

References (102 articles)

1. Learning Stylometric Representations for Authorship Analysis

2. Detecting Traffic Information From Social Media Texts With Deep Learning Approaches

3. Why We Twitter: An Analysis of a Microblogging Community; Java, 2007

4. Twitter—Wikipedia https://en.wikipedia.org/wiki/Twitter

5. Applied Text Analytics for Blogs. Universiteit van Amsterdam http://brenocon.com/gilad_mishne_phd_thesis_ch6.pdf

Cited by 3 articles.

1. Authorship Attribution in Less-Resourced Languages: A Hybrid Transformer Approach for Romanian; Applied Sciences; 2024-03-23

2. Integrating Bidirectional Long Short-Term Memory with Subword Embedding for Authorship Attribution; 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC); 2023-10-01

3. A Transformer-Based Approach to Authorship Attribution in Classical Arabic Texts; Applied Sciences; 2023-06-18