Affiliation:
1. Washington and Lee University, USA
Abstract
Political partisanship is a pivotal group identity that strongly influences individuals’ voting behavior and shapes their ideological and cultural perspectives. Traditional surveys and experiments can capture political identity directly by asking participants, but the task becomes more difficult with digital trace data from social media. Previous classification methods, which infer political identity from users’ networks or textual content, have suffered from limited efficiency or generalizability. In response, this study introduces a two-step method that uses deep learning models to improve classification efficiency, generalizability, and interpretability. In the first step, two deep learning models, trained on 2.5 million tweets from 825 U.S. Congressional politicians, achieved accuracy rates of 87.71% and 89.54%, respectively, in detecting politicians’ partisanship from individual tweets. In the second step, a simple machine learning model that takes the aggregated predicted values from the first-step models as input attained accuracy rates of 94.92% and 96.61% in identifying non-politician users’ political identities from 50 and 200 of their tweets, respectively. In addition, an attention mechanism was integrated into the deep learning model to assess the contribution of each word to the classification.
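The second step described above, aggregating per-tweet predictions into a user-level classification, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the "simple machine learning model" or the aggregation features, so the logistic regression, the two features (mean score and share of tweets above 0.5), the label coding (1 and 0 for the two parties), and all function names here are assumptions. The per-tweet scores are simulated from beta distributions rather than produced by a trained deep learning model.

```python
import math
import random


def aggregate_features(tweet_scores):
    """Summarize per-tweet scores (assumed to be P(party = 1)) per user.

    Both features are centered at the 0.5 decision threshold.
    """
    n = len(tweet_scores)
    mean_score = sum(tweet_scores) / n
    share_above = sum(1 for s in tweet_scores if s > 0.5) / n
    return [mean_score - 0.5, share_above - 0.5]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=1.0, epochs=500):
    """Fit a logistic regression with plain batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 if the linear score is positive, else 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0


def simulate_user(label, n_tweets, rng):
    """Stand-in for first-step model output: simulated per-tweet scores."""
    alpha, beta = (4, 2) if label == 1 else (2, 4)
    return [rng.betavariate(alpha, beta) for _ in range(n_tweets)]


rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Simulated training users, 50 tweets each, alternating labels.
train_labels = [i % 2 for i in range(60)]
train_X = [aggregate_features(simulate_user(l, 50, rng)) for l in train_labels]
w, b = train_logistic(train_X, train_labels)

# Simulated held-out users.
test_labels = [i % 2 for i in range(40)]
test_X = [aggregate_features(simulate_user(l, 50, rng)) for l in test_labels]
accuracy = sum(predict(w, b, x) == l for x, l in zip(test_X, test_labels)) / len(test_labels)
print(f"held-out accuracy on simulated users: {accuracy:.2f}")
```

The sketch shows why aggregation can help: each tweet's score is noisy, but averaging over many tweets from the same user reduces that noise, which is consistent with the abstract's higher user-level accuracy and its improvement from 50 to 200 tweets.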
Subject
Law, Library and Information Sciences, Computer Science Applications, General Social Sciences