Affiliation:
1. Department of Information Systems, University of Maryland, Baltimore County, Baltimore, 21250, USA
2. Addiction Recovery Research Center, Virginia Tech Carilion School of Medicine and Research Institute, Roanoke, VA 24016, USA
Abstract
Social media contain rich information that can be used to help understand human mind and behavior. Social media data, however, are mostly unstructured (e.g., text and image) and a large number of features may be needed to represent them (e.g., we may need millions of unigrams to represent social media texts). Moreover, accurately assessing human behavior is often difficult (e.g., assessing addiction may require medical diagnosis). As a result, the ground truth data needed to train a supervised human behavior model are often difficult to obtain at a large scale. To avoid overfitting, many state-of-the-art behavior models employ sophisticated unsupervised or self-supervised machine learning methods to leverage a large amount of unsupervised data for both feature learning and dimension reduction. Unfortunately, despite their high performance, these advanced machine learning models often rely on latent features that are hard to explain. Since understanding the knowledge captured in these models is important to behavior scientists and public health providers, we explore new methods to build machine learning models that are not only accurate but also interpretable. We evaluate the effectiveness of the proposed methods in predicting Substance Use Disorders (SUD). We believe the methods we proposed are general and applicable to a wide range of data-driven human trait and behavior analysis applications.
Publisher
World Scientific Pub Co Pte Lt
Subject
Artificial Intelligence,Artificial Intelligence
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献