Examining Analytic Practices in Latent Dirichlet Allocation Within Psychological Science: Scoping Review

Published:2022-11-08 Issue:11 Volume:24 Page:e33166
ISSN:1438-8871
Container-title:Journal of Medical Internet Research
language:en
Short-container-title:J Med Internet Res

Author:

Hagg Lauryn J, Merkouris Stephanie S, O’Dea Gypsy A, Francis Lauren M, Greenwood Christopher J, Fuller-Tyszkiewicz Matthew, Westrupp Elizabeth M, Macdonald Jacqui A, Youssef George J

Abstract

Background: Topic modeling approaches allow researchers to analyze and represent written texts. One of the commonly used approaches in psychology is latent Dirichlet allocation (LDA), which is used for rapidly synthesizing patterns of text within “big data,” but outputs can be sensitive to decisions made during the analytic pipeline and may not be suitable for certain scenarios such as short texts, and we highlight resources for alternative approaches. This review focuses on the complex analytical practices specific to LDA, which existing practical guides for training LDA models have not addressed.

Objective: This scoping review used key analytical steps (data selection, data preprocessing, and data analysis) as a framework to understand the methodological approaches being used in psychology research using LDA.

Methods: A total of 4 psychology and health databases were searched. Studies were included if they used LDA to analyze written words and focused on a psychological construct or issue. The data charting processes were constructed and employed based on common data selection, preprocessing, and data analysis steps.

Results: A total of 68 studies were included. These studies explored a range of research areas and mostly sourced their data from social media platforms. Although some studies reported on preprocessing and data analysis steps taken, most studies did not provide sufficient detail for reproducibility. Furthermore, the debate surrounding the necessity of certain preprocessing and data analysis steps is revealed.

Conclusions: Our findings highlight the growing use of LDA in psychological science. However, there is a need to improve analytical reporting standards and identify comprehensive and evidence-based best practice recommendations. To work toward this, we developed an LDA Preferred Reporting Checklist that will allow for consistent documentation of LDA analytic decisions and reproducible research outcomes.

Publisher

JMIR Publications Inc.

Subject

Health Informatics

References: 145 articles.

1. Big Data: A Survey

2. Discovering implicit activity preferences in travel itineraries by topic modeling

3. How Digital Are the Digital Humanities? An Analysis of Two Scholarly Blogging Platforms

5. Jumping NLP Curves: A Review of Natural Language Processing Research [Review Article]

Cited by 5 articles.

1. Assessing the needs of patients with breast cancer and their families across various treatment phases using a Latent Dirichlet Allocation model: a text-mining approach to online health communities;Supportive Care in Cancer;2024-04-29

2. The global patent landscape of artificial intelligence applications for cancer;Nature Biotechnology;2023-12

3. Research on Topic Identification of Safety Hazard Information in Oilfield Enterprises;2023 12th International Conference on Computing and Pattern Recognition;2023-10-27

4. Alternative Dispute Resolution Research Landscape from 1981 to 2022;Group Decision and Negotiation;2023-08-09

5. Reducing Human Effort in Keyphrase-Based Human-in-the-Loop Topic Models: A Method for Keyphrase Recommendations;Information Integration and Web Intelligence;2023