Determining PolyCystic Ovarian Syndrome Severity from Reddit Posts using Topic Modelling and Association Rule Mining

Author:

Selvaraj Santhi, Sundaradhas Selva Nidhyananthan

Abstract

Social media now plays a vital role in many real-time applications, especially in healthcare. PolyCystic Ovarian Syndrome (PCOS) is a condition that affects females of reproductive age, between 15 and 35. The symptoms of PCOS include hormonal issues, irregular periods, weight gain, follicles, infertility, excessive hair growth on the skin, hair loss, acne, pimples, dark scars, and depression. Most earlier research analyzed PCOS from clinical text and health records using machine learning approaches. The main motivation of the proposed work is to predict upcoming PCOS symptoms from current symptoms and to determine the severity of PCOS among Reddit users. Head symptoms are collected from gynecologists and present symptoms are gathered from Reddit users; the collected unstructured data is pre-processed, and PCOS sub-symptoms are extracted using Bag of Words. The sub-symptoms are mapped to head symptoms using Latent Dirichlet Allocation (LDA) for dimension reduction. The major issue with that approach is that a single user may report the same type of symptom multiple times. This issue is solved by a novel method, Symptom Segmentation and grouping Labeled Latent Dirichlet Allocation (SSG_LLDA), designed to reduce the dimensionality and map social media users' sub-symptoms to head symptoms. Association Rule Mining (ARM) with Apriori is employed to produce the frequent symptoms and effective rule sets, and to form the distinctive symptom patterns. Among the several minimum support and confidence values tested, 0.02 and 0.1, respectively, deliver the best rule sets and symptom patterns. Based on the rule sets of symptom patterns and combinations, the severity of PCOS is determined for Reddit users. The novelty of this work is the construction of PCOS symptom patterns from topic modelling results instead of the original data, so that the dimensionality of the features is reduced and the approach is more scalable.
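The Association Rule Mining step named in the abstract can be illustrated with a minimal Apriori sketch in plain Python. This is not the authors' implementation: the function names `frequent_itemsets` and `association_rules`, the symptom labels, and the four toy transactions are all hypothetical, and each transaction is assumed to be the set of head symptoms already assigned to one user by the topic modelling stage. Only the thresholds (minimum support 0.02, minimum confidence 0.1) come from the abstract.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support (Apriori)."""
    sets = [frozenset(t) for t in transactions]
    n = len(sets)
    # Level-1 candidates: every individual symptom seen in the data.
    candidates = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep only the candidates whose support reaches the threshold.
        level = {}
        for cand in candidates:
            support = sum(1 for t in sets if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset (downward closure).
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append(
                        (antecedent, itemset - antecedent, support, confidence)
                    )
    return rules


# Hypothetical toy data: one set of head symptoms per user (not from the paper).
users = [
    {"irregular_periods", "weight_gain", "acne"},
    {"irregular_periods", "weight_gain"},
    {"irregular_periods", "hair_loss"},
    {"acne", "hair_loss"},
]

# Thresholds reported in the abstract: support 0.02, confidence 0.1.
itemsets = frequent_itemsets(users, min_support=0.02)
for antecedent, consequent, support, confidence in association_rules(itemsets, 0.1):
    print(sorted(antecedent), "->", sorted(consequent), support, round(confidence, 2))
```

With only four toy users, a support threshold of 0.02 admits every combination that occurs at least once. A threshold that low becomes selective only on a much larger collection of posts.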

Publisher

Zarqa University
