Forecasting of infection prevalence of Helicobacter pylori (H. pylori) using regression analysis

Author:

Usarov Komiljon, Ahmedov Anvarjon, Abasiyanik Mustafa Fatih, Ku Khalif Ku Muhammad Na’im

Abstract

Global warming may have a significant impact on human health by promoting the growth of harmful bacteria such as Helicobacter pylori. Predicting the prevalence of a pathogen in a population quickly and cost-effectively is crucial for managing the diseases it causes. In this research, we carried out a predictive analysis of the spread of H. pylori infection with respect to weather parameters of Istanbul (humidity, dew point, temperature, pressure, and wind speed), based on a database from Istanbul Samatya Hospital. We developed a forecasting model to predict H. pylori infection prevalence; the goal is a machine learning model that predicts H. pylori (Hp) related diseases (e.g., gastric ulcer disease, gastritis) from climate variables. The dataset covers the years 1999 to 2003 and contains a total of 7014 rows from Samatya Hospital in Istanbul. Weather information for those years and that location, comprising humidity (H), dew point (D), temperature (T), pressure (P), and wind speed (W), was collected from https://www.wunderground.com. In this paper we analyzed the forecasting model used to predict H. pylori infection prevalence, a non-linear multivariate linear regression model (MLRM). We applied the non-linear least squares method, minimizing the sum of squared residuals, to find the optimal parameters of the MLRM. The Multiple Regression Method was used to determine the correlation between a criterion variable and a combination of predictor variables. Hp infection was found to be most influenced by humidity. Hp prevalence was modelled using the Multiple Regression Method equation, with the average H, D, T, P, and W as the most important parameters; the deviation was 17% for the testing dataset and 18% for the training dataset. This showed that the statistical model predicts Hp prevalence with about 83% accuracy on the testing dataset (11 months) and 87% accuracy on the training dataset (42 months). Based on the proposed model, monthly infection levels can be predicted early, so that medical services can take preventive measures and government can prepare against the bacterium. In addition, drug producers can adjust their production rates based on the forecasting results.

ABSTRAK: Pemanasan global mungkin mempunyai kesan langsung terhadap kesihatan manusia kerana pertambahan populasi bakteria merbahaya seperti infeksi H. pylori. Adalah penting bagi mengesan kehadiran patogen dalam masyarakat bagi mengawal penularan penyakit dengan cepat, dan melalui kaedah kurang mahal. Kajian ini berkaitan analisis ramalan penularan infeksi H. pylori secara langsung terhadap parameter cuaca (cth: kelembapan, titik embun, suhu, tekanan, kelajuan angin) di Istanbul berdasarkan data dari Hospital Samatya Istanbul. Kajian ini membentuk model ramalan bagi menjangka penyebaran infeksi H. pylori. Matlamat adalah bagi mencipta model pembelajaran mesin bagi mengjangka penyakit berkaitan infeksi H. pylori (Hp) (cth: penyakit ulser gastrik, gastrik) berdasarkan pembolehubah cuaca. Dari tahun 1999 ke 2003, set data telah digunakan bagi mempelajari di mana sejumlah 7014 baris dari Hospital Samatya di Istanbul. Informasi berkaitan tahun-tahun tersebut dan lokasi mengenai kelembapan (H), titik embun (D), suhu (T), tekanan (P) dan kelajuan angin (W) dikumpul dari laman sesawang https://www.wunderground.com. Kajian ini mengguna pakai model ramalan bagi meramal kelaziman infeksi H. pylori, melalui model regresi berkadaran multivariat tidak-berkadaran (MLRM). Kaedah Kuasa Dua Terkecil tidak linear digunakan bagi pengurangan jumlah ganda dua bagi mencapai parameter optimum MLRM. Kaedah Regresi Gandaan digunakan bagi mencari persamaan antara kriteria pembolehubah dan gabungan pembolehubah ramalan.
Dapatan menunjukkan infeksi penyakit Hp adalah disebabkan oleh faktor kelembapan. Penyebaran Hp dimodel menggunakan persamaan Kaedah Regresi Gandaan, purata H, D, T, P dan W adalah parameter terpenting bagi sisihan data latihan iaitu sebanyak 17% dan 18% bagi set data latihan. Ini menunjukkan model statistik menjangkakan penyebaran Hp adalah sebanyak 83% adalah tepat pada set data yang diuji (selama 11 bulan) dan 87% tepat pada set data latihan (selama 42 bulan). Berdasarkan model yang dicadangkan ini, infeksi bulanan dapat di jangka lebih awal bagi membendung servis kepada perubatan dan kerajaan bersiap-sedia memerangi bakteria ini. Tambahan, prosedur jumlah ubatan dapat dihasilkan lebih atau kurang daripada jumlah ubatan berdasarkan dapatan ramalan.
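
The regression workflow summarized in the abstract (monthly prevalence regressed on average H, D, T, P, and W, fitted on 42 months and evaluated on 11) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the coefficients in `assumed_beta` are invented, the fit uses ordinary linear least squares in place of the non-linear least squares mentioned in the abstract, and "accuracy" is taken to be 100 minus the mean absolute percentage deviation, which is one possible reading of the reported figures rather than a definition given in the text.

```python
import numpy as np


def fit_linear_model(X, y):
    """Ordinary least squares fit; returns [intercept, coef_1, ..., coef_k]."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def predict(beta, X):
    """Evaluate the fitted linear model on predictor matrix X."""
    return beta[0] + X @ beta[1:]


def mean_abs_pct_deviation(y_true, y_pred):
    """Mean absolute percentage deviation, in percent."""
    return 100.0 * np.mean(np.abs(y_true - y_pred) / np.abs(y_true))


# Synthetic monthly averages (ASSUMPTION: invented ranges, not hospital data).
rng = np.random.default_rng(0)
n_train, n_test = 42, 11  # split sizes taken from the abstract
n_months = n_train + n_test
X = np.column_stack([
    rng.uniform(55, 85, n_months),      # H: humidity
    rng.uniform(0, 20, n_months),       # D: dew point
    rng.uniform(5, 28, n_months),       # T: temperature
    rng.uniform(1005, 1025, n_months),  # P: pressure
    rng.uniform(5, 25, n_months),       # W: wind speed
])

# ASSUMPTION: made-up coefficients used only to generate a toy response.
assumed_beta = np.array([20.0, 0.5, 0.2, -0.3, 0.01, 0.1])
y = assumed_beta[0] + X @ assumed_beta[1:] + rng.normal(0, 2, n_months)

# Chronological split: first 42 months for training, last 11 for testing.
X_train, y_train = X[:n_train], y[:n_train]
X_test, y_test = X[n_train:], y[n_train:]

beta = fit_linear_model(X_train, y_train)
train_dev = mean_abs_pct_deviation(y_train, predict(beta, X_train))
test_dev = mean_abs_pct_deviation(y_test, predict(beta, X_test))

print(f"training deviation: {train_dev:.1f}%  accuracy: {100 - train_dev:.1f}%")
print(f"testing deviation:  {test_dev:.1f}%  accuracy: {100 - test_dev:.1f}%")
```

The split is chronological rather than random so that the test months follow the training months, as they would in actual forecasting. A genuinely non-linear model would need an iterative solver in place of `np.linalg.lstsq`.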

Publisher

IIUM Press

Subject

Applied Mathematics, General Engineering, General Chemical Engineering, General Computer Science

Cited by 2 articles.