Real-time pneumonia prediction using pipelined spark and high-performance computing

Authors:

Ravikumar Aswathy, Sriraman Harini

Abstract

Background Pneumonia is a respiratory disease caused by bacteria; it affects many people, particularly in impoverished countries where pollution, unsanitary living conditions, overpopulation, and inadequate medical infrastructure are prevalent. Detecting pneumonia early is vital for effective treatment and improved survival. Chest X-ray imaging is the most common way of detecting pneumonia. However, analyzing chest X-rays is a complex process vulnerable to subjective variation. Moreover, the available data is growing exponentially, so training a model to predict pneumonia can take hours or days. Timely prediction is essential for better treatment. Existing work falls short on precision, and its computation time for predicting pneumonia is long. There is therefore a need for early prediction: a system that learns continuously and without supervision from X-ray image samples to support early diagnosis.
Methods In this article, the training time of the model is reduced using a distributed data-parallel approach and the computational power of high-performance computing devices. This research aims to diagnose pneumonia from X-ray images with more precision, greater speed, and fewer processing resources. Distributed deep learning techniques are gaining popularity owing to the rising demand for computational resources from deep learning models with many parameters. In contrast to conventional training methods, data-parallel training enables several compute nodes to train massive deep-learning models concurrently, improving training efficiency. Deploying the model in Spark addresses scalability and acceleration. Spark's distributed processing capability reads data from multiple nodes, and the results demonstrate that these techniques can drastically reduce training time, which is essential when dealing with large datasets.
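
The data-parallel idea described in the Methods can be illustrated with a minimal sketch. It is not the authors' Spark pipeline or their CNN: it assumes a toy one-variable linear model, plain Python in place of a cluster, and hypothetical helper names (`shard_gradient`, `split_into_shards`, `synchronous_step`). Each "worker" computes a gradient on its own shard of the data, and a synchronous step averages those gradients before updating the shared parameters.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # one (x, y) training pair


def shard_gradient(w: float, b: float, shard: List[Sample]) -> Tuple[float, float]:
    """Gradient of the mean squared error of y ~ w*x + b over one worker's shard."""
    n = len(shard)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in shard) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in shard) / n
    return grad_w, grad_b


def split_into_shards(data: List[Sample], num_workers: int) -> List[List[Sample]]:
    """Round-robin split; shards are equal-sized when num_workers divides len(data)."""
    return [data[i::num_workers] for i in range(num_workers)]


def synchronous_step(
    w: float, b: float, shards: List[List[Sample]], lr: float
) -> Tuple[float, float]:
    """One synchronous update: every worker reports, then gradients are averaged."""
    # On a real cluster each call below would run on a different node.
    grads = [shard_gradient(w, b, shard) for shard in shards]
    avg_w = sum(g[0] for g in grads) / len(grads)
    avg_b = sum(g[1] for g in grads) / len(grads)
    return w - lr * avg_w, b - lr * avg_b


if __name__ == "__main__":
    # Hypothetical toy data drawn from y = 2x + 1 (not X-ray data).
    data = [(float(x), 2.0 * x + 1.0) for x in range(8)]
    shards = split_into_shards(data, num_workers=4)
    w, b = 0.0, 0.0
    for _ in range(2000):
        w, b = synchronous_step(w, b, shards, lr=0.02)
    print(round(w, 3), round(b, 3))
```

With equal-sized shards, the averaged gradient equals the gradient over the full dataset, so the synchronous update matches single-node training while the per-shard work can proceed concurrently.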
Results The proposed model makes predictions 1.5 times faster than the traditional CNN model used for pneumonia prediction and achieves an accuracy of 98.72%. The synchronous and asynchronous parallel models obtained speed-ups ranging from 1.2 to 1.5. The speed-up is lower in the asynchronous parallel model due to the presence of straggler nodes.
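
As a rough illustration of how speed-up is usually computed and why a straggler node can reduce it, the sketch below uses a toy timing model. The function names and all timings are hypothetical, chosen only for the arithmetic; they are not the paper's measurements. The sketch assumes speed-up is the baseline run time divided by the parallel run time, and that a round which waits for all workers takes as long as its slowest worker.

```python
from typing import Sequence


def speedup(baseline_seconds: float, parallel_seconds: float) -> float:
    """Speed-up = baseline run time / parallel run time."""
    if baseline_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("run times must be positive")
    return baseline_seconds / parallel_seconds


def barrier_round_seconds(worker_seconds: Sequence[float]) -> float:
    """Time for a round that must wait for every worker: the slowest one decides."""
    return max(worker_seconds)


if __name__ == "__main__":
    baseline = 120.0                       # hypothetical single-node time
    balanced = [30.0, 30.0, 30.0, 30.0]    # hypothetical: four equally fast workers
    straggling = [30.0, 30.0, 30.0, 60.0]  # hypothetical: one slow (straggler) node
    print(speedup(baseline, barrier_round_seconds(balanced)))
    print(speedup(baseline, barrier_round_seconds(straggling)))
```

In this toy model a single slow node halves the achievable speed-up even though the other three workers are unchanged, which is the general effect the Results attribute to straggler nodes.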

Publisher

PeerJ

Subject

General Computer Science

Cited by 9 articles.

1. Data Analytics to Forecast Brain Cancer Risk and Suitable Digital Solutions in India;Advances in Healthcare Information Systems and Administration;2024-06-30

2. Circumventing Stragglers and Staleness in Distributed CNN using LSTM;EAI Endorsed Transactions on Internet of Things;2024-02-14

3. Statistical Methods for Performance Analysis of Data Processing Systems in High-Performance Computing Environments;2024 International Conference on Optimization Computing and Wireless Communication (ICOCWC);2024-01-29

4. DPro-SM – A distributed framework for proactive straggler mitigation using LSTM;Heliyon;2024-01

5. Dynamic Clustering Strategies Boosting Deep Learning in Olive Leaf Disease Diagnosis;Sustainability;2023-09-14
