Real-time pneumonia prediction using pipelined spark and high-performance computing

Published: 2023-03-09 Volume: 9 Page: e1258
ISSN: 2376-5992
Container-title: PeerJ Computer Science
language: en

Author:

Ravikumar Aswathy, Sriraman Harini

Abstract

Background Pneumonia is a respiratory disease caused by bacteria; it affects many people, particularly in impoverished countries where pollution, unsanitary living conditions, overpopulation, and insufficient medical infrastructure are prevalent. Detecting pneumonia early is vital for curative therapy and for improving survival chances. Chest X-ray imaging is the most common way of detecting pneumonia, but analyzing chest X-rays is a complex process that is vulnerable to subjective variation. Moreover, the available data is growing exponentially, and training a model to predict pneumonia can take hours or days. Timely prediction is important for better cure and treatment. Existing approaches need more precision, and their computation time for predicting pneumonia is long. There is therefore a need for early prediction: a system that learns continuously and without supervision from X-ray image samples to support early diagnosis.

Methods In this article, model training is accelerated using the distributed data-parallel approach and the computational power of high-performance computing devices. The research aims to diagnose pneumonia from X-ray images with more precision, greater speed, and fewer processing resources. Distributed deep learning techniques are gaining popularity owing to the rising computational demands of deep learning models with many parameters. In contrast to conventional training methods, data-parallel training lets several compute nodes train a massive deep-learning model concurrently, which improves training efficiency. Deploying the model in Spark addresses scalability and acceleration. Spark's distributed processing capability reads data from multiple nodes, and the results demonstrate that these techniques can drastically reduce training time, which is essential when dealing with large datasets.

Results The proposed model makes predictions 1.5 times faster than the traditional CNN model used for pneumonia prediction and achieves an accuracy of 98.72%. Speed-ups ranging from 1.2 to 1.5 were obtained with the synchronous and asynchronous parallel models. The speed-up is reduced in the parallel asynchronous model due to the presence of straggler nodes.
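
The data-parallel idea named in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' Spark pipeline or CNN: it assumes a toy one-parameter linear model, simulates the workers in a single process, and uses hypothetical helper names (`shard`, `local_gradient`, `synchronous_step`, `speed_up`) chosen only for this example. It shows the synchronous variant, where every worker's gradient is collected before the shared weight is updated, and the usual definition of speed-up as serial time divided by parallel time.

```python
# Illustrative sketch only: synchronous data-parallel gradient descent on y = w * x.
# Workers are simulated in-process; a real deployment would place each shard on its own node.

def shard(data, num_workers):
    """Split data into num_workers contiguous, near-equal shards."""
    size, extra = divmod(len(data), num_workers)
    shards, start = [], 0
    for i in range(num_workers):
        end = start + size + (1 if i < extra else 0)
        shards.append(data[start:end])
        start = end
    return shards

def local_gradient(w, samples):
    """Return (sum of d/dw (w*x - y)^2 over the shard, shard size)."""
    grad_sum = sum(2.0 * (w * x - y) * x for x, y in samples)
    return grad_sum, len(samples)

def synchronous_step(w, shards, lr):
    """One synchronous step: wait for every worker, then average and update."""
    results = [local_gradient(w, s) for s in shards]  # would run concurrently
    total_grad = sum(g for g, _ in results)
    total_n = sum(n for _, n in results)
    return w - lr * total_grad / total_n

def speed_up(serial_seconds, parallel_seconds):
    """Speed-up as the ratio of serial to parallel wall-clock time."""
    return serial_seconds / parallel_seconds

# Toy data generated with a true weight of 3 (an assumption for the demo).
data = [(float(x), 3.0 * x) for x in range(1, 9)]

w_parallel = w_serial = 0.0
for _ in range(200):
    w_parallel = synchronous_step(w_parallel, shard(data, 4), lr=0.01)
    w_serial = synchronous_step(w_serial, [data], lr=0.01)
```

Because the per-shard gradient sums are combined before the update, the four-worker run follows the same trajectory as the single-worker run up to floating-point rounding; the benefit of the parallel version is that the shard computations can proceed at the same time. In a synchronous scheme each step waits for the slowest worker, whereas an asynchronous scheme applies updates as they arrive and may use stale weights.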

Publisher

PeerJ

Subject

General Computer Science

Link

https://peerj.com/articles/cs-1258.pdf

References: 45 articles.

1. Pneumonia transfer learning deep learning model from segmented X-rays;Alharbi;Healthcare,2022

2. Federated learning for privacy preservation in smart healthcare systems: a comprehensive survey;Ali;IEEE Journal of Biomedical and Health Informatics,2022

3. Large-scale distributed neural network training through online distillation;Anil;arXiv preprint,2020

4. Overview—Spark 3.3.0 documentation;Apache Spark,2022

5. Using LSTM networks to predict engine condition on large scale data processing framework;Aydin;2017 4th International Conference on Electrical and Electronic Engineering (ICEEE),2017

Cited by 9 articles.

1. Data Analytics to Forecast Brain Cancer Risk and Suitable Digital Solutions in India;Advances in Healthcare Information Systems and Administration;2024-06-30

2. Circumventing Stragglers and Staleness in Distributed CNN using LSTM;EAI Endorsed Transactions on Internet of Things;2024-02-14

3. Statistical Methods for Performance Analysis of Data Processing Systems in High-Performance Computing Environments;2024 International Conference on Optimization Computing and Wireless Communication (ICOCWC);2024-01-29

4. DPro-SM – A distributed framework for proactive straggler mitigation using LSTM;Heliyon;2024-01

5. Dynamic Clustering Strategies Boosting Deep Learning in Olive Leaf Disease Diagnosis;Sustainability;2023-09-14