Parallel Network Analysis and Communities Detection (PANC) Pipeline for the Analysis and Visualization of COVID-19 Data

Published:2021-09-22 Page:2142002
ISSN:0129-6264
Container-title:Parallel Processing Letters
language:en
Short-container-title:Parallel Process. Lett.

Author:

Agapito Giuseppe¹, Milano Marianna², Cannataro Mario²

Affiliation:

1. Data Analytics Research Center, Department of Legal, Economic and Social Sciences, Magna Græcia University, 88100 Catanzaro, Italy

2. Data Analytics Research Center, Department of Medical and Surgical Sciences, Magna Græcia University, 88100 Catanzaro, Italy

Abstract

A new coronavirus, causing a severe acute respiratory syndrome (COVID-19), emerged in Wuhan, China, in December 2019. The epidemic rapidly spread across the world, becoming a pandemic that, as of today, has affected more than 70 million people and caused over 2 million deaths. To better understand how the COVID-19 pandemic spreads, we developed PANC (Parallel Network Analysis and Communities Detection), a new parallel preprocessing methodology for network-based analysis and community detection on Italian COVID-19 data. The goal of the methodology is to analyze a set of homogeneous datasets (i.e. COVID-19 data from several regions) using a statistical test to find similar or dissimilar behaviours, to map this similarity information onto a graph, and then to use a community detection algorithm to visualize and analyze the initial dataset. The methodology includes the following steps: (i) a parallel method to build similarity matrices that represent similar or dissimilar regions with respect to the data; (ii) an effective workload balancing function to improve performance; (iii) the mapping of similarity matrices into networks where nodes represent Italian regions and edges represent similarity relationships; (iv) the discovery and visualization of communities of regions that show similar behaviour. The methodology is general and can be applied to worldwide COVID-19 data, as well as to any dataset in tabular or matrix format. To estimate scalability with increasing workloads, we analyzed three synthetic COVID-19 datasets of 90.0 MB, 180.0 MB, and 360.0 MB. The experiments showed that the amount of data that can be analyzed in a given amount of time increases almost linearly with the number of computing resources available. Community detection, by contrast, was performed on the real dataset.
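The four steps listed in the abstract can be illustrated with a small, self-contained Python sketch. It is not the authors' implementation: the abstract does not name the statistical test, the balancing function, or the community detection algorithm, so the two-sample Kolmogorov-Smirnov statistic, the round-robin split, the connected-components grouping, the 0.5 threshold, and all function names and toy region data below are assumptions chosen only for illustration. Threads stand in for whatever parallel resources the real pipeline uses.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (assumed test):
    the largest gap between the two empirical CDFs."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)
    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


def split_evenly(items, n_chunks):
    """Step (ii), assumed form: round-robin split so each worker
    receives a near-equal number of region pairs."""
    return [items[i::n_chunks] for i in range(n_chunks)]


def compare_chunk(chunk, data):
    """Compute the distance for every region pair in one chunk."""
    return [(r1, r2, ks_statistic(data[r1], data[r2])) for r1, r2 in chunk]


def similarity_edges(data, threshold=0.5, workers=2):
    """Steps (i) and (iii): compare all region pairs in parallel and
    keep an edge when the distance is at or below the threshold."""
    pairs = list(combinations(sorted(data), 2))
    chunks = split_evenly(pairs, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(compare_chunk, chunks, [data] * len(chunks)))
    return [(r1, r2) for part in results for r1, r2, d in part if d <= threshold]


def communities(nodes, edges):
    """Step (iv), simplified: connected components as a stand-in
    for a real community detection algorithm."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            cur = stack.pop()
            if cur in comp:
                continue
            comp.add(cur)
            stack.extend(adj[cur] - comp)
        seen |= comp
        groups.append(sorted(comp))
    return groups


# Hypothetical toy data: one short series of counts per "region".
toy = {
    "A": [1, 2, 3, 4, 5],
    "B": [1, 2, 3, 4, 6],
    "C": [50, 60, 70, 80, 90],
    "D": [55, 60, 75, 80, 95],
}
edges = similarity_edges(toy)
groups = communities(toy, edges)
print(edges)
print(groups)
```

In this toy input, regions A and B have nearly identical distributions, as do C and D, so the sketch links each pair and reports them as two separate groups. A real analysis would likely replace the connected-components step with a dedicated community detection algorithm and use process-level or distributed parallelism for large inputs.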

Funder

Data Analytics Research Center, Department of Medical and Surgical Sciences, University of Catanzaro

Publisher

World Scientific Pub Co Pte Ltd

Subject

Hardware and Architecture, Theoretical Computer Science, Software

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0129626421420020

Cited by 3 articles.

1. Exploring temporal community evolution: algorithmic approaches and parallel optimization for dynamic community detection; Applied Network Science; 2023-09-18

2. A Python Clustering Analysis Protocol of Genes Expression Data Sets; Genes; 2022-10-12

3. Application of CCTV Methodology to Analyze COVID-19 Evolution in Italy; BioTech; 2022-08-11