Abstract
Earthquakes cause enormous harm to life and property. The ability to assess damage quickly across a large area is crucial for effective disaster response. In recent years, social networks have shown considerable potential for improving situational awareness and identifying impacted areas. This study proposes an approach that uses social media data to assess earthquake damage at the county, city, and 10 × 10 km grid scales using Naive Bayes, support vector machine (SVM), and deep learning classification algorithms. Classification was evaluated using accuracy, precision, recall, and F-score. To understand message propagation behavior in the study area, a temporal analysis of the classified messages was performed. In addition, the variability of spatial topic concentration after the earthquake was examined for the three classification algorithms using the location quotient (LQ). A damage map was created from the classification results of the three algorithms at the three scales. For validation, confusion matrix metrics, Spearman’s rho, Pearson correlation, and Kendall’s tau were used. Both binary and multi-class classification were performed. Binary classification sorted messages into damage and non-damage classes so that the results could be used to estimate earthquake damage. Multi-class classification categorized messages to increase post-crisis situational awareness. In the binary classification, the SVM algorithm outperformed the other algorithms on all indices, achieving 71.22% accuracy, 81.22% F-measure, 79.08% recall, 85.62% precision, and 0.634 Kappa. In the multi-class classification, the SVM algorithm again outperformed the others on all indices, achieving 90.25% accuracy, 88.58% F-measure, 84.34% recall, 93.26% precision, and 0.825 Kappa.
According to the temporal analysis, most damage-related messages were posted on the day of the earthquake, and their number decreased in the following days. Most messages about infrastructure damage and about injured, dead, and missing people were also posted on the day of the earthquake. In addition, the LQ results identified Napa as the center of the earthquake, since the concentration of damage-related messages was located there for all algorithms. This indicates that the approach identified the damage well and placed the earthquake center among the most affected counties. The damage estimates showed that damage decreased with distance from the epicenter. Validation of the estimated damage map against official data showed that the SVM performed best for damage estimation, followed by deep learning. In addition, the algorithms performed better at the county scale, with Spearman’s rho of 0.8205, Pearson correlation of 0.5217, and Kendall’s tau of 0.6666.
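As an illustration of the location quotient mentioned above, the following is a minimal sketch that assumes the standard definition: the share of damage-related messages in a region divided by the share of damage-related messages in the whole study area. The function name `location_quotient`, the region labels, and the counts are hypothetical and are not taken from the study; the exact formulation used by the authors may differ.

```python
def location_quotient(topic_counts, total_counts):
    """Return {region: LQ} under the standard location quotient definition.

    topic_counts: region -> number of messages on the topic (e.g. damage)
    total_counts: region -> number of all messages in that region
    LQ > 1 means the topic is more concentrated in the region than in
    the study area as a whole.
    """
    topic_sum = sum(topic_counts.values())
    total_sum = sum(total_counts.values())
    if topic_sum == 0 or total_sum == 0:
        raise ValueError("need at least one topic message and one message overall")

    global_share = topic_sum / total_sum
    lq = {}
    for region, total in total_counts.items():
        if total == 0:
            continue  # no messages at all: LQ is undefined for this region
        local_share = topic_counts.get(region, 0) / total
        lq[region] = local_share / global_share
    return lq


# Hypothetical counts, for illustration only (not data from the study).
damage_messages = {"A": 60, "B": 25, "C": 15}
all_messages = {"A": 100, "B": 100, "C": 200}

lq = location_quotient(damage_messages, all_messages)
for region in sorted(lq, key=lq.get, reverse=True):
    print(f"{region}: LQ = {lq[region]:.2f}")
```

In this made-up example, region A has an LQ well above 1, which is the kind of pattern that would mark a county as a concentration center for damage-related messages.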
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science