Author:
Franceschini Rachele, Rosi Ascanio, Catani Filippo, Casagli Nicola
Abstract
Background
Mass media are a new and important source of information on natural disasters, mass emergencies, pandemics, economic or political events, and extreme weather events affecting one or more communities in a country. Several techniques have been developed for mining social media data on many natural events, but few have been applied to the automatic extraction of landslide events. In this study, Twitter was investigated as a source of Italian-language data about landslide events. The main aim is to obtain an automatic text classification based on information about natural hazards. Text classification has not yet been applied to detect landslide events in the Italian language.
Results
Over 13,000 data items were extracted from Twitter using five keywords referring to landslide events. The dataset was classified manually, providing a solid base for applying deep learning. The combination of BERT + CNN was chosen for text classification, and two different pre-processing approaches and BERT models were applied. BERT-multicase + CNN without preprocessing achieved the highest accuracy, equal to 96%, and an AUC of 0.96.
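The two reported metrics, accuracy and AUC, can be illustrated with a minimal sketch. This is not the authors' evaluation code, which the abstract does not describe. It assumes a binary setting in which label 1 means "landslide-related" and the classifier outputs a score between 0 and 1. The function names, the 0.5 threshold, and the toy data are hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in y_score]
    correct = sum(int(p == t) for p, t in zip(predictions, y_true))
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive sample scores
    higher than a random negative sample (ties count as 0.5)."""
    positives = [s for s, t in zip(y_score, y_true) if t == 1]
    negatives = [s for s, t in zip(y_score, y_true) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels and scores, invented for illustration only
# (assumed convention: 1 = landslide-related, 0 = unrelated).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

print(f"accuracy = {accuracy(labels, scores):.3f}")
print(f"AUC      = {roc_auc(labels, scores):.3f}")
```

Accuracy depends on the chosen decision threshold, whereas AUC summarises how well the scores rank positives above negatives across all thresholds, which is why the two are commonly reported together.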
Conclusions
Two advantages resulted from this study: the Italian-language classified dataset for landslide events fills the present gap in analysing natural events using Twitter, and BERT + CNN, trained to detect this information, proved to be an excellent classifier of Italian-language text about landslide events.
Publisher
Springer Science and Business Media LLC