Abstract
In Japan, approximately 400 medical-device recalls and more than 10,000 malfunctions are reported each year, leading to 100–200 actual device recalls. Using text mining, we analyzed the medical-device malfunction reports submitted to the Ministry of Health, Labour and Welfare between 2008 and 2023. We targeted 4,529 cases involving generators of cardiac implantable electronic devices, 363 of which resulted in recalls. After mining the contents of the problem-status and health-damage reports, we attempted to predict which cases would result in recalls using Bidirectional Encoder Representations from Transformers (BERT). For this purpose, we adopted three pre-trained models: tohoku-BERT, pre-trained on Japanese Wikipedia data; UTH-BERT, pre-trained on medical records; and JMedRoBERTa, pre-trained on medical research papers. We fine-tuned each model as a classifier on a dataset of annotated medical-device malfunction reports. On undersampled data, UTH-BERT achieved a recall (sensitivity) of 0.931 and an F2-score of 0.655.
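For readers unfamiliar with the F2-score reported above, the following minimal sketch shows the standard F-beta definition, in which beta = 2 weights recall more heavily than precision. The function name `f_beta` and the example values are illustrative assumptions and are not taken from the study.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R).

    beta > 1 favors recall; beta = 1 gives the usual F1-score.
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0.0:
        # Both precision and recall are zero; the score is defined as 0.
        return 0.0
    return (1.0 + beta_sq) * precision * recall / denominator


# Hypothetical values (not results from the study): a classifier that
# finds every positive case but is right only half the time.
example_f2 = f_beta(precision=0.5, recall=1.0, beta=2.0)
example_f1 = f_beta(precision=0.5, recall=1.0, beta=1.0)
```

With these hypothetical inputs the F2-score is higher than the F1-score, which illustrates why F2 suits a screening task where a missed recall case costs more than a false alarm.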