Automatic magnetic resonance imaging series labelling for large repositories
Author:
Maya Armando Gomis1, Alberich Leonor Cerda1, Canuto Diana Veiga1, Faggioni Lorenzo2, Ten Amadeo1, Ribas Gloria1, Mallol Pedro1, Vila-Frances Joan3, Martí-Bonmatí Luis4
Affiliation:
1. Instituto de Investigación Sanitaria La Fe 2. University of Pisa 3. University of Valencia 4. Hospital Universitari i Politècnic La Fe
Abstract
Large medical image repositories pose challenges because much of their data is unstructured. A data enrichment process stores additional information so that the content and properties of medical imaging studies can be identified quickly. The aim of this study is to develop a metadata enrichment pipeline that facilitates the secondary use of medical images in a high-throughput environment. Specifically, we developed a categorization tool for MR series that generates standardized tags identifying relevant image characteristics such as patient orientation, sequence type, weighting type, or the presence of fat suppression.
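As an illustration of what a standardized orientation tag could look like, the sketch below derives the slice plane from the six direction cosines stored in the DICOM attribute ImageOrientationPatient (0020,0037). This is a common geometric approach and is only an assumption here: the abstract does not state how the authors derive orientation. The function name `plane_from_iop` and the tag values `axial`, `coronal`, and `sagittal` are hypothetical.

```python
from typing import Sequence


def plane_from_iop(iop: Sequence[float]) -> str:
    """Return a coarse orientation tag from ImageOrientationPatient.

    `iop` holds the row direction cosines followed by the column
    direction cosines, expressed in the DICOM patient coordinate
    system (x toward the patient's left, y toward posterior, z toward head).
    The tag names returned here are illustrative, not the paper's.
    """
    if len(iop) != 6:
        raise ValueError("ImageOrientationPatient must contain 6 values")

    rx, ry, rz, cx, cy, cz = (float(v) for v in iop)

    # The slice normal is the cross product of the row and column vectors.
    nx = ry * cz - rz * cy
    ny = rz * cx - rx * cz
    nz = rx * cy - ry * cx

    # The patient axis that dominates the normal decides the plane.
    magnitudes = {"sagittal": abs(nx), "coronal": abs(ny), "axial": abs(nz)}
    return max(magnitudes, key=magnitudes.get)


if __name__ == "__main__":
    print(plane_from_iop([1, 0, 0, 0, 1, 0]))
```

Oblique acquisitions are mapped to the nearest standard plane in this sketch; a production pipeline would likely need an explicit threshold or an "oblique" tag.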
Three models that use machine learning (ML) and DICOM tags are proposed. The development dataset consists of 4,666 MR series from cancer patients, labeled by expert radiologists and acquired from different manufacturers, clinical centers, and anatomical regions, covering as much variability as possible so that the models generalize to other databases. In addition, the inference performance of the end system was evaluated on 25,596 MR series, and the final model outputs were evaluated on an external set of 1,286 MR series.
The weighting model achieves reliable results, with a macro F1-score of 0.88 on the validation set. The junk and chemical shift models achieved scores of 0.82 and 0.83, respectively. These results open the door to automatically applying image post-processing and deep learning algorithms after accurate labeling, minimizing human intervention. Furthermore, the proposed solution can run inference on thousands of DICOM series in less than 1 minute. These fast inference times make it a good fit for a big data ecosystem, avoiding performance bottlenecks during ingestion in a semi-real-time environment.
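For readers unfamiliar with the metric reported above, the following minimal sketch computes a macro F1-score: the per-class F1 values are averaged with equal weight, so rare classes count as much as frequent ones. The function name `macro_f1` and the example labels (`T1`, `T2`, `PD`) are hypothetical and are not taken from the study's data.

```python
from collections import Counter
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    labels = set(y_true) | set(y_pred)
    # F1 = 2TP / (2TP + FP + FN); every label appears at least once,
    # so the denominator is never zero.
    scores = [
        2 * tp[label] / (2 * tp[label] + fp[label] + fn[label])
        for label in labels
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    truth = ["T1", "T1", "T2", "T2", "PD"]
    preds = ["T1", "T2", "T2", "T2", "PD"]
    print(round(macro_f1(truth, preds), 4))
```

In this toy example the per-class scores are 2/3 for `T1`, 0.8 for `T2`, and 1.0 for `PD`, and the macro score is their plain average.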
Publisher
Research Square Platform LLC