Affiliation:
1. Department of Radiation Oncology, Emory University, Atlanta, Georgia, USA
2. School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
3. Winship Cancer Institute, Emory University, Atlanta, Georgia, USA
4. Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, New York, USA
5. Department of Radiology and Imaging Sciences, Emory University, Atlanta, Georgia, USA
Abstract
Background: 7 Tesla (7T) apparent diffusion coefficient (ADC) maps derived from diffusion-weighted imaging (DWI) demonstrate improved image quality and spatial resolution over 3 Tesla (3T) ADC maps. However, 7T magnetic resonance imaging (MRI) currently suffers from limited clinical availability, higher cost, and increased susceptibility to artifacts.

Purpose: To address these issues, we propose a hybrid CNN-transformer model to synthesize high-resolution 7T ADC maps from multimodal 3T MRI.

Methods: The Vision CNN-Transformer (VCT), composed of both Vision Transformer (ViT) blocks and convolutional layers, is proposed to produce high-resolution synthetic 7T ADC maps from 3T ADC maps and 3T T1-weighted (T1w) MRI. The ViT blocks capture global image context, while the convolutional layers efficiently capture fine detail. The VCT model was validated on the publicly available Human Connectome Project Young Adult dataset, comprising 3T T1w, 3T DWI, and 7T DWI brain scans. The Diffusion Imaging in Python library was used to compute ADC maps from the DWI scans. A total of 171 patient cases were randomly divided into 130 training cases, 20 validation cases, and 21 test cases. The synthetic ADC maps were evaluated by comparing their similarity to the ground truth volumes with the following metrics: peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and mean squared error (MSE).

Results: VCT achieved a PSNR of 27.0 ± 0.9 dB, an SSIM of 0.945 ± 0.010, and an MSE of 2.0E-3 ± 0.4E-3. Both qualitative and quantitative results demonstrate that VCT performs favorably against other state-of-the-art methods. We introduced several efficiency improvements, including the use of flash attention and training on 176×208 resolution images. These enhancements reduced both the number of parameters and the training time per epoch by approximately 50% in comparison to ResViT. Specifically, the training time per epoch was shortened from 7.67 min to 3.86 min.

Conclusion: We propose a novel method to predict high-resolution 7T ADC maps from low-resolution 3T ADC maps and T1w MRI. Our predicted images demonstrate better spatial resolution and contrast than 3T MRI and than the predictions made by ResViT and pix2pix. These high-quality synthetic 7T MR images could be beneficial for disease diagnosis and intervention, for producing higher-resolution and more conformal contours, and as an intermediate step in generating synthetic CT for radiation therapy, especially when 7T MRI scanners are unavailable.
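
The quantities named in the Methods and Results can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract states only that ADC maps were computed with the Diffusion Imaging in Python library and does not give the fitting procedure or the intensity normalization. The sketch assumes the textbook two-point mono-exponential model, ADC = ln(S0 / Sb) / b, and images scaled to a hypothetical `data_range` of 1.0. The function names `adc_from_dwi`, `mse`, and `psnr` are illustrative, and SSIM is omitted because it needs a windowed implementation.

```python
import numpy as np


def adc_from_dwi(s0, sb, b_value, eps=1e-8):
    """Two-point mono-exponential ADC estimate: ADC = ln(S0 / Sb) / b.

    s0: signal without diffusion weighting; sb: signal at b = b_value.
    This is an assumed textbook model, not the paper's stated procedure.
    """
    s0 = np.asarray(s0, dtype=np.float64)
    sb = np.asarray(sb, dtype=np.float64)
    # Clip to avoid log(0) and division by zero in background voxels.
    ratio = np.clip(s0, eps, None) / np.clip(sb, eps, None)
    return np.log(ratio) / b_value


def mse(reference, prediction):
    """Mean squared error over all voxels."""
    reference = np.asarray(reference, dtype=np.float64)
    prediction = np.asarray(prediction, dtype=np.float64)
    return float(np.mean((reference - prediction) ** 2))


def psnr(reference, prediction, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE)."""
    err = mse(reference, prediction)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / err))
```

Under the assumed normalization to [0, 1], an MSE of 2.0E-3 corresponds to 10 · log10(1 / 2.0E-3) ≈ 27.0 dB, which is consistent with the reported PSNR. The reported values are averages over cases, so this is only a rough sanity check.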
Funder
National Institutes of Health