Affiliation:
1. Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
2. Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
Abstract
Background
Chest x-ray is widely used to evaluate pulmonary conditions because of its technical simplicity, cost-effectiveness, and portability. However, as a two-dimensional (2-D) imaging modality, chest x-ray images depict limited anatomical detail and can be challenging to interpret.

Purpose
To validate the feasibility of reconstructing three-dimensional (3-D) lungs from a single 2-D chest x-ray image via a Vision Transformer (ViT).

Methods
We created a cohort of 2525 paired chest x-ray images (scout images) and computed tomography (CT) scans acquired from different subjects, which we randomly partitioned into (1) a training set of 1800, (2) a validation set of 200, and (3) a testing set of 525. The 3-D lung volumes segmented from the chest CT scans served as the ground truth for supervised learning. We developed a novel model, termed XRayWizard, that employs ViT blocks to encode the 2-D chest x-ray image, with the aim of capturing global information and establishing long-range relationships to improve the performance of 3-D reconstruction. A pooling layer was added at the end of each transformer block to extract feature information, and a set of patch discriminators was incorporated to produce smoother and more realistic 3-D models. We also devised a novel method to incorporate subject demographics as an auxiliary input to further improve the accuracy of 3-D lung reconstruction. The Dice coefficient and mean volume error were used as performance metrics to quantify the agreement between the computerized results and the ground truth.

Results
Without subject demographics, the mean Dice coefficient for the generated 3-D lung volumes was 0.738 ± 0.091. When subject demographics were included as an auxiliary input, the mean Dice coefficient improved significantly to 0.769 ± 0.089 (p < 0.001), and the volume prediction error was reduced from 23.5 ± 2.7% to 15.7 ± 2.9%.

Conclusion
Our experiments demonstrated the feasibility of reconstructing 3-D lung volumes from 2-D chest x-ray images, and the inclusion of subject demographics as an additional input can significantly improve the accuracy of 3-D lung volume reconstruction.
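Note: the XRayWizard implementation itself is not reproduced here. The sketch below is only a minimal, hypothetical PyTorch illustration of the encoding idea described in the Methods, that is, ViT blocks applied to a 2-D chest x-ray with a pooling layer at the end of each transformer block to extract feature information. All layer sizes, class names, and the mean-pooling choice are assumptions for illustration, not details taken from the paper (a real ViT would also add positional embeddings).

```python
# Illustrative sketch only; not the authors' XRayWizard architecture.
import torch
import torch.nn as nn


class ViTBlockWithPooling(nn.Module):
    """One transformer encoder block followed by a pooling step over the tokens."""

    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)

    def forward(self, tokens: torch.Tensor):
        tokens = self.block(tokens)      # (B, N, dim) token features
        pooled = tokens.mean(dim=1)      # simple mean pooling over tokens (assumption)
        return tokens, pooled


class XRayEncoderSketch(nn.Module):
    """Patch-embed a single-channel x-ray and stack ViT blocks, pooling after each block."""

    def __init__(self, patch: int = 16, dim: int = 256, depth: int = 4):
        super().__init__()
        self.patch_embed = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.blocks = nn.ModuleList(ViTBlockWithPooling(dim) for _ in range(depth))

    def forward(self, x: torch.Tensor):
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        pooled_features = []
        for blk in self.blocks:
            tokens, pooled = blk(tokens)
            pooled_features.append(pooled)  # per-block pooled features for the decoder
        return tokens, pooled_features


if __name__ == "__main__":
    xray = torch.randn(2, 1, 256, 256)          # a batch of 2-D chest x-rays
    tokens, feats = XRayEncoderSketch()(xray)
    print(tokens.shape, [f.shape for f in feats])
```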
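The two reported performance metrics are standard segmentation measures. As a rough illustration, the following is a short sketch of how a Dice coefficient and a percentage volume error could be computed from 3-D binary lung masks, assuming NumPy arrays; the function names and voxel-volume handling are hypothetical and not taken from the paper.

```python
# Minimal sketch, assuming 3-D binary lung masks stored as NumPy arrays.
import numpy as np


def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary volumes."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum())


def volume_error_percent(pred: np.ndarray, truth: np.ndarray, voxel_volume: float = 1.0) -> float:
    """Absolute volume error of the predicted lung as a percentage of the ground-truth volume."""
    v_pred = pred.astype(bool).sum() * voxel_volume
    v_true = truth.astype(bool).sum() * voxel_volume
    return 100.0 * abs(v_pred - v_true) / v_true
```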