Abstract
Objective
To evaluate the accuracy and reproducibility of 2D echocardiography (2DE) left ventricular (LV) volume and ejection fraction (LVEF) estimates obtained by Deep Learning (DL) vs. manual contouring, using cardiac magnetic resonance (CMR) as reference.
Background
Manual segmentation of the LV on 2DE for LV volume and LVEF calculation is time consuming and operator dependent.
Methods
A DL-based convolutional network (DL1) was trained on 2DE data from centre A, then evaluated on 171 subjects with a wide range of cardiac conditions (49 healthy), 31 from centre A (18%) and 140 from centre B (82%), who underwent 2DE and CMR on the same day. Two senior cardiologists (A1 and B1) and one junior cardiologist (A2) manually contoured the 2DE end-diastolic (ED) and end-systolic (ES) endocardial borders in the cycle and frames of their choice. The selected frames were then automatically segmented by DL1 and by two DL algorithms from the literature (DL2 and DL3), which were applied without adaptation to test their generalizability to unseen data. Interobserver variability of DL contouring was compared to that of manual contouring. All end-systolic volume (ESV), end-diastolic volume (EDV) and EF values were compared to CMR as reference.
Results
50% of 2DE images were of good quality. Interobserver agreement was better with DL1 and DL2 than with manual contouring for EF (Lin’s concordance = 0.9 and 0.91 vs. 0.84), EDV (0.98 and 0.99 vs. 0.82), and ESV (0.99 and 0.99 vs. 0.89). LVEF bias was similar or reduced using DL1 (-0.1) vs. manual contouring (3.0), and worse for DL2 and DL3. Agreement between 2DE and CMR LVEF was similar or higher for DL1 vs. manual contouring (Cohen’s kappa = 0.65 vs. 0.61) and degraded for DL2 and DL3 (0.48 and 0.29).
Conclusion
DL contouring yielded accurate EF measurements and generalized well to unseen data, while reducing interobserver variability. This suggests that DL contouring may improve the accuracy and reproducibility of 2DE LVEF in routine practice.
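For readers unfamiliar with the quantities reported above, the following is a minimal sketch of how LVEF is derived from EDV and ESV and how Lin's concordance correlation coefficient between two sets of measurements can be computed. It is an illustration only, not the study's implementation: the function names `ejection_fraction` and `lins_ccc` are hypothetical, the example numbers are made up rather than taken from the study, and the use of population (1/n) moments for the concordance coefficient is an assumption.

```python
from statistics import fmean


def ejection_fraction(edv_ml, esv_ml):
    """Return LVEF in percent from end-diastolic and end-systolic volumes (mL)."""
    if edv_ml <= 0:
        raise ValueError("EDV must be positive")
    return 100.0 * (edv_ml - esv_ml) / edv_ml


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient between paired measurements.

    Uses population (1/n) variances and covariance; this choice is an
    assumption, since the abstract does not say which estimator was used.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("concordance is undefined for constant, identical inputs")
    return 2.0 * cov_xy / denom


if __name__ == "__main__":
    # Hypothetical volumes (mL) from two observers, for illustration only.
    observer_1_ef = [ejection_fraction(120, 48), ejection_fraction(150, 90)]
    observer_2_ef = [ejection_fraction(118, 50), ejection_fraction(155, 96)]
    print(lins_ccc(observer_1_ef, observer_2_ef))
```

Unlike Pearson correlation, the concordance coefficient penalizes a systematic offset between the two series through the squared mean difference in the denominator, which is why it suits interobserver comparisons.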
Publisher
Cold Spring Harbor Laboratory