Abstract
Artificial intelligence-enhanced electrocardiogram (AI-ECG) analysis has the potential to transform the care of patients with cardiovascular disease. Most algorithms rely on digitised signal data and cannot analyse paper-based ECGs, which remain in use in numerous clinical settings. An image-based ECG dataset incorporating artefacts common to paper-based ECGs, which are typically scanned or photographed into electronic health records, could facilitate the development of clinically useful image-based algorithms. This paper describes the creation of GenECG, a high-fidelity, synthetic image-based dataset containing 21,799 ECGs with artefacts encountered in routine care. Iterative clinical Turing tests confirmed the realism of the synthetic ECGs: expert observer accuracy in discriminating between real-world and synthetic ECGs fell from 63.9% (95% CI: 58.0%-69.8%) to 53.3% (95% CI: 48.6%-58.1%) over three rounds of testing, indicating that observers could not reliably distinguish between synthetic and real ECGs. GenECG is the first publicly available synthetic image-based ECG dataset to pass a clinical Turing test. The dataset will enable image-based AI-ECG algorithm development, ensuring the translation of AI-ECG research developments to the clinical workspace.
Publisher
Cold Spring Harbor Laboratory