Affiliation:
1. Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Tamaulipas, Mexico
2. Universidade Estácio de Sá, Rio de Janeiro, Brazil
3. Programa de Engenharia Biomédica/COPPE, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
Abstract
Purpose
Computer‐aided diagnosis (CAD) systems on breast ultrasound (BUS) aim to increase the efficiency and effectiveness of breast screening, helping specialists to detect and classify breast lesions. CAD system development requires a set of annotated images, including lesion segmentation, biopsy results to specify benign and malignant cases, and BI‐RADS categories to indicate the likelihood of malignancy. In addition, standardized partitions of training, validation, and test sets promote reproducibility and fair comparisons between different approaches. We therefore present a publicly available BUS dataset whose novelty is a substantial increase in the number of cases with the above‐mentioned annotations and the inclusion of standardized partitions to objectively assess and compare CAD systems.
Acquisition and Validation Methods
The BUS dataset comprises 1875 anonymized images from 1064 female patients acquired via four ultrasound scanners during systematic studies at the National Institute of Cancer (Rio de Janeiro, Brazil). The dataset includes biopsy‐proven tumors divided into 722 benign and 342 malignant cases. A senior ultrasonographer performed a BI‐RADS assessment in categories 2 to 5 and manually outlined the breast lesions to obtain ground truth segmentations. Furthermore, 5‐ and 10‐fold cross‐validation partitions are provided to standardize the training and test sets used to evaluate and reproduce CAD systems. Finally, to validate the utility of the BUS dataset, an evaluation framework is implemented to assess the performance of deep neural networks for segmenting and classifying breast lesions.
Data Format and Usage Notes
The BUS dataset is publicly available for academic and research purposes through an open‐access repository under the name BUS‐BRA: A Breast Ultrasound Dataset for Assessing CAD Systems. BUS images and reference segmentations are saved as Portable Network Graphics (PNG) files, and the dataset information is stored in separate Comma‐Separated Value (CSV) files.
Potential Applications
The BUS‐BRA dataset can be used to develop and assess artificial intelligence‐based lesion detection and segmentation methods, and to classify BUS images into pathological classes and BI‐RADS categories. Other potential applications include developing image processing methods, such as despeckle filtering and contrast enhancement to improve image quality, and feature engineering for image description.
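As an illustration of how predefined cross‐validation partitions stored in a CSV file might be consumed, the following is a minimal sketch and not the dataset's actual interface. The column names (image_id, pathology, fold), the toy rows, and the helper functions load_folds and iter_splits are all hypothetical; the real BUS‐BRA CSV schema should be checked in the repository before adapting this.

```python
import csv
import io
from collections import defaultdict

# Toy stand-in for a dataset information CSV. The column names
# ("image_id", "pathology", "fold") and the rows are hypothetical;
# they are NOT the real BUS-BRA schema.
TOY_CSV = """image_id,pathology,fold
img_001,benign,1
img_002,malignant,1
img_003,benign,2
img_004,benign,2
img_005,malignant,3
img_006,benign,3
"""


def load_folds(csv_text):
    """Group image IDs by their predefined fold number."""
    folds = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        folds[int(row["fold"])].append(row["image_id"])
    return dict(folds)


def iter_splits(folds):
    """Yield (held_out_fold, train_ids, test_ids) for each predefined fold."""
    for k in sorted(folds):
        test_ids = list(folds[k])
        # Every image outside the held-out fold goes to training.
        train_ids = [i for j in sorted(folds) if j != k for i in folds[j]]
        yield k, train_ids, test_ids


if __name__ == "__main__":
    for k, train_ids, test_ids in iter_splits(load_folds(TOY_CSV)):
        print(k, len(train_ids), len(test_ids))
```

Reading the fold assignment from the file, rather than re‐splitting at random, is what keeps results comparable across studies: every method is trained and tested on the same images in each fold.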
Cited by
4 articles.