Author:
Nguyen Ha Q., Lam Khanh, Le Linh T., Pham Hieu H., Tran Dat Q., Nguyen Dung B., Le Dung D., Pham Chi M., Tong Hang T. T., Dinh Diep H., Do Cuong D., Doan Luu T., Nguyen Cuong N., Nguyen Binh T., Nguyen Que V., Hoang Au D., Phan Hien N., Nguyen Anh T., Ho Phuong H., Ngo Dat T., Nguyen Nghia T., Nguyen Nhan T., Dao Minh, Vu Van
Abstract
Most of the existing chest X-ray datasets include labels from a list of findings without specifying their locations on the radiographs. This limits the development of machine learning algorithms for the detection and localization of chest abnormalities. In this work, we describe a dataset of more than 100,000 chest X-ray scans that were retrospectively collected from two major hospitals in Vietnam. Out of this raw data, we release 18,000 images that were manually annotated by a total of 17 experienced radiologists with 22 local labels of rectangles surrounding abnormalities and 6 global labels of suspected diseases. The released dataset is divided into a training set of 15,000 and a test set of 3,000. Each scan in the training set was independently labeled by 3 radiologists, while each scan in the test set was labeled by the consensus of 5 radiologists. We designed and built a labeling platform for DICOM images to facilitate these annotation procedures. All images are made publicly available in DICOM format along with the labels of both the training set and the test set.
Publisher
Springer Science and Business Media LLC
Subject
Library and Information Sciences; Statistics, Probability and Uncertainty; Computer Science Applications; Education; Information Systems; Statistics and Probability
Cited by
115 articles.