An insight into racial bias in dermoscopy repositories: A HAM10000 data set analysis

Author:

Morales‐Forero Andres (1), Rueda Jaime Lili (2), Gil‐Quiñones Sebastian Ramiro (2, 3), Barrera Montañez Marlon Y. (3), Bassetto Samuel (1), Coatanea Eric (4)

Affiliation:

1. Mathematics and Industrial Engineering (MAGI) Department, Université de Montréal (Polytechnique), Montreal, Canada

2. Infectious and Clinical Dermatology Research Group, Universidad El Bosque, Bogotá, Colombia

3. Dermatology Resident, Universidad El Bosque, Bogotá, Colombia

4. Faculty of Engineering and Natural Sciences (ENS), Tampere University, Tampere, Finland

Abstract

Background: Studies have revealed a lack of representation of skin of colour patients in academic sources of dermatologic diseases, including databases. This visual racism has left specialists less comfortable and confident in caring for these patients and has reduced the patients' chances of being correctly diagnosed.

Objectives: To investigate and uncover potential racial biases in the HAM10000 data set through an exploratory analysis of the representation of dark skin tones, the identification of inaccuracies in its documentation, the recognition of relevant skin conditions that are absent for darker skin, and the lack of ethnic diversity variables crucial for validating diagnosis across different skin tones.

Methods: An exploratory examination was conducted to investigate the occurrence of dark skin within the HAM10000 database (housed in a Harvard Dataverse repository), which consists of 10,015 dermoscopic images of skin lesions. A visual depiction of the full range of skin tones was generated by sampling four crucial data points from each image and applying the Gray World Algorithm for colour normalization. To confirm the accuracy of the graphical representation, dermatologists validated the pixel sampling process by analysing a randomly selected 10% of the images for each type of skin lesion. This visual representation was produced for the entire data set as well as for each skin lesion type. The study was further enhanced by comparing the skin lesion representation within the HAM10000 data set against documented prevalences of relevant conditions affecting dark skin.

Results: Less than 5% of the images came from dark‐skinned patients. Nevertheless, in about 4.9% of cases, our pixel sampling method might inadvertently capture shadows or dark spots resulting from the imaging device or the lesion itself rather than the individual's actual skin tone. In addition, there are inaccuracies in the data set's claims of diversity and comprehensive coverage, notably the underrepresentation of conditions prevalent in darker skin and the absence of ethnic diversity variables.

Conclusions: Visual racism is an issue that needs to be addressed in medical sources of information and education. Image databases and artificial intelligence models need to be supplied with information covering all skin types to guarantee equal access to opportunities. Furthermore, any instances where conditions affecting people of colour are underrepresented must be meticulously documented and reported to highlight and address these disparities effectively. This is particularly important in dermoscopy imaging, where an analysis of racial bias that relies solely on images is limited: the dermatoscope's lighting alters the patient's actual skin tone, which complicates the accurate assessment of racial bias.
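The colour normalization and pixel sampling steps described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say where the four data points are taken from, so sampling near the image corners, the `margin` parameter, and the function names `gray_world` and `sample_four_points` are all assumptions made for illustration. The image is assumed to be an 8‐bit RGB NumPy array.

```python
import numpy as np


def gray_world(image: np.ndarray) -> np.ndarray:
    """Gray World normalization: rescale each channel so that its mean
    equals the mean over all three channels."""
    img = image.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gray = channel_means.mean()
    # Avoid division by zero if a channel is entirely black.
    scale = gray / np.maximum(channel_means, 1e-12)
    balanced = np.rint(img * scale)
    return np.clip(balanced, 0, 255).astype(np.uint8)


def sample_four_points(image: np.ndarray, margin: float = 0.1) -> np.ndarray:
    """Return the RGB values at four points near the image corners.

    ASSUMPTION: corner sampling is a hypothetical choice; the abstract
    only states that four data points are sampled from each image.
    """
    h, w = image.shape[:2]
    dy, dx = int(h * margin), int(w * margin)
    points = [
        (dy, dx),
        (dy, w - 1 - dx),
        (h - 1 - dy, dx),
        (h - 1 - dy, w - 1 - dx),
    ]
    return np.array([image[y, x] for y, x in points])


if __name__ == "__main__":
    # Synthetic image with a uniform colour cast, used only as a demo input.
    demo = np.zeros((10, 10, 3), dtype=np.uint8)
    demo[..., 0], demo[..., 1], demo[..., 2] = 100, 50, 150
    normalized = gray_world(demo)
    print(sample_four_points(normalized))
```

As the Results note, points sampled this way can land on shadows or on the lesion itself, so the sampled values are only a rough proxy for the patient's skin tone.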

Publisher

Wiley
