Selection of breast features for young women in northwestern China based on the random forest algorithm

Author:

Zhou Jie (1), Mao Qian (1), Zhang Jun (2), Lau Newman ML (2), Chen Jianming (3)

Affiliation:

1. School of Apparel and Art Design, Xi'an Polytechnic University, China

2. School of Design, The Hong Kong Polytechnic University, Hong Kong

3. Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong

Abstract

Studies of breast morphology measure numerous breast features, yet only a few parameters are adopted for classification. How to extract the key variables from these multi-dimensional features in a rational way is therefore a question of active interest. This study aimed to simplify dimensionality reduction and thereby improve the objectivity and interpretability of the selected breast features. Because the random forest (RF) algorithm quantifies feature importance during training, it was adopted to determine the optimal breast features for classification and recognition. First, anthropometric data of 360 females from northwestern China aged 19 to 27 years were collected using non-contact three-dimensional body scanning and contact manual measurement. Then, k-means clustering was applied to categorize breast shapes, and the RF algorithm was used to quantify and rank the importance of 25 breast features. Finally, to verify the effectiveness of the RF algorithm for breast feature selection, t-distributed stochastic neighbor embedding was used to visualize the distribution of breast shape clusters in two dimensions, and four neural networks were used to recognize breast morphology. The results demonstrate that using fewer breast features can effectively increase the accuracy of breast shape classification and recognition. The best performance is obtained with 13 breast features, for which the average Hamming loss of the four neural networks is the smallest (0.1136). The bust circumference and the horizontal curve of the breasts across the bust points are found to be the most important of the 25 breast features. The importance of the breast curve features is higher than that of the breast cross-sectional features, while the breast positioning features have the lowest importance. The RF algorithm is also verified to be more effective than traditional dimensionality reduction methods such as principal component analysis, hierarchical clustering, and recursive feature elimination. The approach developed in this paper can be generalized to dimensionality reduction for other aspects of body morphology.
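
The pipeline described in the abstract (k-means clustering, RF importance ranking, selection of the top 13 features, and scoring by Hamming loss) can be sketched roughly as follows. This is only an illustrative sketch using scikit-learn, not the authors' implementation. The data are random placeholders rather than real measurements, and the number of clusters, the forest size, and the single small `MLPClassifier` standing in for the four neural networks are all assumptions not stated in the abstract. The t-SNE visualization step is omitted.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import hamming_loss
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Placeholder data: 360 subjects x 25 features (random, NOT real measurements).
rng = np.random.default_rng(0)
n_subjects, n_features, n_selected = 360, 25, 13
X = rng.normal(size=(n_subjects, n_features))
X_scaled = StandardScaler().fit_transform(X)

# Step 1: k-means assigns each subject to a shape cluster.
# ASSUMPTION: 4 clusters; the abstract does not give the number.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_scaled)

# Step 2: a random forest trained on the cluster labels scores each feature.
# ASSUMPTION: 200 trees with default settings.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_scaled, labels)
ranking = np.argsort(forest.feature_importances_)[::-1]  # most important first
top_features = ranking[:n_selected]

# Step 3: train a classifier on the selected features and score it.
# ASSUMPTION: one small MLP stands in for the paper's four neural networks.
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled[:, top_features], labels,
    test_size=0.25, random_state=0, stratify=labels,
)
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
net.fit(X_train, y_train)

# For single-label multiclass targets, Hamming loss is the misclassified fraction.
loss = hamming_loss(y_test, net.predict(X_test))
print("selected feature indices:", sorted(top_features.tolist()))
print("Hamming loss on held-out subjects:", round(loss, 4))
```

Repeating step 3 for different values of `n_selected` and comparing the resulting losses would mirror the way the abstract identifies 13 features as the best-performing subset. The numbers produced on placeholder data carry no meaning.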

Funder

Shaanxi Science and Technology Department International Science Technology Cooperation Funding

Publisher

SAGE Publications

Subject

Polymers and Plastics, Chemical Engineering (miscellaneous)
