Multi-Label Classification in Anime Illustrations Based on Hierarchical Attribute Relationships
Authors:
Lan Ziwen 1, Maeda Keisuke 2, Ogawa Takahiro 2, Haseyama Miki 2
Affiliations:
1. Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
2. Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
Abstract
In this paper, we propose a hierarchical multi-modal multi-label attribute classification model for anime illustrations using a graph convolutional network (GCN). We focus on the challenging task of multi-label attribute classification, which requires capturing the subtle features that creators of anime illustrations intentionally highlight. To address the hierarchical nature of these attributes, we use hierarchical clustering and hierarchical label assignment to organize the attribute information into a hierarchical feature. The proposed GCN-based model exploits this hierarchical feature to achieve high accuracy in multi-label attribute classification. The contributions of the proposed method are as follows. First, we introduce a GCN to the multi-label attribute classification task for anime illustrations, enabling more comprehensive relationships between attributes to be captured from their co-occurrence. Second, we capture subordinate relationships among the attributes by adopting hierarchical clustering and hierarchical label assignment. Finally, we construct a hierarchical structure of the attributes that appear most frequently in anime illustrations, based on rules derived from previous studies, which helps to reflect the relationships between different attributes. Experimental results on multiple datasets, obtained by comparison with several existing methods including the state-of-the-art method, show that the proposed method is effective and extensible.
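The abstract does not give the model's exact architecture, but the idea of propagating attribute information over a co-occurrence graph with a GCN can be sketched generically. The following pure-Python toy example (all attribute names, matrices, and values are illustrative assumptions, not taken from the paper) shows one GCN layer, H' = ReLU(ÂHW) with Â = D^(-1/2)(A + I)D^(-1/2), applied to embeddings of three hypothetical attributes linked by co-occurrence:

```python
# Toy sketch of one GCN propagation step over an attribute
# co-occurrence graph. Not the paper's model; purely illustrative.

def matmul(A, B):
    """Naive dense matrix multiplication."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def normalize_adj(A):
    """Symmetric normalization: D^(-1/2) (A + I) D^(-1/2)."""
    n = len(A)
    A_hat = [[A[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in A_hat]
    return [[A_hat[i][j] / (deg[i] ** 0.5 * deg[j] ** 0.5)
             for j in range(n)] for i in range(n)]

def gcn_layer(A_norm, H, W):
    """H' = ReLU(A_norm @ H @ W)."""
    Z = matmul(matmul(A_norm, H), W)
    return [[max(0.0, v) for v in row] for row in Z]

# Hypothetical co-occurrence graph over 3 attributes,
# e.g. "hair", "blonde hair", "smile".
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
H = [[1.0, 0.0],   # initial attribute embeddings (e.g. word vectors)
     [0.0, 1.0],
     [1.0, 1.0]]
W = [[0.5, 0.0],   # learnable layer weights (fixed here for the demo)
     [0.0, 0.5]]

H1 = gcn_layer(normalize_adj(A), H, W)  # updated attribute embeddings
```

In ML-GCN-style classifiers, the rows of the final attribute embeddings act as per-attribute classifiers: each attribute's score is the dot product of its embedding with the image feature vector, so related attributes (here, co-occurring ones) end up with correlated classifiers.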
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry