Multi-Label Classification in Anime Illustrations Based on Hierarchical Attribute Relationships
Authors:
Lan Ziwen 1, Maeda Keisuke 2, Ogawa Takahiro 2, Haseyama Miki 2
Affiliations:
1. Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
2. Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
Abstract
In this paper, we propose a hierarchical multi-modal multi-label attribute classification model for anime illustrations using a graph convolutional network (GCN). We focus on the challenging task of multi-label attribute classification, which requires capturing the subtle features that creators of anime illustrations intentionally highlight. To address the hierarchical nature of these attributes, we use hierarchical clustering and hierarchical label assignment to organize the attribute information into a hierarchical feature. The proposed GCN-based model exploits this hierarchical feature to achieve high accuracy in multi-label attribute classification. The contributions of the proposed method are as follows. First, we introduce a GCN to the multi-label attribute classification task for anime illustrations, enabling more comprehensive relationships between attributes to be captured from their co-occurrence. Second, we capture subordinate relationships among the attributes by adopting hierarchical clustering and hierarchical label assignment. Finally, we construct a hierarchical structure of the attributes that appear most frequently in anime illustrations, based on rules derived from previous studies, which helps to reflect the relationships between different attributes. Experimental results on multiple datasets, obtained by comparison with several existing methods including the state-of-the-art method, show that the proposed method is effective and extensible.
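The abstract does not give the model's exact architecture, but the idea of propagating attribute information over a co-occurrence graph with a GCN can be sketched generically. The following pure-Python toy example (all attribute names, matrices, and values are illustrative assumptions, not taken from the paper) shows one GCN layer, H' = ReLU(ÂHW) with Â = D^(-1/2)(A + I)D^(-1/2), applied to embeddings of three hypothetical attributes linked by co-occurrence:

```python
# Toy sketch of one GCN propagation step over an attribute
# co-occurrence graph. Not the paper's model; purely illustrative.

def matmul(A, B):
    """Naive dense matrix multiplication."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def normalize_adj(A):
    """Symmetric normalization: D^(-1/2) (A + I) D^(-1/2)."""
    n = len(A)
    A_hat = [[A[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in A_hat]
    return [[A_hat[i][j] / (deg[i] ** 0.5 * deg[j] ** 0.5)
             for j in range(n)] for i in range(n)]

def gcn_layer(A_norm, H, W):
    """H' = ReLU(A_norm @ H @ W)."""
    Z = matmul(matmul(A_norm, H), W)
    return [[max(0.0, v) for v in row] for row in Z]

# Hypothetical co-occurrence graph over 3 attributes,
# e.g. "hair", "blonde hair", "smile".
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
H = [[1.0, 0.0],   # initial attribute embeddings (e.g. word vectors)
     [0.0, 1.0],
     [1.0, 1.0]]
W = [[0.5, 0.0],   # learnable layer weights (fixed here for the demo)
     [0.0, 0.5]]

H1 = gcn_layer(normalize_adj(A), H, W)  # updated attribute embeddings
```

In ML-GCN-style classifiers, the rows of the final attribute embeddings act as per-attribute classifiers: each attribute's score is the dot product of its embedding with the image feature vector, so related attributes (here, co-occurring ones) end up with correlated classifiers.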
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry