Author:
Kozłowski Marek,Buczkowski Przemyslaw,Brzezinski Piotr
Abstract
AbstractThe industrialisation of the footwear recycling processes is a major issue in the European Union—particularly in view of the fact that at least 90% of shoes consumed in western economies are ultimately sent to landfill. This requires new AI-empowered technologies that enable detection, classification, pairing, and quality assessment in a viable automatic process. This article discusses automatic shoe pairing, which comprises two sequential stages: a) deep multiview shoe embedding (compact representation of multiview data); and b) clustering of shoes’ embeddings with a fixed similarity threshold to return sets of possible pairs. Each shoe in our pipeline is represented by multiple images that are collected in industrial darkrooms. We present various approaches to shoe pairing—from fully unsupervised ones based on image descriptors to supervised ones that rely on deep neural networks—to identify the most effective one for this highly specific industrial task. The article also explains how the selected method can be improved by hyperparameter tuning, massive increases in training data, and data augmentation.
Publisher
Springer Nature Switzerland
Reference15 articles.
1. Brunet, D., Vrscay, E.R., Wang, Z.: On the mathematical properties of the structural similarity index. IEEE Trans. Image Process. 21(4), 1488–1499 (2011)
2. Chao, G., Sun, S., Bi, J.: A survey on multiview clustering. IEEE Trans. Artif. Intell. 2(2), 146–168 (2021)
3. Chi, M., Zhang, P., Zhao, Y., Feng, R., Xue, X.: Web image retrieval reranking with multi-view clustering. In: Proceedings of the 18th International Conference on World Wide Web, pp. 1189–1190 (2009)
4. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), vol. 1, pp. 886–893. IEEE (2005)
5. Djelouah, A., Franco, J.S., Boyer, E., Le Clerc, F., Pérez, P.: Multi-view object segmentation in space and time. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 2640–2647 (2013)