Distributed Online Multi-Label Learning with Privacy Protection in Internet of Things
Published: 2023-02-20
Issue: 4
Volume: 13
Page: 2713
ISSN: 2076-3417
Container-title: Applied Sciences
language: en
Short-container-title: Applied Sciences
Author:
Huang Fan 1, Yang Nan 1, Chen Huaming 1, Bao Wei 1, Yuan Dong 1
Affiliation:
1. School of Electrical and Information Engineering, The University of Sydney, Sydney, NSW 2008, Australia
Abstract
With the widespread use of end devices, online multi-label learning has become popular because the data generated by users of Internet of Things devices are large in volume and rapidly updated. However, in many scenarios, user data are generated in a geographically distributed manner, which makes centralizing them for training machine learning models inefficient and difficult. At the same time, current mainstream distributed learning algorithms require a centralized server to aggregate data from distributed nodes, which inevitably puts user privacy at risk. To overcome this issue, we propose a distributed approach for multi-label classification that trains models on distributed computing nodes without sharing the source data of any node. In the proposed method, each node trains its model on its local online data while also learning from its neighbour nodes, without transferring training data. As a result, the proposed method achieves online distributed multi-label classification without losing performance relative to existing centralized algorithms. Experiments show that our algorithm outperforms the centralized online multi-label classification algorithm in F1 score, being on average 0.0776 higher in macro F1 score and 0.1471 higher in micro F1 score. For the Hamming loss, each algorithm outperforms the other on some datasets, and our proposed algorithm is 0.005 worse than the centralized approach on average, a difference that can be neglected. Furthermore, the size of the network and its degree of connectivity do not affect the performance of this distributed online multi-label learning algorithm.
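The abstract reports results in three metrics: macro F1, micro F1, and Hamming loss. The sketch below shows how these are commonly computed for binary multi-label predictions, using the standard textbook definitions. It is not taken from the paper: the function name `multilabel_scores`, the 0/1 indicator-matrix input format, and the convention that an F1 with an empty denominator counts as 0 are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def _f1(tp: int, fp: int, fn: int) -> float:
    """F1 from counts; returns 0.0 when the denominator is empty (assumed convention)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_scores(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
) -> Dict[str, float]:
    """Compute macro F1, micro F1, and Hamming loss for 0/1 label matrices.

    Rows are samples and columns are labels (hypothetical input format).
    """
    n_samples = len(y_true)
    n_labels = len(y_true[0])
    tp = [0] * n_labels
    fp = [0] * n_labels
    fn = [0] * n_labels
    mismatches = 0

    for t_row, p_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(t_row, p_row)):
            if t == 1 and p == 1:
                tp[j] += 1
            elif t == 0 and p == 1:
                fp[j] += 1
            elif t == 1 and p == 0:
                fn[j] += 1
            if t != p:
                mismatches += 1

    # Macro F1: average the per-label F1 scores, so every label weighs equally.
    macro_f1 = sum(_f1(tp[j], fp[j], fn[j]) for j in range(n_labels)) / n_labels
    # Micro F1: pool the counts over all labels first, so frequent labels dominate.
    micro_f1 = _f1(sum(tp), sum(fp), sum(fn))
    # Hamming loss: fraction of individual label decisions that are wrong.
    hamming = mismatches / (n_samples * n_labels)

    return {"macro_f1": macro_f1, "micro_f1": micro_f1, "hamming_loss": hamming}
```

For F1, higher is better, while for Hamming loss, lower is better. This is why the abstract describes a higher F1 as an improvement and a higher Hamming loss as a small degradation.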
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science