On Evaluating IoT Data Trust via Machine Learning

Published: 2023-09-12 Issue: 9 Volume: 15 Page: 309
ISSN: 1999-5903
Container-title: Future Internet
Language: en

Author:

Tadj Timothy¹, Arablouei Reza¹, Dedeoglu Volkan¹

Affiliation:

1. Data61, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Pullenvale, QLD 4069, Australia

Abstract

Data trust in IoT is crucial for safeguarding privacy, security, reliable decision-making, user acceptance, and complying with regulations. Various approaches based on supervised or unsupervised machine learning (ML) have recently been proposed for evaluating IoT data trust. However, assessing their real-world efficacy is hard mainly due to the lack of related publicly available datasets that can be used for benchmarking. Since obtaining such datasets is challenging, we propose a data synthesis method, called random walk infilling (RWI), to augment IoT time-series datasets by synthesizing untrustworthy data from existing trustworthy data. Thus, RWI enables us to create labeled datasets that can be used to develop and validate ML models for IoT data trust evaluation. We also extract new features from IoT time-series sensor data that effectively capture its autocorrelation as well as its cross-correlation with the data of the neighboring (peer) sensors. These features can be used to learn ML models for recognizing the trustworthiness of IoT sensor data. Equipped with our synthesized ground-truth-labeled datasets and informative correlation-based features, we conduct extensive experiments to critically examine various approaches to evaluating IoT data trust via ML. The results reveal that commonly used ML-based approaches to IoT data trust evaluation, which rely on unsupervised cluster analysis to assign trust labels to unlabeled data, perform poorly. This poor performance is due to the underlying assumption that clustering provides reliable labels for data trust, which is found to be untenable. The results also indicate that ML models, when trained on datasets augmented via RWI and using the proposed features, generalize well to unseen data and surpass existing related approaches. 
Moreover, we observe that a semi-supervised ML approach that requires only about 10% of the data labeled offers competitive performance while being practically more appealing compared to the fully supervised approaches. The related Python code and data are available online.
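
The abstract does not spell out how random walk infilling works, so the following is only a rough sketch of one plausible reading, not the authors' published code. It assumes that a segment of a trustworthy series is overwritten with a Gaussian random walk pinned to the original samples on either side, and that the overwritten samples are labeled untrustworthy. The names `random_walk_infill`, `correlation_features`, and `step_scale`, the choice of step size, and the use of lag-1 autocorrelation and zero-lag Pearson cross-correlation as features are all illustrative assumptions.

```python
import numpy as np


def random_walk_infill(series, start, length, step_scale=None, rng=None):
    """Overwrite series[start:start+length] with a pinned random walk.

    Hypothetical helper: returns (augmented, labels), where labels is 1 for
    synthesized (untrustworthy) samples and 0 for untouched original samples.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(series, dtype=float).copy()
    end = start + length
    if start < 1 or end >= len(x):
        raise ValueError("segment needs an original sample on both sides")
    if step_scale is None:
        # Assumption: match the step size to the series' own first differences.
        step_scale = np.std(np.diff(x))
    # Unconstrained walk of length + 1 steps starting from x[start - 1].
    walk = np.cumsum(rng.normal(0.0, step_scale, size=length + 1))
    # Linear correction so the final step lands exactly on x[end].
    frac = np.arange(1, length + 2) / (length + 1)
    target = x[end] - x[start - 1]
    bridge = walk + frac * (target - walk[-1])
    x[start:end] = x[start - 1] + bridge[:-1]
    labels = np.zeros(len(x), dtype=int)
    labels[start:end] = 1
    return x, labels


def correlation_features(x, peer, lag=1):
    """Return (lag autocorrelation of x, Pearson cross-correlation with peer)."""
    x = np.asarray(x, dtype=float)
    peer = np.asarray(peer, dtype=float)
    auto = np.corrcoef(x[:-lag], x[lag:])[0, 1]
    cross = np.corrcoef(x, peer)[0, 1]
    return auto, cross


# Toy usage: a clean sinusoidal "sensor" and a noisy neighboring sensor.
rng = np.random.default_rng(0)
t = np.arange(200)
clean = np.sin(2 * np.pi * t / 50)
peer = clean + 0.05 * rng.normal(size=t.size)

augmented, labels = random_walk_infill(clean, start=80, length=40, rng=rng)
clean_auto, clean_cross = correlation_features(clean, peer)
aug_auto, aug_cross = correlation_features(augmented[80:120], peer[80:120])
```

Pinning both ends keeps the synthesized segment continuous with its surroundings, so a classifier cannot rely on a jump at the boundary and has to use correlation structure instead. The resulting `labels` array is the kind of ground truth that a supervised or semi-supervised model could be trained against.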

Publisher

MDPI AG

Subject

Computer Networks and Communications

Link

https://www.mdpi.com/1999-5903/15/9/309/pdf

Cited by 1 article.

1. Machine Learning for Data Trust Evaluations in Blockchain-Enabled IoT Systems;2024 IEEE International Conference on Blockchain and Cryptocurrency (ICBC);2024-05-27