TransNeural: An Enhanced-Transformer-Based Performance Pre-Validation Model for Split Learning Tasks
Authors:
Liu Guangyi1, Kang Mancong2, Zhu Yanhong1,3, Zheng Qingbi1,4, Zhu Maosheng5, Li Na1,4
Affiliations:
1. China Mobile Research Institute, Beijing 100053, China
2. School of Communications and Information Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
3. School of Electronics and Information Engineering, Beijing Jiaotong University, Beijing 100091, China
4. ZGC Institute of Ubiquitous-X Innovation and Application, Beijing 100191, China
5. China Mobile (Suzhou) Software Technology Co., Ltd., Suzhou 215163, China
Abstract
While digital twin networks (DTNs) can potentially estimate network strategy performance in pre-validation environments, they are still in their infancy for split learning (SL) tasks, facing challenges such as unknown non-i.i.d. data distributions, inaccurate channel states, and misreported resource availability across devices. To address these challenges, this paper proposes a TransNeural algorithm for the DTN pre-validation environment that estimates SL latency and convergence. First, the TransNeural algorithm integrates transformers to efficiently model data similarities across devices, since differing data distributions and the device participation sequence strongly influence SL training convergence. Second, it leverages a neural network to automatically capture the complex relationships of SL latency and convergence with data distributions, wireless and computing resources, dataset sizes, and training iterations. Deviations in user reports are also accounted for in the estimation process. Simulations show that the TransNeural algorithm improves latency estimation accuracy by 9.3% and convergence estimation accuracy by 22.4% compared to traditional equation-based methods.
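The abstract describes the TransNeural architecture only at a high level (a transformer that models cross-device data similarities feeding a neural-network regressor for latency and convergence). The sketch below is a minimal, hypothetical PyTorch rendering of that idea, not the authors' actual model: the class name `TransNeuralSketch`, the per-device feature layout, the pooling step, and all dimensions are assumptions for illustration.

```python
import torch
import torch.nn as nn

class TransNeuralSketch(nn.Module):
    """Hypothetical sketch of the abstract's idea: a transformer encoder over
    per-device reported features (data-distribution statistics, channel state,
    compute availability, dataset size), ordered by participation sequence,
    followed by an MLP head that regresses SL latency and convergence."""

    def __init__(self, feat_dim=16, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(feat_dim, d_model)
        enc_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_layers)
        self.head = nn.Sequential(
            nn.Linear(d_model, 64),
            nn.ReLU(),
            nn.Linear(64, 2),  # outputs: [estimated latency, convergence metric]
        )

    def forward(self, device_feats):
        # device_feats: (batch, num_devices, feat_dim), rows ordered by the
        # device participation sequence so attention can exploit that ordering
        h = self.encoder(self.embed(device_feats))
        return self.head(h.mean(dim=1))  # pool over devices, then regress

# Toy usage: one SL task with 8 participating devices, 16 reported features each
model = TransNeuralSketch()
x = torch.randn(1, 8, 16)
latency_est, convergence_est = model(x).squeeze(0)
```

Trained against measured latency and convergence from past SL runs, such a regressor could in principle absorb reporting deviations directly from data, which is the behavior the abstract attributes to the learned (rather than equation-based) estimator.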
Funder
National Key R&D Program of China