Affiliation:
1. School of Computer Science and Engineering, University of New South Wales, Sydney, New South Wales, Australia
2. Defence Science and Technology Group, Information Sciences Division, Edinburgh, South Australia, Australia
Abstract
Human activity recognition is a well‐established research problem in ubiquitous computing. The growing dependence on smart devices in daily life allows us to investigate the wealth of sensor data produced by the multimodal sensors embedded in these devices. However, the raw sensor data are often unlabeled, and annotating this vast amount of data is a costly exercise that can also lead to privacy breaches. Self‐supervised learning approaches are at the forefront of learning semantic representations from unlabeled sensor data, including for human activity recognition tasks. As inferring human activity depends on multimodal sensors, modeling the differences and inter‐modality dependencies between sensing modalities is an important step. This paper proposes a novel self‐supervised learning approach, modality aware contrastive learning (MACL), for representation learning from multimodal sensor data. The approach uses different sensing modalities to create different views of an input signal, so the model learns representations by maximizing the similarity among the different sensing modalities of the same input signal. Extensive experiments were performed on four publicly available human activity recognition data sets to verify the effectiveness of the proposed MACL method. The experimental results show that MACL attains performance comparable to the baseline models for human activity recognition, and exceeds the performance of models using standard augmentation‐based transformation strategies.
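The core idea in the abstract, treating two sensing modalities of the same signal window as a positive pair and all other windows in the batch as negatives, can be sketched with an NT‐Xent‐style contrastive loss (as popularized by SimCLR). This is an illustrative reconstruction, not the paper's exact objective; the function name and temperature value are assumptions.

```python
import numpy as np

def nt_xent_loss(z_a, z_b, temperature=0.5):
    """Contrastive loss between two modality views (illustrative sketch).

    z_a, z_b: (N, D) embeddings of the same N signal windows produced by
    encoders for two different modalities (e.g., accelerometer vs. gyroscope).
    Positive pairs are (z_a[i], z_b[i]); every other embedding in the batch
    serves as a negative.
    """
    z = np.concatenate([z_a, z_b], axis=0)            # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-norm for cosine sim
    sim = z @ z.T / temperature                       # (2N, 2N) similarity matrix
    np.fill_diagonal(sim, -np.inf)                    # exclude self-similarity
    n = len(z_a)
    # The positive for row i is its counterpart from the other modality.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    # Cross-entropy of each row against its positive index.
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

Minimizing this loss pulls the two modality views of the same window together while pushing apart views of different windows, which is the stated mechanism for learning modality‐aware representations without labels.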
Funder
Defence Science and Technology Group