Detecting Outliers in Cardiopulmonary Exercise Testing Data of Ski Racers – A Comparison of Methods and their Effect on the Performance of Fatigue Prediction
Author:
Baumgartner N.1, Kranzinger C.1, Kranzinger S.1, Snyder C.2,3, Stöggl T.2,3, Resch B.4,5
Affiliation:
1. Salzburg Research Forschungsgesellschaft mbH, Salzburg, Austria
2. Red Bull Athlete Performance Center, Thalgau, Austria
3. Department of Sport and Exercise Science, University of Salzburg, Austria
4. Department of Geoinformatics Z_GIS, University of Salzburg, Austria
5. Center for Geographic Analysis, Harvard University, Cambridge MA, USA
Abstract
In sports science, cardiopulmonary data are used to assess the exercise intensity, performance, and health status of athletes and to derive relevant target values. However, sensors may produce flawed data, and the data may contain a wide variety of artifacts that could lead to false conclusions. Appropriate, customized pre-processing algorithms are therefore a vital prerequisite for reliable and valid analysis results. To find adequate outlier detection methods for this type of data, we compared three algorithms by applying them to seven ergospirometric measures of junior ski racing athletes, and then applied a model that predicts fatigue during skiing from the pre-processed data. Values outside a realistic range were consistently labelled as outliers by all methods, and mean values and standard deviations changed in similar ways. The methods differed, however, in how they handled changing trends, recurring patterns, and subsequent outliers. Decomposing the sensor data into components (trend, seasonality, remainder) before dealing with outliers increased average predictive performance the most. Nevertheless, pre-processing remarkably improved prediction results for some study participants and not for others. Handling outliers correctly before deriving information from ergospirometric data is therefore recommended, but more research is needed to find methods that achieve more consistent improvement.
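The decomposition idea mentioned in the abstract can be illustrated with a short sketch. The abstract does not state which decomposition or outlier rule the authors used, so everything below is an assumption made for illustration: an additive model, a known fixed period, a rolling-median trend, a per-phase median for seasonality, and a median-absolute-deviation (MAD) threshold on the remainder. The function names `decompose`, `flag_outliers`, and `replace_outliers`, and the parameters `period` and `k`, are hypothetical.

```python
from statistics import median


def decompose(series, period):
    """Additive split into trend, seasonal and remainder (illustrative only)."""
    n = len(series)
    # Trend: centered rolling median over 2 * period + 1 samples
    # (truncated at the edges), which is robust to isolated spikes.
    trend = [median(series[max(0, i - period): min(n, i + period + 1)])
             for i in range(n)]
    detrended = [x - t for x, t in zip(series, trend)]
    # Seasonality: median of the detrended values at each phase of the cycle.
    cycle = [median(detrended[p::period]) for p in range(period)]
    seasonal = [cycle[i % period] for i in range(n)]
    remainder = [d - s for d, s in zip(detrended, seasonal)]
    return trend, seasonal, remainder


def flag_outliers(series, period, k=3.0):
    """Flag samples whose remainder lies more than k robust SDs from the median."""
    _, _, remainder = decompose(series, period)
    center = median(remainder)
    mad = median([abs(r - center) for r in remainder])
    # 1.4826 * MAD estimates the standard deviation for normal data;
    # the small floor avoids division by zero on perfectly regular input.
    scale = max(1.4826 * mad, 1e-9)
    return [abs(r - center) / scale > k for r in remainder]


def replace_outliers(series, period, k=3.0):
    """Replace flagged samples with trend + seasonal (assumed imputation rule)."""
    trend, seasonal, _ = decompose(series, period)
    flags = flag_outliers(series, period, k)
    return [t + s if f else x
            for x, t, s, f in zip(series, trend, seasonal, flags)]


# Hypothetical signal: a repeating pattern with one injected spike.
signal = [10, 12, 14, 12] * 6
signal[9] = 60
flagged = [i for i, f in enumerate(flag_outliers(signal, period=4)) if f]
print(flagged)  # -> [9]
```

Working on the remainder rather than the raw signal means a value is judged against what the trend and recurring pattern predict at that time point, not against a global range. This is one plausible reason such methods behave differently from simple thresholding on data with changing trends.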
Publisher
Walter de Gruyter GmbH
Subject
Biomedical Engineering, General Computer Science