Are Transformers Effective for Time Series Forecasting?

Published:2023-06-26 Issue:9 Volume:37 Page:11121-11128
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Zeng Ailing,Chen Muxi,Zhang Lei,Xu Qiang

Abstract

Recently, there has been a surge of Transformer-based solutions for the long-term time series forecasting (LTSF) task. Despite the growing performance over the past few years, we question the validity of this line of research in this work. Specifically, the Transformer is arguably the most successful solution for extracting the semantic correlations among the elements in a long sequence. However, in time series modeling, we are to extract the temporal relations in an ordered set of continuous points. While employing positional encoding and using tokens to embed sub-series in Transformers facilitate preserving some ordering information, the nature of the permutation-invariant self-attention mechanism inevitably results in temporal information loss. To validate our claim, we introduce a set of embarrassingly simple one-layer linear models named LTSF-Linear for comparison. Experimental results on nine real-life datasets show that LTSF-Linear surprisingly outperforms existing sophisticated Transformer-based LTSF models in all cases, and often by a large margin. Moreover, we conduct comprehensive empirical studies to explore the impacts of various design elements of LTSF models on their temporal relation extraction capability. We hope this surprising finding opens up new research directions for the LTSF task. We also advocate revisiting the validity of Transformer-based solutions for other time series analysis tasks (e.g., anomaly detection) in the future.
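
The abstract describes LTSF-Linear only as a set of "one-layer linear models". As a rough illustration of that idea, and not the authors' implementation, the sketch below maps a fixed lookback window directly to every step of the forecast horizon with a single linear layer. The least-squares fit, the function names (`make_windows`, `fit_linear`, `predict`), the window sizes, and the sine-wave toy data are all assumptions made for this example.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Slice a 1-D series into (past window, future window) training pairs."""
    n = len(series) - lookback - horizon + 1
    X = np.stack([series[i:i + lookback] for i in range(n)])
    Y = np.stack([series[i + lookback:i + lookback + horizon] for i in range(n)])
    return X, Y

def fit_linear(X, Y):
    """Fit one linear layer (weights plus bias) by ordinary least squares."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W  # shape: (lookback + 1, horizon)

def predict(W, window):
    """Map one lookback window to all horizon steps in a single product."""
    return np.append(window, 1.0) @ W

# Toy data: a noiseless sine wave (illustrative only, not from the paper).
t = np.arange(400)
series = np.sin(2 * np.pi * t / 24)

# Train on the first 300 points, then forecast 12 steps from a held-out window.
X, Y = make_windows(series[:300], lookback=48, horizon=12)
W = fit_linear(X, Y)
forecast = predict(W, series[300:348])
error = np.max(np.abs(forecast - series[348:360]))
```

The whole model is one weight matrix applied along the time axis, so the order of the input points is preserved by construction: each lookback position has its own weight. The toy series is easy for any linear predictor, so this shows only the mechanics and says nothing about performance on the paper's benchmarks.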

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 353 articles.

1. TFformer: A time–frequency domain bidirectional sequence-level attention based transformer for interpretable long-term sequence forecasting;Pattern Recognition;2025-02

2. Dynamic convolutional time series forecasting based on adaptive temporal bilateral filtering;Pattern Recognition;2025-02

3. Data-driven stock forecasting models based on neural networks: A review;Information Fusion;2025-01

4. NPFormer: Interpretable rotating machinery fault diagnosis architecture design under heavy noise operating scenarios;Mechanical Systems and Signal Processing;2025-01

5. STELLM: Spatio-temporal enhanced pre-trained large language model for wind speed forecasting;Applied Energy;2024-12