Abstract
In the initial stages of operation of future tokamaks, data availability will be limited, so deploying data-driven disruption predictors requires optimal performance with minimal use of new-device data. This paper studies data utilization in data-driven disruption predictors during cross-tokamak deployment. Current predictors primarily employ supervised learning methods and require a large number of disruption and non-disruption shots for training. However, disruption shots on future tokamaks will be scarce and costly to obtain, which results in imbalanced training datasets and reduces the performance of supervised learning predictors. To solve this problem, we propose the Enhanced Convolutional Autoencoder Anomaly Detection (E-CAAD) predictor. E-CAAD can be trained using only non-disruption samples, and can additionally be trained on disruption precursor samples once disruption shots occur. This model overcomes both the sample imbalance faced by supervised learning predictors and the inefficient dataset utilization of traditional anomaly detection predictors, which cannot use disruption precursor samples for training. This makes it more suitable for the unpredictable datasets of future tokamaks. Compared with traditional anomaly detection predictors, the E-CAAD predictor performs better in disruption prediction and can be deployed faster on new devices. Additionally, we explore strategies to accelerate the deployment of the E-CAAD predictor on a new device by using data from existing devices. Two deployment strategies are presented: mixing in data from existing devices, and fine-tuning a predictor trained on existing devices. Our comparisons indicate that data from existing devices can accelerate the deployment of the predictor on a new device. Notably, the fine-tuning strategy yields the fastest deployment on a new device among the designed strategies.
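The core idea of anomaly detection trained only on non-disruption samples can be illustrated with a minimal sketch. This is not the E-CAAD model: it replaces the convolutional autoencoder with a rank-1 linear autoencoder (the first principal component, found by power iteration), uses synthetic three-channel data in place of tokamak diagnostics, and does not cover training on disruption precursor samples. All names (`fit_linear_autoencoder`, `reconstruction_error`, `is_anomalous`) and the max-error threshold rule are assumptions made for illustration.

```python
import math
import random


def fit_linear_autoencoder(samples, iters=50):
    """Fit a rank-1 linear autoencoder on normal samples only.

    Returns the per-feature mean and a unit vector w (the dominant
    principal direction), found by power iteration on the covariance.
    """
    n, d = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in samples]
    w = [1.0 / math.sqrt(d)] * d  # arbitrary non-zero starting direction
    for _ in range(iters):
        # Compute X^T (X w) without forming the covariance matrix.
        proj = [sum(r[j] * w[j] for j in range(d)) for r in centered]
        new_w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(v * v for v in new_w))
        w = [v / norm for v in new_w]
    return mean, w


def reconstruction_error(x, mean, w):
    """Encode x to a single scalar, decode it, and return the residual norm."""
    c = [xj - mj for xj, mj in zip(x, mean)]
    code = sum(cj * wj for cj, wj in zip(c, w))   # encoder
    recon = [code * wj for wj in w]               # decoder
    return math.sqrt(sum((cj - rj) ** 2 for cj, rj in zip(c, recon)))


# Synthetic "non-disruption" samples: three correlated channels plus small noise.
rng = random.Random(0)
normal = []
for _ in range(200):
    t = rng.gauss(0.0, 1.0)
    normal.append([t + rng.gauss(0, 0.05),
                   2 * t + rng.gauss(0, 0.05),
                   -t + rng.gauss(0, 0.05)])

mean, w = fit_linear_autoencoder(normal)

# Assumed threshold rule: the largest reconstruction error seen on normal data.
threshold = max(reconstruction_error(x, mean, w) for x in normal)


def is_anomalous(x):
    """Flag a sample whose reconstruction error exceeds the normal-data threshold."""
    return reconstruction_error(x, mean, w) > threshold


print(is_anomalous([0.5, 1.0, -0.5]))   # follows the learned channel correlation
print(is_anomalous([3.0, -3.0, 3.0]))   # breaks the learned channel correlation
```

Because the model only learns to reconstruct normal behaviour, samples that deviate from it reconstruct poorly and are flagged without any disruption shots in the training set. In practice the threshold would be tuned on validation data rather than set to the training maximum.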
Funder
National MCF Energy R&D Program of China
National Natural Science Foundation of China