1. On the opportunities and risks of foundation models;Bommasani,2021
2. BERT: pre-training of deep bidirectional transformers for language understanding;Devlin,2019
3. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension;Lewis,2019
4. Language models are few-shot learners;Brown,2020
5. LLaMA: open and efficient foundation language models;Touvron,2023