Authors:
Vardan Gyurjyan, David Abbott, Michael Goodrich, Graham Heyes, Ed Jastrzembski, David Lawrence, Benjamin Raydo, Carl Timmer
Abstract
As the volume and complexity of data generated at high-energy and nuclear physics research facilities grow exponentially, new strategies are needed to process this data in real time or near-real time. With demand for high-performance computing surging, it is essential to reassess how well current data processing architectures can integrate new technologies and handle streaming data. This paper introduces the ERSAP framework, which combines flow-based programming with the reactive actor model to support distributed, reactive, high-performance data-stream processing applications. We also present a novel algorithm for time-based clustering and event identification in data streams. The approach is demonstrated with data-stream processing results from recent beam tests of the EIC prototype calorimeter at DESY.