Author:
Albin Sam, Attebury Garhan, Bloom Kenneth, Bockelman Brian, Lundstedt Carl, Shadura Oksana, Thiltges John
Abstract
The large data volumes expected from the High Luminosity LHC (HL-LHC) present challenges to existing paradigms and facilities for end-user data analysis. Modern cyberinfrastructure tools provide a diverse set of services that can be composed into a system giving physicists straightforward access to large computing resources with low barriers to entry. The Coffea-Casa analysis facility (AF) provides an environment in which end users can execute increasingly complex analyses, such as those demonstrated by the Analysis Grand Challenge (AGC), and captures the features that physicists will need for the HL-LHC.
We describe the development progress of the Coffea-Casa facility, highlighting its modularity and demonstrating that the facility software stack can be ported to and customized for other locations. The facility also supports batch systems while remaining Kubernetes-native. We present the evolved architecture of the facility, including the integration of advanced data delivery services (e.g. ServiceX) and the availability of data caching services (e.g. XCache) to end users of the facility. We also highlight the composability of modern cyberinfrastructure tools. To enable machine learning pipelines at Coffea-Casa analysis facilities, a set of industry ML solutions adopted for HEP columnar analysis was integrated on top of existing facility services. These services also give user workflows transparent access to GPUs available at a facility via inference servers, with Kubernetes as the enabling technology.