The Quest to solve the HL-LHC data access puzzle

Authors:

Espinal X., Jezequel S., Schulz M., Sciabà A., Vukotic I., Wuerthwein F.

Abstract

HL-LHC will confront the WLCG community with enormous data storage, management and access challenges. These are as much technical as economic. In the WLCG-DOMA Access working group, members of the experiments and site managers have explored different models for data access and storage strategies to reduce cost and complexity, taking into account the boundary conditions given by our community. Several of these scenarios, such as the Data Lake model and incremental improvements of the current computing model, have been evaluated quantitatively with respect to resource needs, costs and operational complexity. To understand these models in depth, traces of current data accesses have been analysed and the impact of new concepts has been simulated. In parallel, the required technologies were evaluated in testbed and production environments at small and large scale. We will give an overview of the activities and results of the working group, describe the models, and summarise the results of the technology evaluation, focusing on the impact of storage consolidation in the form of Data Lakes, where the use of streaming caches has emerged as a successful approach to reduce the impact of latency and bandwidth limitations. We will describe the experience with and evaluation of these approaches in different environments and usage scenarios. In addition, we will present the results of the analysis and modelling efforts based on data access traces of the experiments.

Publisher

EDP Sciences

Cited by 6 articles.

1. Development of the CMS detector for the CERN LHC Run 3;Journal of Instrumentation;2024-05-01

2. Experiences in deploying in-network data caches;EPJ Web of Conferences;2024

3. Predicting Resource Utilization Trends with Southern California Petabyte Scale Cache;EPJ Web of Conferences;2024

4. A case study of content delivery networks for the CMS experiment;EPJ Web of Conferences;2024

5. Effectiveness and predictability of in-network storage cache for Scientific Workflows;2023 International Conference on Computing, Networking and Communications (ICNC);2023-02-20
