CERN Disk Storage Services: Report from last data taking, evolution and future outlook towards Exabyte-scale storage

Author:

Mascetti Luca, Arsuaga Rios Maria, Bocchi Enrico, Calado Vicente Joao, Chan Kwok Cheong Belinda, Castro Diogo, Collet Julien, Contescu Cristian, Gonzalez Labrador Hugo, Iven Jan, Lamanna Massimo, Lo Presti Giuseppe, Mouratidis Theofilos, Mościcki Jakub T., Musset Paul, Pelletier Remy, Valverde Cameselle Roberto, Van Der Ster Daniel

Abstract

The CERN IT Storage group operates multiple distributed storage systems to support all CERN data storage requirements: the physics data generated by LHC and non-LHC experiments; object and file storage for infrastructure services; block storage for the CERN cloud system; filesystems for general use and specialized HPC clusters; a content distribution filesystem for software distribution and condition databases; and sync-and-share cloud storage for end-user files. The total integrated capacity of these systems exceeds 0.6 Exabyte.

Large-scale experiment data taking has been supported by EOS and CASTOR for the last 10+ years. Particular highlights for 2018 include the special Heavy-Ion run, which was the last part of the LHC Run 2 programme: the IT storage systems sustained over 10 GB/s to flawlessly collect and archive more than 13 PB of data in a single month. While tape archival continues to be handled by CASTOR, the effort to migrate the current experiment workflows to the new CERN Tape Archive system (CTA) is underway.

The Ceph infrastructure has operated for more than 5 years to provide block storage to the CERN IT private OpenStack cloud, a shared filesystem (CephFS) to HPC clusters, and NFS storage to replace commercial filers. An S3 service was introduced in 2018, following increased user requirements for S3-compatible object storage from physics experiments and IT use cases.

Since its introduction in 2014, CERNBox has become a ubiquitous cloud storage interface for all CERN user groups: physicists, engineers and administration. CERNBox provides easy access to multi-petabyte data stores from a multitude of mobile and desktop devices and all mainstream, modern operating systems (Linux, Windows, macOS, Android, iOS). It provides synchronized storage for end-user devices as well as easy sharing for individual users and e-groups. CERNBox has also become a storage platform that hosts online applications for processing the data, such as SWAN (Service for Web-based Analysis), as well as file editors such as Collabora Online, Only Office, Draw.IO and more. An increasing number of online applications in the Windows infrastructure use CIFS/SMB access to CERNBox files.

CVMFS provides software repositories for all experiments across the WLCG infrastructure and has recently been optimized to handle nightly builds efficiently. While AFS continues to provide a general-purpose filesystem for internal CERN users, especially as the $HOME login area on the central computing infrastructure, the migration of project and web spaces has advanced significantly.

In this paper, we report on the experiences from the last year of LHC Run 2 data taking and on the evolution of our services in the past year. We will highlight upcoming changes, future improvements and challenges.

Publisher

EDP Sciences


