Darshan for HEP applications


Wang Rui, Snyder Shane, Benjamin Douglas, Dong Zhihua, Gartung Patrick, Herner Kenneth


Modern HEP workflows must manage increasingly large and complex data collections. HPC facilities may be employed to help meet these workflows’ growing data processing needs. However, a better understanding of the I/O patterns and underlying bottlenecks of these workflows is necessary to meet the performance expectations of HPC systems. Darshan is a lightweight I/O characterization tool that captures concise views of HPC application I/O behavior. It intercepts application I/O calls at runtime, records file access statistics for each process, and generates log files detailing application I/O access patterns. Typical HEP workflows include event generation, detector simulation, event reconstruction, and subsequent analysis stages. This paper presents a Darshan-based study of the I/O behavior of the ATLAS simulation and filtering stage and of the CMS simulation workflow, including insights into their I/O operations and data access sizes.
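The interception idea described above can be illustrated with a toy sketch. This is not Darshan's implementation (Darshan wraps I/O library calls of the running application); it only shows the general pattern of wrapping I/O calls, counting operations and bytes, and binning access sizes. The names `TracedFile`, `size_bin`, and `demo` are hypothetical, and the bin labels are loosely modeled on the access-size ranges that Darshan reports for POSIX reads and writes.

```python
import io
from collections import Counter

# Upper bounds (bytes) and labels for access-size bins. Loosely modeled on
# Darshan's POSIX histogram ranges; the exact boundary handling here is an
# assumption made for illustration.
SIZE_BINS = [
    (100, "0_100"),
    (1024, "100_1K"),
    (10 * 1024, "1K_10K"),
    (100 * 1024, "10K_100K"),
    (1024 ** 2, "100K_1M"),
    (4 * 1024 ** 2, "1M_4M"),
    (10 * 1024 ** 2, "4M_10M"),
    (100 * 1024 ** 2, "10M_100M"),
    (1024 ** 3, "100M_1G"),
]


def size_bin(nbytes):
    """Return the label of the first bin whose upper bound covers nbytes."""
    for upper, label in SIZE_BINS:
        if nbytes <= upper:
            return label
    return "1G_PLUS"


class TracedFile:
    """Hypothetical wrapper that records access statistics for one file."""

    def __init__(self, raw):
        self._raw = raw
        self.stats = Counter()

    def read(self, size=-1):
        data = self._raw.read(size)
        # Record the bytes actually returned, not the bytes requested.
        self.stats["reads"] += 1
        self.stats["bytes_read"] += len(data)
        self.stats["size_read_" + size_bin(len(data))] += 1
        return data

    def write(self, data):
        written = self._raw.write(data)
        self.stats["writes"] += 1
        self.stats["bytes_written"] += written
        self.stats["size_write_" + size_bin(written)] += 1
        return written

    def seek(self, offset, whence=io.SEEK_SET):
        self.stats["seeks"] += 1
        return self._raw.seek(offset, whence)


def demo():
    """Run a few small accesses against an in-memory file and return the counters."""
    f = TracedFile(io.BytesIO())
    f.write(b"x" * 50)
    f.write(b"y" * 4096)
    f.seek(0)
    f.read(100)
    f.read(8192)  # only 4046 bytes remain, so this is a short read
    return dict(f.stats)
```

Calling `demo()` returns a dictionary of counters for the in-memory file, which is conceptually similar to the per-file records a characterization tool would write to its log at the end of a run.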


EDP Sciences








