1. Understanding the propagation of transient errors in HPC applications
2. Fault injection for formal testing of fault tolerance
3. Satish Balay, Shrirang Abhyankar, Mark F. Adams, Jed Brown, Peter Brune, Kris Buschelman, Lisandro Dalcin, Victor Eijkhout, William D. Gropp, Dinesh Kaushik, Matthew G. Knepley, Dave A. May, Lois Curfman McInnes, Karl Rupp, Barry F. Smith, Stefano Zampini, Hong Zhang, and Hong Zhang. 2017. PETSc Web page. http://www.mcs.anl.gov/petsc
4. Fault injection experiments using FIAT
5. Eduardo Berrocal, Leonardo Bautista-Gomez, Sheng Di, Zhiling Lan, and Franck Cappello. 2015. Lightweight Silent Data Corruption Detection Based on Runtime Data Analysis for HPC Applications. In HPDC. 275--278. 10.1145/2749246.2749253