Affiliations:
1. CIEMAT, Madrid, Spain
2. GRNET, Athens, Greece
3. Universidade da Coruña, A Coruña, Spain
4. Universidade de Vigo, Vigo, Spain
Abstract
Computational workloads are becoming increasingly demanding, driven by the large pool of resources now available. This demand must be met with both computational efficiency and resilience, which are compromised on distributed and heterogeneous platforms. In addition, the data obtained are often either reused by other researchers or recalculated. This work presents a set of tools that addresses the problem of creating and executing fault-tolerant distributed applications in dynamic environments. The tool set also ensures the reproducibility of the experiments performed by providing a portable, unattended and resilient framework that hides infrastructure-dependent operations from application developers and users, and that enables experiments based on Open Access data repositories. In this way, users can seamlessly search for and later access datasets, which can be automatically retrieved as input data for a code already integrated in the proposed workflow. The search is based on metadata standards and relies on Persistent Identifiers (PIDs) to assign specific repositories. The applications build on Distributed Toolbox, a framework devoted to the creation and execution of distributed applications that includes tools for unattended cluster and grid execution with total fault tolerance. By decoupling the definition of the remote tasks from their execution and control, the development, execution and maintenance of distributed applications are significantly simplified with respect to previous solutions, which increases their robustness and allows them to run on different computational platforms with little effort. The integration with Open Access databases and the use of PIDs as long-lasting references ensure that the data related to the experiments will persist, closing a complete research cycle of data access, processing, storage and dissemination of results.
Subject
Computer Networks and Communications