Affiliation:
1. Technische Universität Berlin, Berlin, Germany
2. Observe Inc., San Mateo, USA
3. Technische Universität Berlin, DFKI GmbH, Berlin, Germany
Abstract
Today's IoT applications exploit the capabilities of three different computation environments: sensors, edge, and cloud. Ensuring fault tolerance at the edge level presents unique challenges due to complex network hierarchies and the presence of resource-constrained computing devices. In contrast to the Cloud, the Edge lacks high availability standards and a persistent upstream backup. To ensure reliability, fault tolerance mechanisms have to be deployed on the edge devices along with processing operators competing for available resources. However, existing operator placement strategies are not aware of fault tolerance resource requirements, and existing fault tolerance approaches are not aware of available resources. This miscommunication in resource-constrained environments like the Edge leads to underprovisioning and failures. In this paper, we present a resource-aware fault-tolerance approach that takes the unique characteristics of the Edge into account to provide reliable stream processing. To this end, we model fault tolerance as an operator placement problem that uses multi-objective optimization to decide where to back up data. As opposed to existing approaches that treat operator placement and fault tolerance as two separate steps, we combine them and showcase that this is especially important for low-end edge devices. Overall, our approach effectively mitigates potential failures and outperforms state-of-the-art fault tolerance approaches by up to an order of magnitude in throughput.
Publisher
Association for Computing Machinery (ACM)