Affiliation:
1. TSYS School of Computer Science, Turner College of Business, Columbus State University, Columbus, GA 31907, USA
Abstract
We have designed a real-world smart building energy fault detection (SBFD) system on a cloud-based Databricks workspace, a high-performance computing (HPC) environment for big-data-intensive applications powered by Apache Spark. By avoiding a Smart Building Diagnostics as a Service approach and keeping a tightly centralized design, we developed and deployed the cloud-based SBFD system within one calendar year. Using Databricks’ built-in scheduling interface, we implemented and deployed in the cloud a continuous pipeline of real-time ingestion, integration, cleaning, and analytics workflows capable of energy consumption prediction and anomaly detection. The system currently provides fault detection, in the form of energy consumption predictions and anomaly detection, for 96 buildings on an active military installation. The system’s various jobs all complete within 14 min on average. The system also enables seamless interaction between our workspace and a cloud data lake storage, which is provided for the secure and automated initial ingestion of raw data that a third party supplies via the Secure File Transfer Protocol (SFTP) and BLOB (Binary Large Objects) file system secure protocol drivers. Using PySpark, a Python binding to the Apache Spark distributed computing framework, we coded these actions into collaborative notebooks and chained them into the aforementioned pipeline. The pipeline was successfully managed and configured throughout the lifetime of the project and continues to meet our needs in deployment. In this paper, we outline the general architecture and how it differs from previous smart building diagnostics initiatives, present details of the underlying technology stack of our data pipeline, and enumerate some of the configuration steps required to develop and maintain this big data analytics application in the cloud.
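As a rough illustration of the chained clean, predict, and detect flow described above, the following is a minimal pure-Python sketch. It does not use PySpark or Databricks, and it is not the authors' implementation: the function names, the cleaning rule, the mean-based baseline, and the three-standard-deviation threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def clean(readings):
    """Drop missing or negative meter readings (hypothetical cleaning rule)."""
    return [r for r in readings if r is not None and r >= 0]


def predict(history):
    """Naive baseline (an assumption): predict the mean of past consumption."""
    return mean(history)


def is_anomaly(history, observed, threshold=3.0):
    """Flag a reading that lies more than `threshold` sample standard
    deviations from the prediction. Needs at least two historical values."""
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat history: any deviation counts as anomalous.
        return observed != predict(history)
    return abs(observed - predict(history)) / spread > threshold


def run_pipeline(raw_history, observed):
    """Chain the stages: clean the raw history, predict, then test the new reading."""
    history = clean(raw_history)
    return {
        "predicted": predict(history),
        "anomaly": is_anomaly(history, observed),
    }


if __name__ == "__main__":
    # Hypothetical hourly kWh readings for one building, with two bad values.
    raw = [10.0, None, 12.0, 11.0, -1.0, 9.0, 10.0, 11.0, 9.0, 10.0]
    print(run_pipeline(raw, observed=25.0))
    print(run_pipeline(raw, observed=10.5))
```

In a Spark-based deployment each stage would presumably operate on distributed DataFrames rather than Python lists, but the staging order (clean first, then predict, then compare the observation against the prediction) is the point of the sketch.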
Funder
Columbus State University