Affiliation:
1. University of Guelph, Guelph, ON N1G 2W1, Canada.
Abstract
Closed-circuit television inspections of sewer condition deterioration, as required for proactive management, are expensive and hence limited to portions of a sewer network. The data mining approach presented herein is shown to be capable of unlocking information contained within inspection records, and it enhances the pipe inspection practices currently used in the wastewater industry. Predictive models developed using the random forests algorithm are found capable of predicting the condition of individual sewer pipes, so that the uninspected pipes with the greatest likelihood of being in a structurally defective condition state can be identified for future rounds of inspection. Complications posed by the class imbalance common in inspection datasets are overcome by first framing the classification task in a binary format (where pipes are in either good or bad structural condition) and then using the receiver-operating characteristic (ROC) curve to establish alternative cutoffs for the predicted class probability. In a case study application to the City of Guelph, Ontario, Canada, the random forests algorithm achieved a stratified test set false negative rate of 18%, a false positive rate of 27%, and an excellent area under the ROC curve of 0.81. The novel inclusion of condition information for pipes attached at the upstream or downstream manhole of an individual pipe enhances the predictive power for bad pipes, which represent the minority class of interest (reducing the false negative rate to 11%, reducing the false positive rate to 25%, and increasing the area under the ROC curve to 0.85). An area under the ROC curve >0.80 indicates that random forests are an “excellent” choice for predicting the condition of individual pipes in a sewer network.
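The cutoff step described in the abstract can be illustrated with a minimal sketch. It assumes a binary classifier has already produced a "bad condition" probability for each pipe; the labels and probabilities below are hypothetical, not data from the Guelph case study. The abstract does not state which rule was used to pick a point on the ROC curve, so Youden's J statistic (true positive rate minus false positive rate) stands in here as one common choice. The function names are likewise illustrative.

```python
from typing import List, Tuple

# Hypothetical, imbalanced example: 1 = bad structural condition, 0 = good.
# p_bad holds the predicted probability of the "bad" class for each pipe.
y_true: List[int] = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
p_bad: List[float] = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.45, 0.40, 0.70]


def rates_at_cutoff(y: List[int], p: List[float], cutoff: float) -> Tuple[float, float]:
    """Return (false negative rate, false positive rate) when every pipe
    with p >= cutoff is flagged as bad."""
    n_bad = sum(y)
    n_good = len(y) - n_bad
    fn = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi < cutoff)
    fp = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi >= cutoff)
    return fn / n_bad, fp / n_good


def best_cutoff(y: List[int], p: List[float]) -> float:
    """Pick the cutoff maximizing Youden's J = TPR - FPR.
    Assumption: the source does not name its selection rule."""
    def youden(c: float) -> float:
        fnr, fpr = rates_at_cutoff(y, p, c)
        return (1.0 - fnr) - fpr
    # Each distinct predicted probability is a candidate cutoff.
    return max(sorted(set(p)), key=youden)


def auc(y: List[int], p: List[float]) -> float:
    """Area under the ROC curve via pairwise ranking: the fraction of
    (bad, good) pairs in which the bad pipe scores higher (ties count half)."""
    bad = [pi for yi, pi in zip(y, p) if yi == 1]
    good = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if b > g else 0.5 if b == g else 0.0 for b in bad for g in good)
    return wins / (len(bad) * len(good))


if __name__ == "__main__":
    print("default 0.5 cutoff (FNR, FPR):", rates_at_cutoff(y_true, p_bad, 0.5))
    c = best_cutoff(y_true, p_bad)
    print("ROC-based cutoff:", c, "(FNR, FPR):", rates_at_cutoff(y_true, p_bad, c))
    print("AUC:", auc(y_true, p_bad))
```

On this toy data the default 0.5 cutoff misses one of the two bad pipes, whereas the lower ROC-based cutoff catches both at the cost of one false positive. This is the kind of trade-off that matters when the bad pipes are the minority class of interest.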
Publisher
Canadian Science Publishing
Subject
General Environmental Science, Civil and Structural Engineering
Cited by
65 articles.