Author:
Tian Xuecheng,Wang Shuaian
Abstract
Port state control (PSC) is the last line of defense for substandard ships. During a PSC inspection, ship detention is the most severe result if the inspected ship is identified with critical deficiencies. Regarding the development of ship detention prediction models, this paper identifies two challenges: learning from imbalanced data and learning from unlabeled data. The first challenge, imbalanced data, arises from the fact that a minority of inspected ships were detained. The second challenge, unlabeled data, arises from the fact that in practice not all foreign visiting ships receive a formal PSC inspection, leading to a missing data problem. To address these two challenges, this paper adopts two machine learning paradigms: cost-sensitive learning and semi-supervised learning. Accordingly, we expand the traditional logistic regression (LR) model by introducing a cost parameter to consider the different misclassification costs of unbalanced classes and incorporating a graph regularization term to consider unlabeled data. Finally, we conduct extensive computational experiments to verify the superiority of the developed cost-sensitive semi-supervised learning framework in this paper. Computational results show that introducing a cost parameter into LR can improve the classification rate for substandard ships by almost 10%. In addition, the results show that considering unlabeled data in classification models can increase the classification rate for minority and majority classes by 1.33% and 5.93%, respectively.
Subject
General Mathematics,Engineering (miscellaneous),Computer Science (miscellaneous)
Cited by
7 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献