Authors
Saulo Martiello Mastelini, André Carlos Ponce de Leon Ferreira de Carvalho
Abstract
The rapid development of technology has resulted in the constant production of data in different forms and from different sources. Unlike in early machine learning (ML) research, there may now be too much data to handle with traditional algorithms. Changes in the underlying data distributions may also render traditional ML solutions useless in real-world applications. Online ML (OML) aims to create solutions that process data incrementally, use limited computational resources, and cope with time-changing data distributions. Unfortunately, there is a growing recent trend toward OML algorithms that focus solely on predictive performance and overlook computational costs. In regression tasks, the problem is even more pronounced for some of the most popular OML solutions: decision trees, decision rules, and ensembles of these models. In this thesis, we created improved and efficient OML algorithms from these algorithmic families by focusing on decreasing time and memory costs while keeping predictive performance competitive. Our proposals are either novel standalone OML algorithms or additions that can be paired with any existing tree or decision rule regressor.
Publisher
Sociedade Brasileira de Computação - SBC