Database Native Model Selection: Harnessing Deep Neural Networks in Database Systems

Published: 2024-01 Issue: 5 Volume: 17 Page: 1020-1033
ISSN: 2150-8097
Container-title: Proceedings of the VLDB Endowment
Language: en
Short-container-title: Proc. VLDB Endow.

Author:

Xing Naili¹, Cai Shaofeng¹, Chen Gang², Luo Zhaojing¹, Ooi Beng Chin¹, Pei Jian³

Affiliation:

1. National University of Singapore

2. Zhejiang University

3. Duke University

Abstract

The growing demand for advanced analytics beyond statistical aggregation calls for database systems that support effective model selection of deep neural networks (DNNs). However, existing model selection strategies rely either on training-based algorithms, which deliver high-performing models at high computational cost, or on training-free algorithms, which improve computational efficiency at the cost of effectiveness. These strategies often disregard computational cost and response-time Service-Level Objectives (SLOs), which matter to average or budget-conscious machine learning users. In addition, they lack a well-designed integration of the model selection algorithms with DBMSs, which hinders efficient in-database model selection. This paper presents TRAILS, a resource-efficient and SLO-aware in-database model selection system. To leverage the strengths of both training-free and training-based model selection, we first characterize nine state-of-the-art training-free model evaluation metrics and propose a more effective one named JacFlow, and then restructure the conventional model selection procedure into two phases: filtering and refinement. A novel coordinator is also introduced to strike a balance between the high efficiency of training-free algorithms and the high effectiveness of training-based algorithms, ensuring high-performing model selection while adhering to target SLOs. Moreover, we incorporate the proposed algorithm into PostgreSQL to develop TRAILS, thereby both enhancing resource efficiency and reducing model selection latency. This integration establishes a foundation for declarative model definition and selection within DBMSs. Empirical results demonstrate that TRAILS reduces model selection time and computational expenses by up to 24.38x and 29.32x, respectively, compared to existing model selection systems.
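
The two-phase idea in the abstract (a cheap training-free filter followed by training-based refinement under a time budget) can be illustrated with a minimal Python sketch. This is not the TRAILS algorithm, its coordinator, or JacFlow; the function name `two_phase_select`, the callables `proxy_score` and `train_and_eval`, and the fixed per-model costs `proxy_cost` and `train_cost` are all hypothetical, chosen only to show how a budget might be split between the two phases.

```python
from typing import Callable, Sequence, Tuple


def two_phase_select(
    candidates: Sequence[str],
    proxy_score: Callable[[str], float],     # hypothetical cheap, training-free estimate
    train_and_eval: Callable[[str], float],  # hypothetical expensive, training-based score
    budget: float,                           # total time budget (stands in for the SLO)
    proxy_cost: float,                       # assumed fixed cost of one proxy evaluation
    train_cost: float,                       # assumed fixed cost of one training run
) -> Tuple[str, float]:
    """Filter candidates with a cheap proxy, then refine the survivors by training."""
    if proxy_cost <= 0 or train_cost <= 0:
        raise ValueError("costs must be positive")

    # Reserve enough budget for at least one training run; the rest may go to filtering.
    max_scored = int((budget - train_cost) // proxy_cost)
    explored = list(candidates[:max(0, max_scored)])
    if not explored:
        raise ValueError("no candidates fit within the budget")

    # Filtering phase: rank the explored candidates by the training-free score.
    ranked = sorted(explored, key=proxy_score, reverse=True)

    # Refinement phase: train as many top-ranked candidates as the remaining budget allows.
    remaining = budget - len(explored) * proxy_cost
    keep = max(1, min(len(ranked), int(remaining // train_cost)))
    refined = [(model, train_and_eval(model)) for model in ranked[:keep]]

    # Return the candidate with the best training-based score.
    return max(refined, key=lambda pair: pair[1])
```

In this sketch a candidate that ranks poorly under the proxy is never trained, even if it would have scored best after training. That is the efficiency-versus-effectiveness trade-off the abstract describes, and it is why the split of the budget between the two phases matters.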

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.14778/3641204.3641212

References: 68 articles.

1. 2023. Automate machine learning model selection with Azure Machine Learning. https://learn.microsoft.com/en-us/training/modules/automate-model-selection-with-azure-automl/1-introduction.

2. 2023. AutoML: Train High-quality Custom Machine Learning Models with Minimal Effort and Machine Learning Expertise. https://cloud.google.com/automl.

3. 2023. AWS AutoML Solutions. https://aws.amazon.com/machine-learning/automl/.

4. 2023. PostgreSQL. https://www.postgresql.org/.

5. Mohamed S. Abdelfattah, Abhinav Mehrotra, Lukasz Dudziak, and Nicholas Donald Lane. 2021. Zero-Cost Proxies for Lightweight NAS. In Proceedings of The International Conference on Learning Representations.