Affiliation:
1. Purdue University, West Lafayette, IN, USA
2. University of Illinois Urbana-Champaign, Urbana, IL, USA
Abstract
Cloud functions, exemplified by AWS Lambda and Azure Functions, are emerging as a new computing paradigm in the cloud. They provide elastic, serverless, and low-cost cloud computing, making them highly suitable for bursty and sparse workloads, which are quite common in practice. Thus, there is a new trend in designing data systems that leverage cloud functions. In this paper, we focus on vector databases, which have recently gained significant attention partly due to large language models. In particular, we investigate how to use cloud functions to build high-performance and cost-efficient vector databases. This presents significant challenges in terms of how to perform sharding, how to reduce communication overhead, and how to minimize cold-start times.
We introduce Vexless, the first vector database system optimized for cloud functions, and present three optimizations that address these challenges. To perform sharding, we propose a global coordinator (orchestrator) that assigns workloads to cloud function instances based on their available hardware resources. To overcome communication overhead, we propose the use of stateful cloud functions, eliminating the need for costly communications during synchronization. To minimize cold-start overhead, we introduce a workload-aware cloud function lifetime management strategy. Vexless has been implemented using Azure Functions. Experimental results demonstrate that Vexless can significantly reduce costs, especially on bursty and sparse workloads, compared to cloud VM instances, while achieving similar or higher query performance and accuracy.
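The abstract does not spell out how the coordinator maps workloads to instances, so the following is only an illustrative sketch of resource-aware assignment, not the method used by Vexless. It assumes each shard has a known memory footprint and each function instance reports its free memory. The names `Instance`, `assign_shards`, and `free_memory_mb` are hypothetical. The heuristic places the largest shards first, each on whichever instance currently has the most free memory.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical view of one cloud function instance as seen by a coordinator."""
    name: str
    free_memory_mb: int
    shards: list = field(default_factory=list)


def assign_shards(shard_sizes_mb, instances):
    """Greedy placement: largest shard first, onto the instance with most free memory.

    This is an assumed heuristic for illustration, not the algorithm from the paper.
    Returns a mapping from instance name to the list of shard ids placed on it.
    """
    # heapq is a min-heap, so store negated free memory to pop the largest first.
    heap = [(-inst.free_memory_mb, idx) for idx, inst in enumerate(instances)]
    heapq.heapify(heap)

    # Visit shards in descending order of size.
    for shard_id, size in sorted(shard_sizes_mb.items(), key=lambda kv: -kv[1]):
        if not heap:
            raise ValueError("no instances available")
        _, idx = heapq.heappop(heap)
        inst = instances[idx]
        if size > inst.free_memory_mb:
            # The instance with the most free memory cannot hold this shard,
            # so no other instance can either.
            raise ValueError(f"shard {shard_id!r} ({size} MB) does not fit anywhere")
        inst.shards.append(shard_id)
        inst.free_memory_mb -= size
        heapq.heappush(heap, (-inst.free_memory_mb, idx))

    return {inst.name: list(inst.shards) for inst in instances}
```

A coordinator built this way would call `assign_shards` once per rebalancing round with fresh resource reports. A real system would likely also weigh CPU, query load, and cold-start state, which this sketch ignores.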
Publisher
Association for Computing Machinery (ACM)
Cited by
2 articles.
1. Survey of vector database management systems;The VLDB Journal;2024-07-15
2. Vector Database Management Techniques and Systems;Companion of the 2024 International Conference on Management of Data;2024-06-09