Cache-aware load balancing of data center applications

Published:2019-02 Issue:6 Volume:12 Page:709-723
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Archer Aaron¹, Aydin Kevin¹, Bateni Mohammad Hossein¹, Mirrokni Vahab¹, Schild Aaron², Yang Ray¹, Zhuang Richard¹

Affiliation:

1. Google

2. UC Berkeley

Abstract

Our deployment of cache-aware load balancing in the Google web search backend reduced cache misses by ~0.5x, contributing to a double-digit percentage increase in the throughput of our serving clusters by relieving a bottleneck. This innovation has benefited all production workloads since 2015, serving billions of queries daily. A load balancer forwards each query to one of several identical serving replicas. The replica pulls each term's postings list into RAM from flash, either locally or over the network. Flash bandwidth is a critical bottleneck, motivating an application-directed RAM cache on each replica. Sending the same term reliably to the same replica would increase the chance it hits cache, and avoid polluting the other replicas' caches. However, most queries contain multiple terms and we have to send the whole query to one replica, so it is not possible to achieve a perfect partitioning of terms to replicas. We solve this via a voting scheme, whereby the load balancer conducts a weighted vote by the terms in each query, and sends the query to the winning replica. We develop a multi-stage scalable algorithm to learn these weights. We first construct a large-scale term-query graph from logs and apply a distributed balanced graph partitioning algorithm to cluster each term to a preferred replica. This yields a good but simplistic initial voting table, which we then iteratively refine via cache simulation to capture feedback effects.
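The voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the table layout (term mapped to per-replica weights), the function name `route_query`, the tie-breaking rule, and the example weights are all assumptions made for illustration, and the sketch ignores replica load entirely.

```python
def route_query(terms, voting_table, num_replicas):
    """Pick a replica for a query by a weighted vote of its terms.

    voting_table is a hypothetical layout: term -> {replica_index: weight}.
    Terms missing from the table cast no vote.
    """
    scores = [0.0] * num_replicas
    for term in terms:
        for replica, weight in voting_table.get(term, {}).items():
            scores[replica] += weight
    # Assumption: ties (including an all-zero vote) go to the
    # lowest-numbered replica. A production balancer would also
    # have to account for current load on each replica.
    return max(range(num_replicas), key=lambda r: scores[r])


# Made-up weights, purely for illustration.
example_table = {
    "cache": {0: 2.0},
    "load": {1: 1.5},
    "balancer": {1: 1.0},
}

print(route_query(["cache", "load", "balancer"], example_table, 3))  # prints 1
print(route_query(["cache"], example_table, 3))                      # prints 0
```

In the first call, replica 1 collects 2.5 votes against 2.0 for replica 0, so the whole query goes to replica 1 even though the term "cache" prefers replica 0. This is the compromise the abstract refers to: because a multi-term query must be sent to a single replica, a perfect partitioning of terms to replicas is not achievable.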

Publisher

VLDB Endowment

Subject

General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development

Link

https://dl.acm.org/doi/pdf/10.14778/3311880.3311887

Cited by 13 articles.

1. Diversity-aware strategies for static index pruning;Information Processing & Management;2024-09

2. Palette Load Balancing: Locality Hints for Serverless Functions;Proceedings of the Eighteenth European Conference on Computer Systems;2023-05-08

3. More Recent Advances in (Hyper)Graph Partitioning;ACM Computing Surveys;2023-03-02

4. PKache: A Generic Framework for Data Plane Caching;2023

5. Promethean Utilization of Resources Using Honeybee Optimization Techniques in Cloud Computing with Reference to Pandemic Health Care;Computational Intelligence for Clinical Diagnosis;2023