Prediction and Predictability for Search Query Acceleration

Author:

Hwang Seung-Won (1), Kim Saehoon (2), He Yuxiong (3), Elnikety Sameh (3), Choi Seungjin (2)

Affiliation:

1. Yonsei University, Seoul, Korea

2. POSTECH

3. Microsoft Research

Abstract

A commercial web search engine shards its index among many servers, and therefore the response time of a search query is dominated by the slowest server that processes the query. Prior approaches aim to improve responsiveness by reducing the tail latency, or high-percentile response time, of an individual search server. They predict query execution time: if a query is predicted to be long-running, it runs in parallel; otherwise, it runs sequentially. These approaches are, however, not accurate enough to reduce a high tail latency when responses are aggregated from many servers, because this requires each server to reduce a substantially higher tail latency (e.g., the 99.99th percentile), which we call extreme tail latency. To address the tighter requirements of extreme tail latency, we propose a new design space for the problem that subsumes existing work and also opens a new solution space. Existing work makes a prediction using features available at indexing time and focuses on optimizing prediction features for accelerating tail queries. In contrast, we identify “when to predict?” as another key optimization question. This opens up a new solution: delaying the prediction by a short duration, which allows many short-running queries to complete without parallelization and, at the same time, allows the predictor to collect a set of dynamic features using runtime information. This new question expands the solution space in two meaningful ways. First, we see a significant reduction of tail latency by leveraging “dynamic” features collected at runtime, which estimate query execution time with higher accuracy. Second, we can ask whether to override the prediction when the “predictability” is low. We show that considering predictability improves query acceleration by achieving a higher recall. With this prediction, we propose to accelerate the queries that are predicted to be long-running. In our preliminary work, we focused on parallelization as an acceleration scenario.
We extend this work to consider heterogeneous multicore hardware for acceleration. This hardware combines processor cores with different microarchitectures, such as energy-efficient little cores and high-performance big cores, and accelerating web search using this hardware has remained an open problem. We evaluate the proposed prediction framework in two scenarios: (1) query parallelization on a multicore processor and (2) query scheduling on a heterogeneous processor. Our extensive evaluation results show that, for both scenarios of query acceleration using parallelization and heterogeneous cores, the proposed framework is effective in reducing the extreme tail latency compared to a state-of-the-art predictor because of its higher recall, and it improves server throughput by more than 70% because of its improved precision.
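
The decision flow described in the abstract (delay the prediction briefly, then predict from runtime features, and override the prediction when predictability is low) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the names `Prediction` and `decide`, the thresholds `delay_ms`, `long_ms`, and `min_predictability`, and the choice to accelerate whenever predictability is low are all assumptions made here for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Prediction:
    """Hypothetical predictor output (not taken from the paper)."""
    predicted_ms: float    # estimated total execution time of the query
    predictability: float  # assumed confidence score in [0, 1]


def decide(
    elapsed_ms: float,
    finished: bool,
    predict: Callable[[], Prediction],
    delay_ms: float = 5.0,            # assumed short delay before predicting
    long_ms: float = 80.0,            # assumed cutoff for "long-running"
    min_predictability: float = 0.5,  # assumed cutoff for trusting the predictor
) -> str:
    """Return one of "done", "wait", "accelerate", "sequential"."""
    # Short-running queries finish during the delay and never need a prediction.
    if finished:
        return "done"
    # Still inside the delay window: keep running sequentially and keep
    # collecting runtime (dynamic) features.
    if elapsed_ms < delay_ms:
        return "wait"
    # The delay has elapsed: predict, now with dynamic features available.
    p = predict()
    # Assumption: when the prediction is unreliable, accelerate anyway,
    # trading some precision for higher recall of long-running queries.
    if p.predictability < min_predictability:
        return "accelerate"
    return "accelerate" if p.predicted_ms >= long_ms else "sequential"


if __name__ == "__main__":
    print(decide(6.0, False, lambda: Prediction(200.0, 0.9)))
    print(decide(6.0, False, lambda: Prediction(10.0, 0.2)))
```

Here "accelerate" stands for either acceleration scenario named in the abstract: parallelizing the query on a multicore processor or scheduling it onto a big core of a heterogeneous processor.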

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications

Cited by 10 articles.

1. Many are Better than One: Algorithm Selection for Faster Top-K Retrieval. Information Processing & Management, 2023-07.

2. Index-Based Batch Query Processing Revisited. Lecture Notes in Computer Science, 2023.

3. A survey on hardware accelerators: Taxonomy, trends, challenges, and perspectives. Journal of Systems Architecture, 2022-08.

4. Anytime Ranking on Document-Ordered Indexes. ACM Transactions on Information Systems, 2022-01-31.

5. A DFT-Based Running Time Prediction Algorithm for Web Queries. Future Internet, 2021-08-04.
