Prediction and Predictability for Search Query Acceleration

Published: 2016-08-29 Issue: 3 Volume: 10 Page: 1-28
ISSN: 1559-1131
Container-title: ACM Transactions on the Web
Language: en
Short-container-title: ACM Trans. Web

Author:

Hwang Seung-Won¹, Kim Saehoon², He Yuxiong³, Elnikety Sameh³, Choi Seungjin²

Affiliation:

1. Yonsei University, Seoul, Korea

2. POSTECH

3. Microsoft Research

Abstract

A commercial web search engine shards its index among many servers, and therefore the response time of a search query is dominated by the slowest server that processes the query. Prior approaches target improving responsiveness by reducing the tail latency, or high-percentile response time, of an individual search server. They predict query execution time, and if a query is predicted to be long-running, it runs in parallel; otherwise, it runs sequentially. These approaches are, however, not accurate enough for reducing a high tail latency when responses are aggregated from many servers because this requires each server to reduce a substantially higher tail latency (e.g., the 99.99th percentile), which we call extreme tail latency. To address tighter requirements of extreme tail latency, we propose a new design space for the problem, subsuming existing work and also proposing a new solution space. Existing work makes a prediction using features available at indexing time and focuses on optimizing prediction features for accelerating tail queries. In contrast, we identify “when to predict?” as another key optimization question. This opens up a new solution of delaying a prediction by a short duration to allow many short-running queries to complete without parallelization and, at the same time, to allow the predictor to collect a set of dynamic features using runtime information. This new question expands a solution space in two meaningful ways. First, we see a significant reduction of tail latency by leveraging “dynamic” features collected at runtime that estimate query execution time with higher accuracy. Second, we can ask whether to override prediction when the “predictability” is low. We show that considering predictability accelerates the query by achieving a higher recall. With this prediction, we propose to accelerate the queries that are predicted to be long-running. In our preliminary work, we focused on parallelization as an acceleration scenario. We extend this work to consider heterogeneous multicore hardware for acceleration. This hardware combines processor cores with different microarchitectures such as energy-efficient little cores and high-performance big cores, and accelerating web search using this hardware has remained an open problem. We evaluate the proposed prediction framework in two scenarios: (1) query parallelization on a multicore processor and (2) query scheduling on a heterogeneous processor. Our extensive evaluation results show that, for both scenarios of query acceleration using parallelization and heterogeneous cores, the proposed framework is effective in reducing the extreme tail latency compared to a state-of-the-art predictor because of its higher recall, and it improves server throughput by more than 70% because of its improved precision.
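
Two ideas in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation. The first function shows why aggregating responses from many servers pushes each server toward an extreme percentile, under the simplifying assumption (not stated in the abstract) that server latencies are independent. The second function is one possible reading of the "delay, then predict, then check predictability" policy. All names, thresholds, and the choice to accelerate when predictability is low are hypothetical.

```python
from dataclasses import dataclass


def required_per_server_quantile(target_quantile: float, num_servers: int) -> float:
    """Quantile each server must meet so that ALL servers meet the deadline
    with probability `target_quantile`.

    Assumption: server latencies are independent, so the probability that
    every server is on time is q ** num_servers.
    """
    return target_quantile ** (1.0 / num_servers)


@dataclass
class Prediction:
    expected_ms: float     # predicted execution time (hypothetical unit)
    predictability: float  # confidence in [0, 1] (hypothetical scale)


def should_accelerate(
    elapsed_ms: float,
    finished: bool,
    prediction: Prediction,
    delay_ms: float = 5.0,            # hypothetical prediction delay
    long_ms: float = 80.0,            # hypothetical "long-running" cutoff
    min_predictability: float = 0.6,  # hypothetical confidence cutoff
) -> bool:
    """Decide whether to accelerate a query (e.g., parallelize it)."""
    # Short queries finish during the delay and never pay for acceleration.
    if finished or elapsed_ms < delay_ms:
        return False
    # Assumed override: when the predictor is unsure, accelerate anyway,
    # trading some precision for higher recall of long-running queries.
    if prediction.predictability < min_predictability:
        return True
    # Otherwise trust the prediction made with runtime (dynamic) features.
    return prediction.expected_ms >= long_ms


if __name__ == "__main__":
    # 100 servers, 99% of aggregated queries on time.
    print(round(required_per_server_quantile(0.99, 100), 6))
    print(should_accelerate(10.0, False, Prediction(120.0, 0.9)))
```

Under the independence assumption, a query fanned out to 100 servers that each meet the deadline 99% of the time finds all of them on time only about 37% of the time (0.99^100 ≈ 0.366). Keeping 99% of aggregated queries on time requires each server to meet roughly its 99.99th percentile, which matches the example percentile given in the abstract.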

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications

Link

https://dl.acm.org/doi/pdf/10.1145/2943784

References: 42 articles.

1. Design trade-offs for search engine caching

2. R. Baeza-Yates, V. Murdock, and C. Hauff. 2009. Efficiency trade-offs in two-tier web search systems. In SIGIR. DOI: 10.1145/1571941.1571971

3. M. Becchi and P. Crowley. 2006. Dynamic thread assignment on heterogeneous multiprocessor architectures. ACM Computing Frontiers (2006). DOI: 10.1145/1128022.1128029

4. C. Bienia, S. Kumar, J. P. Singh, and K. Li. 2008. The PARSEC benchmark suite: Characterization and architectural implications. Technical Report (2008).

5. Z. Bosnic and I. Kononenko. 2008. Comparison of approaches for estimating reliability of individual regression predictions. Data & Knowledge Engineering (2008). DOI: 10.1016/j.datak.2008.08.001

Cited by 10 articles.

1. Many are Better than One: Algorithm Selection for Faster Top-K Retrieval;Information Processing & Management;2023-07

2. Index-Based Batch Query Processing Revisited;Lecture Notes in Computer Science;2023

3. A survey on hardware accelerators: Taxonomy, trends, challenges, and perspectives;Journal of Systems Architecture;2022-08

4. Anytime Ranking on Document-Ordered Indexes;ACM Transactions on Information Systems;2022-01-31

5. A DFT-Based Running Time Prediction Algorithm for Web Queries;Future Internet;2021-08-04