Affiliation:
1. Google Inc.
2. Yahoo! Research, Sunnyvale
Abstract
Top-k retrieval over main-memory inverted indexes is at the core of many modern applications: from large scale web search and advertising platforms, to text extenders and content management systems. In these systems, queries are evaluated using two major families of algorithms:
document-at-a-time
(DAAT) and
term-at-a-time
(TAAT). DAAT and TAAT algorithms have been studied extensively in the research literature, but mostly in disk-based settings. In this paper, we present an analysis and comparison of several DAAT and TAAT algorithms used in Yahoo!'s production platform for online advertising. The low-latency requirements of online advertising systems mandate memory-resident indexes. We compare the performance of several query evaluation algorithms using two real-world ad selection datasets and query workloads. We show how some adaptations of the original algorithms for main memory setting have yielded significant performance improvement, reducing running time and cost of serving by 60% in some cases. In these results both the original and the adapted algorithms have been evaluated over memory-resident indexes, so the improvements are algorithmic and not due to the fact that the experiments used main memory indexes.
Subject
General Earth and Planetary Sciences,Water Science and Technology,Geography, Planning and Development
Cited by
33 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献