Authors:
Frommholz Ingo, Roelleke Thomas
Abstract
Probabilistic Datalog (PDatalog, proposed in 1995) is a probabilistic variant of Datalog and an appealing conceptual approach to modelling Information Retrieval in a logical, rule-based programming paradigm. Making PDatalog work in real-world applications requires more than probabilistic facts and rules and the semantics for evaluating programs. In this paper we report some of the key features of the HySpirit system that are required to scale the execution of PDatalog programs.
Firstly, probability estimation must be expressible in PDatalog. Secondly, fuzzy-like predicates are required to model vague conditions (e.g. a vague match on attributes such as age or price). Thirdly, handling large data sets raises scalability issues, which HySpirit addresses with probabilistic relational indexes and with parallel and distributed processing. The main contribution of this paper is a consolidated view of the methods the HySpirit system uses to make PDatalog applicable in real-scale applications that involve a wide range of requirements typical of data (information) management and analysis.
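To make the first two requirements more concrete, the following is a minimal Python sketch of how probabilistic facts, a rule, and a fuzzy-like predicate could be combined. It is not PDatalog syntax and not the HySpirit implementation: the function names (`p_and`, `p_or`, `vague_match`, `retrieve`), the linear decay used for the vague match, the independence assumption, and the sample facts are all assumptions made purely for illustration.

```python
from math import prod

def p_and(*probs):
    """Conjunction under an assumed independence of events: multiply."""
    return prod(probs)

def p_or(*probs):
    """Disjunction under an assumed independence of events: 1 - prod(1 - p)."""
    return 1.0 - prod(1.0 - p for p in probs)

def vague_match(value, target, tolerance):
    """Hypothetical fuzzy-like predicate.

    Returns 1.0 when value equals target and decays linearly to 0.0
    once the distance reaches `tolerance`.
    """
    if tolerance <= 0:
        return 1.0 if value == target else 0.0
    return max(0.0, 1.0 - abs(value - target) / tolerance)

# Illustrative probabilistic facts: term(Term, Doc) with a probability.
term = {
    ("sailing", "d1"): 0.8,
    ("boats", "d1"): 0.5,
    ("sailing", "d2"): 0.4,
}

# Illustrative deterministic attribute: price(Doc).
price = {"d1": 110.0, "d2": 95.0}

def retrieve(query_terms, target_price, tolerance):
    """Score each document as (some query term matches) AND (price is roughly target)."""
    scores = {}
    for doc, doc_price in price.items():
        matches = [term[(t, doc)] for t in query_terms if (t, doc) in term]
        if not matches:
            continue  # no derivation for this document
        topical = p_or(*matches)
        scores[doc] = p_and(topical, vague_match(doc_price, target_price, tolerance))
    return scores

if __name__ == "__main__":
    for doc, score in sorted(retrieve(["sailing", "boats"], 100.0, 50.0).items()):
        print(doc, round(score, 4))
```

In this sketch, alternative derivations of the same document are combined as a disjunction and the vague price condition enters as one more factor in the conjunction. A real system would also have to handle dependent events, which the sketch ignores.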
Publisher
Springer Science and Business Media LLC
Subject
General Earth and Planetary Sciences,General Environmental Science
Cited by
1 article.
1. BIRDS - Bridging the Gap between Information Science, Information Retrieval and Data Science;Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval;2020-07-25