Affiliation:
1. Lixto Software GmbH, Vienna, Austria
2. Oxford University, Oxford, UK
Abstract
Online market intelligence (OMI), in particular competitive intelligence for product pricing, is a very important application area for Web data extraction. However, OMI presents non-trivial challenges to data extraction technology. Sophisticated and highly parameterized navigation and extraction tasks are required. On-the-fly data cleansing is necessary in order to identify identical products from different suppliers. It must be possible to smoothly define data flow scenarios that merge and filter streams of extracted data stemming from several Web sites and store the resulting data in a data warehouse, where the data is subjected to market intelligence analytics. Finally, the system must be highly scalable, in order to be able to extract and process massive amounts of data in a short time. Lixto (www.lixto.com), a company offering data extraction tools and services, has been providing OMI solutions for several customers. In this paper we show how Lixto has tackled each of the above challenges by improving and extending its original data extraction software. Most importantly, we show how high scalability is achieved through cloud computing. This paper also features a case study from the computers and electronics market.
Subject
General Earth and Planetary Sciences,Water Science and Technology,Geography, Planning and Development
Cited by
25 articles.
1. Adventures with Datalog: Walking the Thin Line Between Theory and Practice;AIxIA 2022 – Advances in Artificial Intelligence;2023
2. STEM: a suffix tree-based method for web data records extraction;Knowledge and Information Systems;2017-05-09
3. Framework for a Hospitality Big Data Warehouse;International Journal of Information Systems in the Service Sector;2017-04
4. KESeDa;Proceedings of the 12th International Conference on Semantic Systems;2016-09-12
5. AutoRM: An effective approach for automatic Web data record mining;Knowledge-Based Systems;2015-11