Abstract
AbstractDecision-making in our everyday lives is surrounded by visually important information. Fashion, housing, dating, food or travel are just a few examples. At the same time, most commonly used tools for information retrieval operate on relational and text-based search models which are well understood by end users, but unable to directly cover visual information contained in images or videos. Researcher communities have been trying to reveal the semantics of multimedia in the last decades with ever-improving results, dominated by the success of deep learning. However, this does not close the gap to relational retrieval model on its own and often rather solves a very specialized task like assigning one of pre-defined classes to each object within a closed application ecosystem. Retrieval models based on these novel techniques are difficult to integrate in existing application-agnostic environments built around relational databases, and therefore, they are not so widely used in the industry. In this paper, we address the problem of closing the gap between visual information retrieval and relational database model. We propose and formalize a model for discovering candidates for new relational attributes by analysis of available visual content. We design and implement a system architecture supporting the attribute extraction, suggestion and acceptance processes. We apply the solution in the context of e-commerce and show how it can be seamlessly integrated with SQL environments widely used in the industry. At last, we evaluate the system in a user study and discuss the obtained results.
Funder
Grantová Agentura České Republiky
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence,Hardware and Architecture,Human-Computer Interaction,Information Systems,Software
Reference60 articles.
1. Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th international conference on very large data bases, VLDB ’94, pp 487–499. Morgan Kaufmann Publishers Inc., San Francisco, CA. http://dl.acm.org/citation.cfm?id=645920.672836
2. Baltrušaitis T, Ahuja C, Morency L (2019) Multimodal machine learning: a survey and taxonomy. IEEE Trans Pattern Anal Mach Intell 41(2):423–443
3. Beecks C, Skopal T, Schoeffmann K, Seidl T (2011) Towards large-scale multimedia exploration. In: Das G, Hsristidis V, Ilyas I (eds) Proceedings of the 5th international workshop on ranking in databases (DBRank 2011). VLDB, Seattle, WA, pp 31–33
4. Berg TL, Berg AC, Shih J (2010) Automatic attribute discovery and characterization from noisy web data. In: Daniilidis K, Maragos P, Paragios N (eds) Computer vision—ECCV 2010. Springer, Berlin, pp 663–676
5. Brodén B, Hammar M, Nilsson BJ, Paraschakis D (2018) Ensemble recommendations via thompson sampling: an experimental study within e-commerce. In: 23rd International conference on intelligent user interfaces, IUI ’18, pp 19–29. ACM, New York, NY. https://doi.org/10.1145/3172944.3172967
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献