Affiliation:
1. Accenture Technology Labs, Chicago, IL
2. Carnegie Mellon University, Pittsburgh, PA
Abstract
We describe our work on extracting attribute-value pairs from textual product descriptions. The goal is to augment databases of products by representing each product as a set of attribute-value pairs. Such a representation is beneficial for tasks where treating the product as a set of attribute-value pairs is more useful than treating it as an atomic entity. Examples of such applications include demand forecasting, assortment optimization, product recommendations, and assortment comparison across retailers and manufacturers. We deal with both implicit and explicit attributes and formulate both kinds of extraction as classification problems. Using single-view and multi-view semi-supervised learning algorithms, we are able to exploit the large amounts of unlabeled data present in this domain while reducing the need for initial labeled data, which is expensive to obtain. We present promising results on apparel and sporting goods products and show that our system can accurately extract attribute-value pairs from product descriptions. We describe a variety of applications that are built on top of the results obtained by the attribute extraction system.
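To make the classification framing concrete, the following is a minimal sketch of single-view semi-supervised learning by self-training, not the authors' implementation. It assumes a hypothetical implicit attribute (age group, with values "adult" and "kids"), a toy multinomial Naive Bayes classifier, and invented product descriptions; the function names, the confidence threshold, and the data are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def train_nb(labeled):
    """Fit a multinomial Naive Bayes model on (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labeled:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_proba(model, tokens):
    """Return P(label | tokens) with add-one smoothing; unseen words are skipped."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for tok in tokens:
            if tok in vocab:
                score += math.log((word_counts[label][tok] + 1) / denom)
        log_scores[label] = score
    top = max(log_scores.values())
    exp = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(exp.values())
    return {label: v / z for label, v in exp.items()}


def self_train(labeled, unlabeled, threshold=0.65, max_rounds=5):
    """Repeatedly move confidently classified unlabeled examples into the training set."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        if not pool:
            break
        model = train_nb(labeled)
        newly_labeled, remaining = [], []
        for tokens in pool:
            probs = predict_proba(model, tokens)
            label = max(probs, key=probs.get)
            if probs[label] >= threshold:
                newly_labeled.append((tokens, label))
            else:
                remaining.append(tokens)
        if not newly_labeled:
            break
        labeled.extend(newly_labeled)
        pool = remaining
    return train_nb(labeled)


# Hypothetical seed labels for the implicit attribute "age group".
labeled = [
    ("lightweight running shoe for men".split(), "adult"),
    ("toddler sneaker with velcro strap".split(), "kids"),
]
# Hypothetical unlabeled product descriptions.
unlabeled = [
    "running shoe with cushioned sole".split(),
    "velcro sneaker for toddler".split(),
    "cushioned sole trail shoe".split(),
]

model = self_train(labeled, unlabeled)
probs = predict_proba(model, "cushioned sole".split())
print(max(probs, key=probs.get))
```

The seed model has never seen "cushioned" or "sole", so it cannot separate the two values for that phrase; after the unlabeled descriptions are absorbed, those words carry evidence. A multi-view method such as co-training would instead train two classifiers on different feature views and let each label examples for the other.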
Publisher
Association for Computing Machinery (ACM)
Cited by
114 articles.