Affiliation:
1. Department of CSE, National Institute of Technology, Nagpur, Maharashtra 440010, India
Abstract
Nowadays, every manufacturer or retailer displays product information on various websites. Customers must visit many such web pages to choose the right product because the information is not available in one place. Some websites do show such information in one place, but they are product specific and their information is generally updated manually. In this paper, we propose a novel concept, the web-spreadsheet, which gathers product information by crawling related web pages and presents it like a spreadsheet, where each row represents a product and each column represents a product attribute. We extract the product names of a specified product class using a decision tree-based classifier with features obtained from Part of Speech (POS) tagging and a distance measure. The method also extracts the value-measure pairs of preset attributes using the distance measure, POS tagging and data type. This approach saves considerable time when comparing different products, since customers need not scan a number of websites. We present promising results in various product classes which surpass many existing techniques in the literature. The proposed method can work accurately without initial labeled training data, which is expensive to obtain.
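The row-and-column idea and the value-measure extraction described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it omits the decision tree classifier and POS tagging entirely and uses plain character distance as a stand-in for the distance measure. The attribute names, unit sets, and the function `extract_row` are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical preset attributes and the measures (units) accepted for each.
ATTRIBUTES = {
    "weight": {"kg", "g"},
    "screen": {"inch", "inches"},
}

# A value-measure pair: a number followed by a unit word, e.g. "1.5 kg".
PAIR = re.compile(r"(\d+(?:\.\d+)?)\s*([A-Za-z]+)")


def extract_row(name, text):
    """Build one spreadsheet row: the product name plus, for each preset
    attribute, the valid value-measure pair closest to the attribute word."""
    lowered = text.lower()
    row = {"product": name}
    for attr, units in ATTRIBUTES.items():
        pos = lowered.find(attr)
        if pos < 0:
            # Attribute word not mentioned on the page: leave the cell empty.
            row[attr] = None
            continue
        best = None
        for m in PAIR.finditer(lowered):
            if m.group(2) not in units:
                continue  # unit does not fit this attribute
            dist = abs(m.start() - pos)  # character distance (assumption)
            if best is None or dist < best[0]:
                best = (dist, f"{m.group(1)} {m.group(2)}")
        row[attr] = best[1] if best else None
    return row
```

Calling `extract_row` once per crawled page yields a list of dictionaries that share the same keys, which maps directly onto spreadsheet rows (products) and columns (attributes).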
Publisher
World Scientific Pub Co Pte Lt
Subject
Artificial Intelligence, Computer Graphics and Computer-Aided Design, Computer Networks and Communications, Software
Cited by
1 article.