Affiliation:
1. Indian Institute of Technology Bombay
Abstract
With over 800 million pages covering most areas of human endeavor, the World-wide Web is a fertile ground for data mining research to make a difference to the effectiveness of information search. Today, Web surfers access the Web through two dominant interfaces: clicking on hyperlinks and searching via keyword queries. This process is often tentative and unsatisfactory. Better support is needed for expressing one's information need and dealing with a search result in more structured ways than available now. Data mining and machine learning have significant roles to play towards this end. In this paper we will survey recent advances in learning and mining problems related to hypertext in general and the Web in particular. We will review the continuum of supervised to semi-supervised to unsupervised learning problems, highlight the specific challenges which distinguish data mining in the hypertext domain from data mining in the context of data warehouses, and summarize the key areas of recent and ongoing research.
Publisher
Association for Computing Machinery (ACM)
Cited by
106 articles.