Affiliation:
1. University of Seoul, Seoul, Korea
Abstract
In the era of big data, a data warehouse is an integrated multidimensional database that provides the basis for the decision making needed to establish crucial business strategies. Efficient and effective analysis requires a data organization system that integrates and manages data of various dimensions. However, conventional data warehousing techniques do not consider the data manipulation operations that data-mining activities require. With the current explosion of text data, much research has examined text (or document) repositories that support text mining and document retrieval. This article therefore presents a method for developing a text warehouse that provides a machine-learning-based text classification service. Each document is represented as a term-by-concept matrix using a third-order tensor-based textual representation model, which emphasizes the meaning of the words occurring in the document. As a result, the proposed text warehouse makes it possible to develop a semantic Naïve Bayes text classifier simply by executing appropriate SQL statements.
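To illustrate the idea of building a Naïve Bayes classifier from SQL statements alone, the following is a minimal sketch and not the article's implementation. It assumes a plain bag-of-words multinomial Naïve Bayes with Laplace smoothing rather than the tensor-based term-by-concept representation, and it uses Python's built-in sqlite3 module in place of whatever database system the text warehouse runs on. The table names (doc, doc_term, query_term), the toy data, and the log_e helper are all hypothetical.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Register a natural-log helper so the sketch does not depend on
# SQLite's optional built-in math functions (assumption: name log_e).
conn.create_function("log_e", 1, math.log)

# Hypothetical warehouse schema: one row per document, one row per (document, term).
conn.executescript("""
CREATE TABLE doc (doc_id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE doc_term (doc_id INTEGER, term TEXT, freq INTEGER);
CREATE TABLE query_term (term TEXT, freq INTEGER);
""")

# Toy training data, invented for illustration only.
training_docs = [
    (1, "sports", {"goal": 2, "team": 1}),
    (2, "sports", {"team": 2, "match": 1}),
    (3, "finance", {"stock": 2, "market": 1}),
    (4, "finance", {"market": 2, "bank": 1}),
]
for doc_id, label, terms in training_docs:
    conn.execute("INSERT INTO doc VALUES (?, ?)", (doc_id, label))
    conn.executemany(
        "INSERT INTO doc_term VALUES (?, ?, ?)",
        [(doc_id, term, freq) for term, freq in terms.items()],
    )

# Multinomial Naive Bayes with Laplace (add-one) smoothing, in one query:
# score(c) = log P(c) + sum over query terms of freq * log((count(t, c) + 1) / (total(c) + |V|))
CLASSIFY_SQL = """
WITH vocab AS (
    SELECT COUNT(DISTINCT term) AS v FROM doc_term
),
prior AS (
    SELECT label,
           log_e(COUNT(*) * 1.0 / (SELECT COUNT(*) FROM doc)) AS log_prior
    FROM doc
    GROUP BY label
),
class_total AS (
    SELECT d.label, SUM(dt.freq) AS total
    FROM doc d JOIN doc_term dt ON dt.doc_id = d.doc_id
    GROUP BY d.label
),
class_term AS (
    SELECT d.label, dt.term, SUM(dt.freq) AS cnt
    FROM doc d JOIN doc_term dt ON dt.doc_id = d.doc_id
    GROUP BY d.label, dt.term
)
SELECT p.label,
       p.log_prior
       + SUM(q.freq * log_e((COALESCE(ct.cnt, 0) + 1.0) / (t.total + v.v))) AS score
FROM prior p
JOIN class_total t ON t.label = p.label
CROSS JOIN vocab v
CROSS JOIN query_term q
LEFT JOIN class_term ct ON ct.label = p.label AND ct.term = q.term
GROUP BY p.label, p.log_prior
ORDER BY score DESC
"""


def classify(connection, term_freqs):
    """Load a query document's term frequencies and rank the class labels."""
    connection.execute("DELETE FROM query_term")
    connection.executemany(
        "INSERT INTO query_term VALUES (?, ?)", list(term_freqs.items())
    )
    return connection.execute(CLASSIFY_SQL).fetchall()


ranked = classify(conn, {"team": 1, "goal": 1})
print(ranked[0][0])  # highest-scoring label for the query document
```

Both training (aggregating term counts per class) and prediction (summing log probabilities) are expressed as SQL aggregates here, so the only application code needed is loading rows and issuing the query.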