Authors:
Istiqamah A. Nurul, Wiharja Kemas Rahmat Saleh
Abstract
The data warehouse is a widely used solution for analyzing business data from heterogeneous sources. Unfortunately, a data warehouse can analyze only structured data. Meanwhile, the popularity of social media and the ease of creating data on the web have produced a flood of unstructured data. We therefore need an approach that can convert unstructured data into structured data that a data warehouse can process. To do this, we propose a schema extraction approach that uses Google Cloud Platform to create a schema from unstructured data. In our experiment, the approach successfully produces a schema from unstructured data. To the best of our knowledge, we are the first to use Google Cloud Platform for schema extraction. We also show that our approach helps database developers understand the unstructured data better.
Publisher
School of Computing, Telkom University
Cited by
1 article.