Process Model for Content Extraction from Weblogs

Published: 2014-04 Issue: 2 Volume: 10 Page: 20-36
ISSN: 1548-3657
Container-title: International Journal of Intelligent Information Technologies
Language: en

Author:

Schieber Andreas¹, Hilbert Andreas¹

Affiliation:

1. University of Technology Dresden, Dresden, Germany

Abstract

This paper develops and evaluates a BPMN-based process model which identifies and extracts blog content from the web and stores its textual data in a data warehouse for further analyses. Depending on the characteristics of the technologies used to create the weblogs, the process has to perform specific tasks in order to extract blog content correctly. The paper describes three phases: extraction, transformation and loading of data in a repository specifically adapted for blog content extraction. It highlights the objectives in these phases which must be achieved to ensure the correct extraction. The authors integrate the described process in a previously developed framework for blog mining. The authors' process model closes the conceptual gap in this framework as well as the gap in current research of blog mining process models. Furthermore, it can easily be adapted for other web extraction proposals.
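The three phases named in the abstract (extraction, transformation, loading) can be illustrated with a minimal sketch. This is not the authors' BPMN process model; it only shows the general extract-transform-load shape under stated assumptions. It assumes the post body sits inside an `<article>` element, uses an SQLite table as a stand-in for the data warehouse, and every name in it (`PostTextParser`, `extract`, `transform`, `load`, `run_etl`, the `blog_post` table) is hypothetical.

```python
import sqlite3
from html.parser import HTMLParser


class PostTextParser(HTMLParser):
    """Collects text found inside <article> elements.

    Assumption: the blog post body is wrapped in <article>. Real weblogs
    differ by platform, which is why the paper calls for technology-specific
    extraction tasks.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0    # current nesting depth inside <article>
        self.chunks = []  # text fragments collected so far

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract(html):
    """Extraction: pull the raw post text out of the page markup."""
    parser = PostTextParser()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps block elements apart; a word split by
    # inline markup would also be separated, which this sketch ignores.
    return " ".join(parser.chunks)


def transform(raw_text):
    """Transformation: collapse runs of whitespace into single spaces."""
    return " ".join(raw_text.split())


def load(conn, url, text):
    """Loading: store the cleaned text, keyed by the post URL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blog_post (url TEXT PRIMARY KEY, body TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO blog_post (url, body) VALUES (?, ?)",
        (url, text),
    )
    conn.commit()


def run_etl(conn, url, html):
    """Runs the three phases in order for a single page."""
    load(conn, url, transform(extract(html)))
```

Calling `run_etl(sqlite3.connect(":memory:"), url, html)` with a page's URL and markup would store one cleaned row per post; navigation text outside `<article>` is skipped.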

Publisher

IGI Global

Subject

Decision Sciences (miscellaneous), Information Systems

References: 73 articles.

1. Using text mining to uncover students' technology-related problems in live video streaming

2. Identifying the influential bloggers in a community

3. BPMN-Based Conceptual Modeling of ETL Processes

4. Intelligent Information Integration

5. Attardi, G., & Simi, M. (2006). Blog mining through opinionated words. Retrieved from http://trec.nist.gov/pubs/trec15/papers/upisa.blog.final.pdf

Cited by 2 articles.

1. A Novel Bio-Inspired Approach for Multilingual Spam Filtering;International Journal of Intelligent Information Technologies;2015-07

2. Streamlined Alarms for Intrusion Recognition System;International Journal of Intelligent Information Technologies;2015-04