Scalable Data Analysis Application to Web Usage Data

Author:

Chebi Hocine1

Affiliation:

1. Faculty of Electrical Engineering, Djillali Liabes University, Sidi Bel Abbes, Algeria

Abstract

The number of hits to web pages continues to grow, and the web has become one of the most popular platforms for disseminating and retrieving information. Consequently, many website operators are motivated to analyze how their sites are used in order to better meet the expectations of internet users. However, the way a website is visited can change depending on a variety of factors, so usage models must be continuously updated to reflect visitor behavior accurately. This remains difficult when the time dimension is neglected or is introduced merely as an additional numeric attribute in the description of the data.

Data mining is defined as the application of data analysis and discovery algorithms to large databases with the goal of discovering non-trivial models. Several algorithms have been proposed to formalize newly discovered models, to build more efficient models, to process new types of data, and to measure the differences between data sets. However, most traditional data mining algorithms assume that models are static and do not take into account their possible evolution over time. These considerations have motivated significant efforts in the analysis of temporal data and in the adaptation of static data mining methods to data that evolves over time. This chapter reviews the main aspects of data mining dealt with in this thesis, then presents the state of the art of current work in the field and discusses its major open issues. Interest in temporal databases has increased considerably in recent years, for example in finance, telecommunications, and surveillance. A growing number of prototypes and systems are being implemented to take the time dimension of data into account explicitly, for example to study the variability of analysis results over time.
To model an application, it is necessary to choose a common language that is precise and known to all members of a team. UML (Unified Modeling Language) is an object-oriented modeling language standardized by the OMG. This chapter presents the modeling through package and class diagrams built with UML, followed by the conceptual data model. Finally, the authors specify the SQL queries used to extract descriptive statistical variables of the navigations from a warehouse containing the preprocessed usage data.

Publisher

IGI Global
