Micro-Mining and Segmented Log File Analysis: A Method for Enriching the Data Yield from Internet Log Files

Author:

Nicholas, David (1); Huntington, Paul (2)

Affiliation:

1. Ciber (Centre for Information Behaviour and the Evaluation of Research), Department of Information Science, City University, London

2. Ciber (Centre for Information Behaviour and the Evaluation of Research), Department of Information Science, City University, London

Abstract

The authors propose improved ways of analysing web server log files. Traditionally, web site statistics give a big (and shallow) picture based on all transaction log entries. That picture is, however, distorted by two problems: Internet protocol (IP) numbers cannot always be resolved to a single user, and IP numbers may be registered in a country other than the one where the user is located. The authors argue that analysing extracted sub-groups and categories presents a more accurate picture of the data, and that analysing the online behaviour of selected individuals (rather than of very large groups) can add much to our understanding of how people use web sites and, indeed, any digital information source. The analysis is labelled 'micro' to distinguish it from traditional macro, big-picture transactional log analysis. The methods are illustrated using the logs of the Surgery Door (www.surgerydoor.co.uk) consumer health web site. It was found that use attributed to academic users gave a better approximation of the geographical distribution of the site's users than an analysis based on all users. This occurs because academic institutions, unlike other user types, register in their host country. Selecting log entries where each user is allocated a unique IP number can be particularly beneficial, especially for analyses of returnees. Finally, as an example of the application of microanalysis, the paper tracks the online behaviour of a small number of IP numbers.
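The segmentation idea in the abstract, restricting a geographical analysis to academic users, might be sketched as follows. This is an illustration only and not the authors' procedure: it assumes IP numbers have already been reverse-resolved to hostnames, it treats a hostname as academic when its second-level label is `ac` or `edu` under a two-letter country code, and the function names and sample hostnames are hypothetical.

```python
from collections import Counter

# Assumption: "ac" and "edu" second-level labels mark academic hosts
# (for example example.ac.uk or example.edu.au). Real registries vary.
ACADEMIC_LABELS = {"ac", "edu"}


def academic_country(hostname):
    """Return the two-letter country label for an academic host, else None."""
    labels = hostname.lower().rstrip(".").split(".")
    if (
        len(labels) >= 3
        and labels[-2] in ACADEMIC_LABELS
        and len(labels[-1]) == 2
    ):
        return labels[-1]
    return None


def segment_by_academic_country(hostnames):
    """Count log entries per country, keeping only academic hosts."""
    counts = Counter()
    for host in hostnames:
        country = academic_country(host)
        if country is not None:
            counts[country] += 1
    return counts


# Hypothetical resolved hostnames, one per log entry.
sample_hosts = [
    "pc12.lib.example.ac.uk",
    "proxy.example.com",
    "dialup-7.example.net",
    "lab3.example.edu.au",
    "ws4.example.ac.uk",
]

if __name__ == "__main__":
    print(segment_by_academic_country(sample_hosts))
```

Hosts under generic domains such as `.com` and `.net` are dropped from the count, since their registration country says little about where the user is located.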

Publisher

SAGE Publications

Subject

Library and Information Sciences, Information Systems

Cited by 22 articles.
