Web scraping technique for producing Iranian consumer price index

Author:

Faramarzi Ayoub (1), Hadizadeh Reza (2), Fayyaz Saeed (2), Sajadimanesh Sohrab (2), Moradi Abbas (1)

Affiliation:

1. Statistical Research and Training Center, Tehran, Iran

2. Statistical Center of Iran, Tehran, Iran

Abstract

The advent of technologies such as the Internet and the World Wide Web has made data pervasive in every human and non-human activity. The result is an exponential increase in data generation, a data explosion known as Big Data. For official statistics, Big Data sources can reduce the response burden, or they can be used to study economic or social phenomena before designing a statistical survey, which is inherently expensive to pilot. Incorporating Big Data sources also helps official statistics products keep their competitive advantage and relevance against those offered by a plethora of commercial players, in particular large corporations active in information technology. In this paper, web scraping was used to extract the daily prices of food and drink products, so that the scraped prices could replace the conventionally collected prices used for price indices. Such datasets also make it possible to calculate the indices at finer time scales, such as weekly or daily, whereas the conventional approach allows only monthly calculation. Although web scraping has its own problems, it is more economical, more accurate, and less time-consuming than conventional collection, especially in urban areas. The findings show that web scraping can serve as an effective alternative to conventional methods for the CPI, and that the technique can also be used for other price statistics.
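The step from scraped daily prices to a daily index can be illustrated with a minimal sketch. The abstract does not state which index formula the authors used, so the Jevons formula (the geometric mean of price relatives) is an assumption here, chosen only because it is a common elementary index. The product identifiers, dates, and prices are hypothetical, and `jevons_index` and `daily_indices` are illustrative names, not functions from the paper.

```python
import math
from collections import defaultdict


def jevons_index(base_prices, current_prices):
    """Jevons index (base = 100): geometric mean of price relatives
    over the products observed in both periods."""
    matched = [p for p in base_prices if p in current_prices]
    if not matched:
        raise ValueError("no products matched between the two periods")
    log_sum = sum(math.log(current_prices[p] / base_prices[p]) for p in matched)
    return 100.0 * math.exp(log_sum / len(matched))


def daily_indices(records, base_day):
    """Group (day, product_id, price) records by day and compute
    one index value per day relative to base_day."""
    by_day = defaultdict(dict)
    for day, product, price in records:
        by_day[day][product] = price
    base = by_day[base_day]
    return {day: jevons_index(base, prices) for day, prices in sorted(by_day.items())}


# Hypothetical scraped observations (assumed data, not from the paper).
records = [
    ("2024-01-01", "rice_1kg", 100.0),
    ("2024-01-01", "milk_1l", 50.0),
    ("2024-01-02", "rice_1kg", 110.0),
    ("2024-01-02", "milk_1l", 50.0),
    ("2024-01-03", "rice_1kg", 121.0),
    ("2024-01-03", "milk_1l", 50.0),
]

indices = daily_indices(records, "2024-01-01")
for day, value in indices.items():
    print(day, round(value, 2))
```

Matching products between the base day and the current day keeps the index from shifting when an item disappears from the website, which is one of the practical problems scraped data tends to raise. A production CPI would also need expenditure weights and quality checks, which this sketch leaves out.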

Publisher

IOS Press

Subject

Statistics, Probability and Uncertainty; Economics and Econometrics; Management Information Systems
