Web scraping for price statistics in the Philippines

Author:

Albis Manuel Leonard F. (1), Romasoc Sabrina O. (2), Pelayo Shushimita G. (2), Gavira Bea Andrea C. (2), Asombrado Jazzen Paul J. (2)

Affiliation:

1. School of Statistics, University of the Philippines, Diliman, Quezon City, Metro Manila, Philippines

2. Philippine Statistical Research and Training Institute, Quezon City, Metro Manila, Philippines

Abstract

Official price statistics in the Philippines are sourced mainly from regular surveys and censuses, which entail high costs. As businesses move onto digital platforms, alternatives to these traditional data sources have become more available. One such alternative is web scraping, the process of collecting information from the web. As online platforms are increasingly used for commerce, web scraping offers a way to increase the frequency of data collection while reducing its cost relative to price surveys. This paper surveys the experiences of various government statistical agencies in using web scraping for the Consumer Price Index (CPI). It also details the Philippines’ experience of using web-scraped data to estimate the food and alcoholic beverages CPI of the National Capital Region, and compares that estimate with the official CPI estimate of the Philippine Statistics Authority. Finally, it discusses the challenges encountered and recommendations for enhancing the approach.

Publisher

IOS Press

Subject

Statistics, Probability and Uncertainty; Economics and Econometrics; Management Information Systems

