First steps towards improving official statistics data accessibility in Mexico: Query expansion with neural networks and ad-hoc space vectors

Author:

Pimentel Alejandro¹, Díaz Oswaldo¹, Villaseñor Elio¹, Jiménez Jose-Luis²

Affiliation:

1. Instituto Nacional de Estadística y Geografía, Aguascalientes, Mexico

2. Department of Computer Science and Engineering, Universidad Carlos III de Madrid, Madrid, Spain

Abstract

Mexico’s National Institute of Statistics and Geography (INEGI) is exploring new opportunities to improve its information search service, with the aim of increasing the accessibility of official statistical data. The upgraded search engine will include a new component that offers more sophisticated search capabilities. These include the ability to conduct intelligent searches that do not require an exact match of the search text, as well as the expansion of searches using related ad-hoc terms. Additionally, the new component will provide feedback through the most appropriate relations. To achieve this, the system will utilize neural network-based distributional word representation systems to identify relationships between related terms. The vector spaces and representation will be manipulated to keep connections within the most relevant vocabulary for the institute’s type of searches. The usability testing department at the institute conducted blind pilot tests to compare the quality reported by users with and without the new enhancements. Although the evaluation survey showed significant improvements in the search engine’s performance, the tool presented is just the first step towards a system that allows continuous interaction and feedback with users to improve the quality of the responses presented. This strategy is not currently implemented by the institute, making this an immediate and easy-to-replicate approach for obtaining useful interactions with users.
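The query-expansion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the embedding values, the `DOMAIN_VOCABULARY` set, the `expand_query` function, and the `top_k` and `min_similarity` parameters are all hypothetical, chosen only to show how related terms might be ranked by cosine similarity while suggestions are kept inside a restricted, institute-relevant vocabulary.

```python
import math

# Toy 3-dimensional word vectors. These values are invented for illustration;
# a real system would load vectors trained by a neural embedding model.
EMBEDDINGS = {
    "population": [0.9, 0.1, 0.0],
    "census": [0.8, 0.2, 0.1],
    "inhabitants": [0.85, 0.15, 0.05],
    "inflation": [0.1, 0.9, 0.1],
    "prices": [0.15, 0.85, 0.2],
    "football": [0.0, 0.1, 0.9],
}

# Hypothetical domain vocabulary: only these terms may be suggested,
# which keeps expansions within the topics the search service covers.
DOMAIN_VOCABULARY = {"population", "census", "inhabitants", "inflation", "prices"}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_query(term, top_k=2, min_similarity=0.8):
    """Return up to top_k in-domain terms most similar to `term`."""
    if term not in EMBEDDINGS:
        return []  # unknown word: nothing to expand with
    query_vec = EMBEDDINGS[term]
    scored = []
    for candidate in sorted(DOMAIN_VOCABULARY):
        if candidate == term or candidate not in EMBEDDINGS:
            continue
        score = cosine(query_vec, EMBEDDINGS[candidate])
        if score >= min_similarity:
            scored.append((candidate, score))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [candidate for candidate, _ in scored[:top_k]]


if __name__ == "__main__":
    print(expand_query("population"))  # ['inhabitants', 'census']
```

With these toy vectors, a search for "population" is expanded with "inhabitants" and "census", while "football" can never be suggested because it lies outside the assumed domain vocabulary.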

Publisher

IOS Press

Subject

Statistics, Probability and Uncertainty; Economics and Econometrics; Management Information Systems
