Unsupervised embedding of trajectories captures the latent structure of scientific migration

Author:

Murray Dakota (1, 2), Yoon Jisung (1, 3, 4), Kojaku Sadamori (1), Costas Rodrigo (5, 6), Jung Woo-Sung (7, 8), Milojević Staša (1), Ahn Yong-Yeol (1)

Affiliation:

1. Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47408

2. Network Science Institute at Northeastern University, Boston, MA 02115

3. Kellogg School of Management & Organizations at Northwestern University, Evanston, IL 60208

4. Northwestern Institute on Complex Systems, Evanston, IL 60208

5. Centre for Science and Technology Studies, Leiden University, 2300 AX Leiden, The Netherlands

6. Centre of Excellence in Scientometrics and Science, Technology and Innovation Policy, Stellenbosch University, Stellenbosch 7600, South Africa

7. Department of Physics, Pohang University of Science and Technology, Pohang 37673, South Korea

8. Department of Industrial and Management Engineering, Pohang University of Science and Technology, Pohang 37673, South Korea

Abstract

Human migration and mobility drive major societal phenomena, including epidemics, economies, innovation, and the diffusion of ideas. Although human mobility and migration have been heavily constrained by geographic distance throughout history, advances and globalization are making other factors, such as language and culture, increasingly important. Advances in neural embedding models, originally designed for natural language, provide an opportunity to tame this complexity and open new avenues for the study of migration. Here, we demonstrate the ability of the word2vec model to encode nuanced relationships between discrete locations from migration trajectories, producing an accurate, dense, continuous, and meaningful vector-space representation. The resulting representation provides a functional distance between locations, as well as a “digital double” that can be distributed, reused, and itself interrogated to understand the many dimensions of migration. We show that the unique power of word2vec to encode migration patterns stems from its mathematical equivalence with the gravity model of mobility. Focusing on the case of scientific migration, we apply word2vec to a database of three million migration trajectories of scientists derived from the affiliations listed on their publication records. Using techniques that leverage its semantic structure, we demonstrate that embeddings can learn the rich structure that underpins scientific migration, such as cultural, linguistic, and prestige relationships at multiple levels of granularity. Our results provide a theoretical foundation and methodological framework for using neural embeddings to represent and understand migration both within and beyond science.
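As a rough illustration of how migration trajectories can play the role of sentences for word2vec, the sketch below turns a few toy trajectories into the (center, context) co-occurrence pairs that a skip-gram model is trained on. The institution labels, the window size, and the helper name `skipgram_pairs` are assumptions made for this example; the abstract does not specify the preprocessing or hyperparameters used in the paper.

```python
from collections import Counter


def skipgram_pairs(trajectory, window=1):
    """Yield (center, context) pairs from one trajectory.

    Each location is paired with every other location that lies
    within `window` positions of it in the same trajectory.
    """
    for i, center in enumerate(trajectory):
        lo = max(0, i - window)
        hi = min(len(trajectory), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, trajectory[j]


# Hypothetical toy data: each list is the ordered sequence of
# institutions one scientist was affiliated with.
trajectories = [
    ["inst_A", "inst_B", "inst_C"],
    ["inst_B", "inst_C"],
    ["inst_A", "inst_B"],
]

# Count how often each ordered pair of locations co-occurs.
pair_counts = Counter(
    pair
    for trajectory in trajectories
    for pair in skipgram_pairs(trajectory, window=1)
)

for (center, context), count in sorted(pair_counts.items()):
    print(center, context, count)
```

Locations that co-occur often in such pairs are the ones an embedding model would tend to place close together, which is the intuition behind reading distances in the learned vector space as a functional distance between locations.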

Funder

DOD | USAF | AMC | Air Force Office of Scientific Research

National Science Foundation

Publisher

Proceedings of the National Academy of Sciences

Subject

Multidisciplinary

