Migrating 120,000 Legacy Publications from Several Systems into a Current Research Information System Using Advanced Data Wrangling Techniques

Author:

Lappalainen Yrjö (1), Lassila Matti (2), Heikkilä Tanja (3), Nieminen Jani (2), Lehtilä Tapani (2)

Affiliation:

1. Library and Learning Commons, Zayed University, Dubai P.O. Box 19282, United Arab Emirates

2. Tampere University Library, Tampere University, 33014 Tampere, Finland

3. Finnish Geospatial Research Institute (FGI), National Land Survey of Finland (NLS), 02150 Espoo, Finland

Abstract

This article describes a complex CRIS (current research information system) implementation project involving the migration of around 120,000 legacy publication records from three different systems. The project, undertaken by Tampere University, encountered several challenges in data diversity, data quality, and resource allocation. To handle the extensive and heterogeneous dataset, innovative approaches such as machine learning techniques and various data wrangling tools were used to process data, correct errors, and merge information from different sources. Despite significant delays and unforeseen obstacles, the project was ultimately successful in achieving its goals. The project served as a valuable learning experience, highlighting the importance of data quality and standardized practices, and the need for dedicated resources in handling complex data migration projects in research organizations. This study stands out for its comprehensive documentation of the data wrangling and migration process, which has been less explored in the context of CRIS literature.

Publisher

MDPI AG

Subject

Computer Science Applications, Media Technology, Communication, Business and International Management, Library and Information Sciences

