Abstract
Researchers need to be able to find, access, and use data to participate in open science. To understand how users search for research data, we analyzed textual queries issued at a large social science data archive, the Inter-university Consortium for Political and Social Research (ICPSR). We collected unique user queries from 988,475 user search sessions over four years (2012-2016). Overall, we found that only 30% of site visitors entered search terms into the ICPSR website. We analyzed search strategies within these sessions by extending existing dataset search taxonomies to classify a subset of the 1,554 most popular queries. We identified five categories of commonly issued queries: keyword-based (e.g., date, place, topic); name (e.g., study, series); identifier (e.g., study, series); author (e.g., institutional, individual); and type (e.g., file, format). While the dominant search strategy used short keywords to explore topics, directed searches for known items using study and series names were also common. We further distinguished exploratory browsing from directed search queries based on their page views, refinements, search depth, duration, and length. Directed queries were longer (i.e., they had more words), while sessions with exploratory queries had more refinements and associated page views. By comparing search interactions at ICPSR to other natural language interactions in similar web search contexts, we conclude that dataset search at ICPSR is underutilized. We envision how alternative search paradigms, such as those enabled by recommender systems, can enhance dataset search.
Publisher
University of Alberta Libraries