Abstract
Despite the growth in digital text collections, the ability to retrieve words or phrases with specific attributes remains limited; for example, it is difficult to retrieve words with a specific meaning within a specific section of a text. Many systems rely on coarse bibliographic metadata. Fine-grained retrieval requires that texts be encoded with granular metadata. In this study, sample texts were encoded using five categories of metadata that capture additional data about texts: morphological, syntactic, semantic, functional and bibliographic. A prototype was developed to parse the encoded texts and store the information in a database, and was then used to test the extent to which words or phrases with specific attributes could be retrieved. The prototype made retrieval on a detailed level possible: retrieval using each of the five categories of metadata was demonstrated, as were advanced searches that combine metadata from different categories in a single search. This article demonstrates that encoding texts with granular metadata improves retrieval: relevant information can be selected, and irrelevant information excluded, even within a single text.
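The idea of combining metadata from several categories in one search can be illustrated with a minimal sketch. It is not the prototype described above: the table layout, the attribute values (such as "financial-institution" or "chapter-2") and the function names `build_index` and `search` are all hypothetical, chosen only to show how a word with a specific meaning might be retrieved from a specific section of a text.

```python
import sqlite3

# Hypothetical token records: one row per word, carrying one illustrative
# attribute for each of the five metadata categories named in the abstract.
# (word, morphological, syntactic, semantic, functional, bibliographic)
TOKENS = [
    ("bank", "noun-singular", "subject", "financial-institution", "body", "chapter-1"),
    ("bank", "noun-singular", "object", "river-edge", "body", "chapter-2"),
    ("bank", "noun-singular", "object", "financial-institution", "footnote", "chapter-2"),
    ("banks", "verb-present", "predicate", "to-tilt", "body", "chapter-2"),
]

# Columns that may be used as search criteria (assumed schema).
SEARCHABLE = {"word", "morphological", "syntactic", "semantic",
              "functional", "bibliographic"}


def build_index(tokens):
    """Store the encoded tokens in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE token ("
        " position INTEGER PRIMARY KEY,"  # running position of the word
        " word TEXT, morphological TEXT, syntactic TEXT,"
        " semantic TEXT, functional TEXT, bibliographic TEXT)"
    )
    conn.executemany(
        "INSERT INTO token (word, morphological, syntactic, semantic,"
        " functional, bibliographic) VALUES (?, ?, ?, ?, ?, ?)",
        tokens,
    )
    return conn


def search(conn, **criteria):
    """Return (position, word) pairs that match every given criterion."""
    unknown = set(criteria) - SEARCHABLE
    if unknown:
        raise ValueError(f"unknown search attributes: {sorted(unknown)}")
    # Column names are checked against SEARCHABLE; values are bound as parameters.
    where = " AND ".join(f"{column} = ?" for column in criteria) or "1 = 1"
    query = f"SELECT position, word FROM token WHERE {where} ORDER BY position"
    return conn.execute(query, tuple(criteria.values())).fetchall()


conn = build_index(TOKENS)
# One search that combines a semantic and a bibliographic attribute.
matches = search(conn, word="bank",
                 semantic="financial-institution",
                 bibliographic="chapter-2")
print(matches)
```

Under these assumptions, a search on the word form alone returns every occurrence of "bank", whereas adding the semantic and bibliographic criteria narrows the result to the single occurrence with the intended meaning in the intended section.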
Subject
Metals and Alloys, Mechanical Engineering, Mechanics of Materials