Document Difficulty Aspects for Medical Practitioners: Enhancing Information Retrieval in Personalized Search Engines

Author:

Frihat Sameh 1, Beckmann Catharina Lena 2, Hartmann Eva Maria 2, Fuhr Norbert 1

Affiliation:

1. Department of Information Engineering, University of Duisburg-Essen, 47057 Duisburg, Germany

2. Department of Computer Science, University of Applied Sciences and Arts Dortmund, 44227 Dortmund, Germany

Abstract

Timely and relevant information enables clinicians to make informed decisions about patient care outcomes. However, finding relevant and understandable information in the vast medical literature is challenging. To address this problem, we aim to enable the development of search engines that meet the needs of medical practitioners by incorporating text difficulty features. We collected a dataset of 209 scientific research abstracts from different medical fields, available in both English and German. A total of 216 medical experts annotated the dataset to determine two difficulty aspects of each abstract: readability and technical level. We fine-tuned a pre-trained BERT model on our dataset to obtain a regression model that predicts these difficulty aspects of abstracts. To highlight the strength of this approach, we compared the model to readability formulas currently in use. Analysis of the dataset revealed that German abstracts are more technically complex and less readable than their English counterparts. Our baseline model showed greater efficacy than current readability formulas in predicting domain-specific readability aspects. In conclusion, incorporating these text difficulty aspects into search engines will provide healthcare professionals with reliable and efficient information retrieval tools. Additionally, the dataset can serve as a starting point for future research.

Funder

DFG Research Training Group 2535

University of Duisburg-Essen

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

