Complementary and Integrative Health Information in the literature: its lexicon and named entity recognition

Author:

Zhou Huixue 1, Austin Robin 2, Lu Sheng-Chieh 3, Silverman Greg Marc 4, Zhou Yuqi 1,5, Kilicoglu Halil 6, Xu Hua 7, Zhang Rui 4

Affiliation:

1. Institute for Health Informatics, University of Minnesota, Minneapolis, MN, United States

2. School of Nursing, University of Minnesota, Minneapolis, MN, United States

3. Department of Symptom Research, The University of Texas MD Anderson Cancer Center, Houston, TX, United States

4. Department of Surgery, University of Minnesota, Minneapolis, MN, United States

5. Department of Pharmaceutical Care & Health Systems, University of Minnesota, Minneapolis, MN, United States

6. School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL, United States

7. Section of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, CT, United States

Abstract

Objective: To construct an exhaustive Complementary and Integrative Health (CIH) Lexicon (CIHLex) that better represents physical and psychological CIH approaches, which are often underrepresented in standard terminologies, and to apply state-of-the-art natural language processing (NLP) techniques to recognize them in the biomedical literature.

Materials and methods: We constructed CIHLex by compiling and integrating data from the biomedical literature and relevant knowledge sources. The lexicon encompasses 724 unique concepts with 885 corresponding unique terms. We matched these concepts to the Unified Medical Language System (UMLS), and we developed BERT models and compared their efficiency in CIH named entity recognition with that of well-established models, including MetaMap and CLAMP, as well as the large language model GPT3.5-turbo.

Results: Of the 724 unique concepts in CIHLex, 27.2% could be matched to at least one term in the UMLS. About 74.9% of the mapped UMLS Concept Unique Identifiers were categorized as “Therapeutic or Preventive Procedure.” Among the models applied to CIH named entity recognition, BLUEBERT delivered the highest macro-average F1-score, 0.91.

Conclusion: CIHLex significantly augments the representation of CIH approaches in the biomedical literature. BERT-based models notably excelled in CIH entity recognition, demonstrating the utility of advanced NLP models. These results highlight promising strategies for enhancing standardization and recognition of CIH terminology in biomedical contexts.
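For readers unfamiliar with the metric, the macro-average F1-score mentioned in the abstract can be illustrated with a minimal sketch. It assumes each recognized entity has already been aligned with a gold-standard entity and carries a single type label. The function name `macro_f1` and the labels `physical` and `psychological` are hypothetical, chosen only for illustration. The abstract does not describe the paper's actual evaluation protocol, so this is not a reproduction of it.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Macro-averaged F1: the unweighted mean of per-label F1 scores.

    Assumption: gold[i] and pred[i] are the labels of the same aligned item.
    """
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed

    scores = []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical labels for four aligned entity mentions.
gold = ["physical", "physical", "psychological", "psychological"]
pred = ["physical", "psychological", "psychological", "psychological"]
print(round(macro_f1(gold, pred), 4))
```

Because every label contributes equally to the mean, a rare entity type influences the macro average as much as a frequent one, which is one reason the metric is often preferred when classes are imbalanced.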

Funder

National Center for Complementary and Integrative Health

National Institute on Aging

National Institutes of Health

Publisher

Oxford University Press (OUP)

Subject

Health Informatics


Cited by 5 articles.
