Author:
Grabar Natalia, Zweigenbaum Pierre
Abstract
Terminology structuring has been the subject of much work in the context of terms extracted from corpora: given a set of terms, obtained from an existing resource or extracted from a corpus, it consists in identifying hierarchical (or other types of) relations between these terms. The present work aims at assessing the feasibility of such structuring by studying it on an existing hierarchically structured terminology. Our overall goal is to test various structuring methods proposed in the literature and to check how they fare on this task. The specific goal at the present stage of our work, which we report here, is focussed on lexical methods that match terms on the basis of their content words, taking morphological variants and synonyms into account. We describe experiments performed on the French version of the US National Library of Medicine MeSH thesaurus. We compare the lexically-induced relations with the original MeSH relations and measure recall and precision metrics, taking two different views on the task: relation recovery and term placement. This method proposes correct term placement for up to 26% of the MeSH concepts, and its precision can reach 58%. After this quantitative evaluation, we perform a qualitative, human analysis of the ‘new’ relations not present in the MeSH. This analysis shows, on the one hand, the limits of the lexical structuring method. On the other hand, it reveals some specific structuring choices and naming conventions made by the MeSH designers, and emphasizes ontological commitments that cannot be left to automatic structuring.
Publisher
John Benjamins Publishing Company
Subject
Library and Information Sciences, Communication, Language and Linguistics
Cited by
8 articles.