Abstract
Identifying, from a corpus, the lexical units that describe a given specialized domain is usually a complex task, for which an analysis based on word frequency and the likelihood of lexical associations is often ineffective. The goal of this article is to demonstrate that a user-adjustable composite metric such as SRC can accommodate the diversity of domain-specific glossaries to be constructed from small- and medium-sized specialized corpora of unstructured texts. Unlike most research in automatic term extraction, where single metrics are usually combined indiscriminately to produce the best results, SRC is grounded in the theoretical principles of salience, relevance and cohesion, which have been rationally implemented in the three components of this metric.
Publisher
John Benjamins Publishing Company
Subject
Library and Information Sciences, Communication, Language and Linguistics
Cited by
4 articles.