In:
Terminology, John Benjamins Publishing Company, Vol. 6, No. 2 (2000-12-31), p. 211-232
Abstract:
This article describes a method for extracting terms that combines term frequency with a novel measure of term representativeness (i.e., informativeness or domain specificity). The measure is defined as the normalized distance between the word distribution in the documents which contain the term and the word distribution in the whole corpus. The measure is particularly effective in discarding uninformative terms that frequently appear and has a well-defined threshold value for judging the representativeness of a term. We combined the new measure with term frequency and applied it to the extraction of terms from abstracts of artificial intelligence papers. This article introduces the measure and reports on its effectiveness in term extraction.
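The abstract does not say which distance or which normalization the measure uses, so the following is only a minimal sketch of one possible reading, not the article's own method. It assumes Kullback-Leibler divergence as the distance and normalizes by the mean distance of randomly drawn document sets of the same size. The names `word_distribution`, `kl_divergence`, and `representativeness`, and the parameters `n_samples` and `seed`, are hypothetical and chosen for illustration.

```python
import math
import random
from collections import Counter


def word_distribution(docs):
    """Relative word frequencies over a list of tokenized documents."""
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def kl_divergence(p, q):
    """KL divergence D(p || q); assumes every word in p also occurs in q."""
    return sum(pv * math.log(pv / q[word]) for word, pv in p.items())


def representativeness(term, corpus, n_samples=200, seed=0):
    """Distance between the word distribution of the documents containing
    `term` and that of the whole corpus, divided by the mean distance of
    random document sets of the same size (assumed normalization)."""
    containing = [doc for doc in corpus if term in doc]
    if not containing:
        return 0.0
    corpus_dist = word_distribution(corpus)
    distance = kl_divergence(word_distribution(containing), corpus_dist)

    # Baseline: how far random subsets of the same size drift from the corpus.
    rng = random.Random(seed)
    baseline = [
        kl_divergence(
            word_distribution(rng.sample(corpus, len(containing))), corpus_dist
        )
        for _ in range(n_samples)
    ]
    mean_baseline = sum(baseline) / n_samples
    return distance / mean_baseline if mean_baseline > 0 else 0.0


# Toy corpus of tokenized "abstracts" (made up for illustration).
corpus = [
    ["the", "neural", "network", "learns", "weights"],
    ["the", "neural", "network", "uses", "backpropagation"],
    ["the", "parser", "builds", "a", "tree"],
    ["the", "planner", "searches", "a", "graph"],
    ["the", "robot", "moves", "an", "arm"],
    ["the", "logic", "proves", "a", "theorem"],
]

score_the = representativeness("the", corpus)
score_neural = representativeness("neural", corpus)
```

In this toy corpus "the" occurs in every document, so its documents are the whole corpus and its score is zero, while "neural" scores higher because the documents containing it share vocabulary that is rare elsewhere. Combining such a score with term frequency, as the abstract describes, would be a separate step that this sketch leaves out.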
Type of Medium:
Online Resource
ISSN:
0929-9971, 1569-9994
DOI:
10.1075/term.6.2.06his
Language:
English
Publisher:
John Benjamins Publishing Company
Publication Date:
2000
ZDB-ID:
2053905-8
SSG:
24,1
SSG:
24
SSG:
7,11