Managing the Acronym/Expansion Identification Process for Text-Mining Applications
Received: October 14, 2008  Revised: December 20, 2008
Mathieu Roche, Violaine Prince. Managing the Acronym/Expansion Identification Process for Text-Mining Applications. International Journal of Software and Informatics, 2008, 2(2):163~179
Mathieu Roche  Violaine Prince
Abstract: This paper presents an approach for extracting acronym/definition pairs from textual data (corpora) and for disambiguating these definitions (or expansions). Both steps of our global process for acquiring and managing acronyms are described in detail. The first step uses markers such as brackets to identify expansion candidates; aligning the letters of the acronym with the candidate then makes it possible to select the acronym/definition pairs. The second step determines the relevant expansion of an acronym in a given context. Our method is based on statistical measures (Mutual Information, Cubic Mutual Information, Dice Measure) and on the results provided by search engines. The paper also presents an evaluation of the global process on real data (general and specialized domains).
keywords: Web-mining  text-mining  natural language processing  BioNLP  named entity recognition  acronym  quality measures
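
The two steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the regular expression, and the rule that each acronym letter must match the initial of one preceding word are simplifying assumptions. The abstract does not give the formulas for the three measures, so the code uses common count-based definitions (the Mutual Information variants omit corpus-size normalization), and it assumes the counts would be search-engine hit counts for an expansion, a context term, and both together.

```python
import math
import re


def find_candidates(text):
    """Step 1 (assumed rule): use brackets as markers, then keep a pair only
    if the acronym letters align with the initials of the preceding words."""
    pairs = []
    for match in re.finditer(r"\(([A-Z]{2,})\)", text):
        acronym = match.group(1)
        # Words that appear before the opening bracket.
        words = re.findall(r"[A-Za-z]+", text[:match.start()])
        # Candidate expansion: as many preceding words as acronym letters.
        window = words[-len(acronym):]
        initials = "".join(word[0].upper() for word in window)
        if initials == acronym:
            pairs.append((" ".join(window), acronym))
    return pairs


# Step 2 (assumed definitions): association between an expansion x and a
# context term y, from counts n_x, n_y and the joint count n_xy.
def dice(n_x, n_y, n_xy):
    return 2 * n_xy / (n_x + n_y)


def mutual_information(n_x, n_y, n_xy):
    # Simplified form without corpus-size normalization (assumption).
    return math.log2(n_xy / (n_x * n_y))


def cubic_mutual_information(n_x, n_y, n_xy):
    # Cubing the joint count gives more weight to frequent co-occurrences.
    return math.log2(n_xy ** 3 / (n_x * n_y))
```

Under these assumptions, `find_candidates("We use Natural Language Processing (NLP) daily.")` yields the pair `("Natural Language Processing", "NLP")`, and the candidate expansion with the highest score for a given context would be selected in the disambiguation step.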

© Copyright by Institute of Software, the Chinese Academy of Sciences
