EEMCS

Home > Publications
Home University of Twente
Education
Research
Prospective Students
Jobs
Publications
Intranet (internal)
 
 Nederlands
 Contact
 Sitemap
 Search
 Organisation

EEMCS EPrints Service


9488 Multilingual domain modeling in Twenty-One: automatic creation of a bi-directional translation lexicon from a parallel corpus
Home Policy Brochure Browse Search User Area Contact Help

Hiemstra, D. (1998) Multilingual domain modeling in Twenty-One: automatic creation of a bi-directional translation lexicon from a parallel corpus. In: Computational Linguistics in the Netherlands, Eighth CLIN Meeting 1997, Dec 1997, Nijmegen. pp. 41-58. Language and computers 25. Rodopi. ISBN 90-420-0504-1

Full text available as:

PDF
- Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
80 Kb

Abstract

Within the project Twenty-One, which aims at the effective dissemination of information on ecology and sustainable development, a sytem is developed that supports cross-language information retrieval in any of the four languages Dutch, English, French and German. Knowledge of this application domain is needed to enhance existing translation resources for the purpose of lexical disambiguation. This paper describes an algorithm for the automated acquisition of a translation lexicon from a parallel corpus. New about the presented algorithm is the statistical language model used. Because the algorithm is based on a symmetric translation model it becomes possible to identify one-to-many and many-to-one relations between words of a language pair. We claim that the presented method has two advantages over algorithms that have been published before. Firstly, because the translation model is more powerful, the resulting bilingual lexicon will be more accurate. Secondly, the resulting bilingual lexicon can be used to translate in both directions between a language pair. Different versions of the algorithm were evaluated on the Dutch and English version of the Agenda 21 corpus, which is a UN document on the application domain of sustainable development.

Item Type:Conference or Workshop Paper (Full Paper, Talk)
Research Group:EWI-HMI: Human Media Interaction
Research Project:Twenty-One: Telematics Applications Programme
ID Code:9488
Status:Published
Deposited On:26 February 2007
Refereed:Yes
International:No
More Information:statistics

Export this item as:

To correct this item please ask your editor

Repository Staff Only: edit this item