EEMCS EPrints Service

Learning to extract folktale keywords

Trieschnigg, R.B. and Nguyen, Dong-Phuong and Theune, M. (2013) Learning to extract folktale keywords. In: Proceedings of the 7th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH 2013), 8 Aug 2013, Sofia, Bulgaria. pp. 65-73. The Association for Computational Linguistics. ISBN 978-1-937284-62-6

Open Access


Manually assigned keywords provide a valuable means of accessing large document collections. They can serve as a shallow document summary and enable more efficient retrieval and aggregation of information. In this paper we investigate keywords in the context of the Dutch Folktale Database, a large collection of stories including fairy tales, jokes and urban legends. We carry out a quantitative and qualitative analysis of the keywords in the collection. Up to 80% of the assigned keywords (or a minor variation of them) appear in the text itself. Human annotators show moderate to substantial agreement in their judgment of keywords. Finally, we evaluate a learning-to-rank approach to extracting and ranking keyword candidates. We conclude that this is a promising approach to automating this time-intensive task.

Item Type: Conference or Workshop Paper (Full Paper, Talk)
Research Group: EWI-HMI: Human Media Interaction
Research Program: CTIT-NICE: Natural Interaction in Computer-mediated Environments
Research Project: FACT: Folktales As Classifiable Texts
ID Code: 23556
Deposited On: 09 September 2013
