EEMCS EPrints Service
Trieschnigg, R.B. and Nguyen, Dong-Phuong and Theune, M. (2013) Learning to extract folktale keywords. In: Proceedings of the 7th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH 2013), 8 Aug 2013, Sofia, Bulgaria. pp. 65-73. The Association for Computational Linguistics. ISBN 978-1-937284-62-6
Official URL: https://aclweb.org/anthology/W/W13/W13-2709.pdf
Manually assigned keywords provide a valuable means for accessing large document collections. They can serve as a shallow document summary and enable more efficient retrieval and aggregation of information. In this paper we investigate keywords in the context of the Dutch Folktale Database, a large collection of stories including fairy tales, jokes and urban legends. We carry out a quantitative and qualitative analysis of the keywords in the collection. Up to 80% of the assigned keywords (or a minor variation) appear in the text itself. Human annotators show moderate to substantial agreement in their judgment of keywords. Finally, we evaluate a learning-to-rank approach to extract and rank keyword candidates. We conclude that this is a promising approach to automate this time-intensive task.