Title: The People's Web meets Linguistic Knowledge: Automatic Sense Alignment of Wikipedia and WordNet
Authors: Niemann, Elisabeth/Gurevych, Iryna
Year: 2011
Type: Article
In: Bos, Johan/Pulman, Stephen (eds.): Proceedings of the Ninth International Conference on Computational Semantics (IWCS), Oxford, UK, 12-14 January 2011
Pages: 205-214
Languages studied: English
Keywords: database
data modelling
internet lexicography/online lexicography
user contribution
Medium: Online
URI: https://www.informatik.tu-darmstadt.de/media/ukp/data/fileupload_2/lexical_resources/iwcs2011_EW_cameraready.pdf
Last accessed: 19 October 2020
Abstract: We propose a method to automatically align WordNet synsets and Wikipedia articles to obtain a sense inventory of higher coverage and quality. For each WordNet synset, we first extract a set of Wikipedia articles as alignment candidates; in a second step, we determine which article (if any) is a valid alignment, i.e. is about the same sense or concept. In this paper, we go significantly beyond state-of-the-art word overlap approaches, and apply a threshold-based Personalized PageRank method for the disambiguation step. We show that WordNet synsets can be aligned to Wikipedia articles with a performance of up to 0.78 F1-Measure based on a comprehensive, well-balanced reference dataset consisting of 1,815 manually annotated sense alignment candidates. The fully-aligned resource as well as the reference dataset is publicly available.
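Note: The threshold-based decision described in the abstract (pick the best candidate article, or none if no candidate scores high enough) can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in a plain bag-of-words cosine similarity in place of Personalized PageRank, and the function names, example texts, and threshold value are all hypothetical.

```python
from collections import Counter
from math import sqrt


def bag_of_words(text):
    """Lowercase, whitespace-tokenised term counts (a deliberately crude stand-in)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def align(synset_gloss, candidates, threshold):
    """Return the title of the best-scoring candidate article, or None.

    candidates maps article titles to article text. None is returned when no
    candidate reaches the threshold, mirroring the "(if any)" in the abstract.
    """
    query = bag_of_words(synset_gloss)
    best_title, best_score = None, 0.0
    for title, text in candidates.items():
        score = cosine(query, bag_of_words(text))
        if score > best_score:
            best_title, best_score = title, score
    return best_title if best_score >= threshold else None


# Hypothetical example data, not taken from the paper's dataset.
gloss = "a financial institution that accepts deposits"
candidates = {
    "Bank": "a bank is a financial institution that accepts deposits from the public",
    "Bank (geography)": "the land alongside a river or stream",
}
print(align(gloss, candidates, threshold=0.3))
```

Raising the threshold trades recall for precision: with a sufficiently high value, the function returns None even for the closest candidate, which is how a synset with no matching article would be left unaligned.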