Title: |
The People's Web meets Linguistic Knowledge: Automatic Sense Alignment of Wikipedia and WordNet |
Authors: | Niemann, Elisabeth/Gurevych, Iryna |
Year: |
2011 |
Type: |
Article |
In: |
Bos, Johan/Pulman, Stephen (eds.): Proceedings of the Ninth International Conference on Computational Semantics (IWCS), Oxford, UK, 12-14 January 2011 |
Pages: |
205-214 |
Languages studied: |
English |
Keywords: |
database
data modelling
internet lexicography/online lexicography
user contribution
|
Medium: |
Online |
URI: |
https://www.informatik.tu-darmstadt.de/media/ukp/data/fileupload_2/lexical_resources/iwcs2011_EW_cameraready.pdf |
Last visited: |
19.10.2020 |
Abstract: |
We propose a method to automatically align WordNet synsets and Wikipedia articles to obtain a sense inventory of higher coverage and quality. For each WordNet synset, we first extract a set of Wikipedia articles as alignment candidates; in a second step, we determine which article (if any) is a valid alignment, i.e. is about the same sense or concept. In this paper, we go significantly beyond state-of-the-art word overlap approaches, and apply a threshold-based Personalized PageRank method for the disambiguation step. We show that WordNet synsets can be aligned to Wikipedia articles with a performance of up to 0.78 F1-Measure based on a comprehensive, well-balanced reference dataset consisting of 1,815 manually annotated sense alignment candidates. The fully-aligned resource as well as the reference dataset is publicly available. |