Title: A co-occurrence taxonomy from a general language corpus
Authors: Nazar, Rogelio/Renau, Irene
Year: 2012
Type: Article
Publisher: Universitetet i Oslo, Institutt for lingvistiske og nordiske studier
Place: Oslo
In: Fjeld, Ruth V./Torjusen, Julie M. (eds.): Proceedings of the 15th EURALEX International Congress 2012, Oslo, Norway, 7-11 August 2012
Pages: 367-375
Languages studied: Spanish
Keywords: collocation analysis
corpus-based lexicography
semantic/sense relations in dictionaries
Medium: Online
URI: http://euralex.org/category/publications/euralex-oslo-2012/
Last accessed: 17.09.2018
Abstract: This paper presents a quantitative approach to the generation of a taxonomy of general language. The methodology is based on statistics of word co-occurrence and it exploits the fact that word association is asymmetrical in nature, in much the same way as hyperonymy relations are. Words tend to be syntagmatically associated with their hyperonyms, though this is not true the other way round. Taking advantage of this phenomenon, and with the help of directed graphs of word co-occurrence, we were able to collect hyperonym-hyponym pairs using a reference corpus of general language as the only source of information, i.e., without using lexico-syntactic patterns or any kind of pre-existing semantic resources such as dictionaries, ontologies or thesauri. The results obtained by using this method are not precise enough to be used for immediate practical purposes, but they confirm the hypothesis that as a general rule hyperonymy is linked to asymmetric co-occurrence relations. The paper discusses an experiment in Spanish, but we believe the same conclusions apply to other languages as well.
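Illustration: the abstract's central idea, that a word co-occurs with its hyperonym more reliably than the hyperonym co-occurs with that word, can be sketched in a few lines of Python. This is not the authors' implementation. The function name `directed_cooccurrence`, the conditional-probability measure, the `min_ratio` threshold and the three-context English toy input are all assumptions made for illustration; the paper itself works on a Spanish reference corpus and may use a different association statistic.

```python
from collections import Counter
from itertools import combinations


def directed_cooccurrence(contexts, min_ratio=2.0):
    """Return directed edges (word, candidate_hyperonym, p_forward, p_backward).

    contexts: iterable of token lists (e.g. sentences or text windows).
    min_ratio: hypothetical threshold; an edge a -> b is kept only when
               P(b | a) is at least min_ratio times P(a | b).
    """
    context_freq = Counter()  # number of contexts containing each word
    pair_freq = Counter()     # number of contexts containing both words

    for context in contexts:
        words = set(context)
        context_freq.update(words)
        for a, b in combinations(sorted(words), 2):
            pair_freq[(a, b)] += 1

    edges = []
    for (a, b), n in pair_freq.items():
        p_b_given_a = n / context_freq[a]  # how often a appears with b
        p_a_given_b = n / context_freq[b]  # how often b appears with a
        if p_b_given_a >= min_ratio * p_a_given_b:
            edges.append((a, b, p_b_given_a, p_a_given_b))
        elif p_a_given_b >= min_ratio * p_b_given_a:
            edges.append((b, a, p_a_given_b, p_b_given_a))
    return edges


# Invented toy contexts, not data from the paper.
toy_contexts = [["trout", "fish"], ["salmon", "fish"], ["shark", "fish"]]
for word, hyperonym, forward, backward in directed_cooccurrence(toy_contexts):
    print(f"{word} -> {hyperonym}  P={forward:.2f} vs {backward:.2f}")
```

In this toy input every specific term always appears with "fish", while "fish" appears with each specific term only a third of the time, so each edge points from the specific term to the general one. On real corpus data, asymmetry of this kind also arises for non-hyperonymic pairs, which is consistent with the abstract's caveat that the results are not yet precise enough for practical use.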