Title: The Development of a Network Thesaurus with Morpho-semantic Word Markups
Authors: Orešković, Marko/Čubrilo, Mirko/Essert, Mario
Year: 2016
Type: Article
Publisher: Ivane Javakhishvili Tbilisi State University
Place: Tbilisi
In: Margalitadze, Tinatin/Meladze, George (eds.): Proceedings of the 17th EURALEX International Congress: Lexicography and Linguistic Diversity. Tbilisi, Georgia, 6-10 September 2016
Pages: 273-279
Languages studied: Croatian
Keywords: database
corpus-based lexicography
lexicographic editor
semantic/sense relations in dictionaries
cross-references
vocabulary
Medium: Online
URI: http://euralex.org/category/publications/euralex-2016/
Last accessed: 22.10.2018
Abstract: This paper presents one part of a network framework for Croatian linguistics: a new kind of thesaurus based on the morpho-semantic features of words. Instead of classic POS tagging (e.g. MULTEXT-East), which marks words for grammatical and some semantic categories (e.g. animate), every word here has its own hierarchical T-structure whose branches can hold various data types (string, integer, link, word list, ordered word list, etc.), so that words and the various ways they can occur in a text are described more precisely. Moreover, WordNet and other semantic resources (e.g. the Croatian Language Portal, a terminology repository or a network encyclopedia) can be represented as T-structure nodes in the same way. During this process each word in the definition of an entry is linked to a lexicon, which increases the semantic connectivity of words by at least one order of magnitude (about ten times more semantic relations). Searching and browsing such a network dictionary gains a new dimension: besides their paradigmatic properties, words in the dictionary also possess all their syntagmatic properties, because the computer processes their appearance in any utterance or sentence as a series of connected nodes (LOD objects). This makes it possible to store all the data in a triplestore (e.g. on a Virtuoso server).
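
Illustration: the abstract does not give the internal layout of a T-structure, so the following is only a minimal Python sketch of how such a structure might look, under stated assumptions. The class names (`TNode`, `Link`), the branch names (`morphology`, `semantics`, ...), the mapping of "word list" to `frozenset` and "ordered word list" to `tuple`, and the sample values for the entry "kuća" are all hypothetical and not taken from the paper. The sketch shows a tree whose leaves hold typed values and a flattening step that turns each leaf into a (subject, predicate, object) triple of the kind a triplestore could hold.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional, Union


@dataclass(frozen=True)
class Link:
    """Hypothetical reference to another lexicon entry."""
    target: str


# Leaf types named in the abstract; the Python types chosen are assumptions:
# string -> str, integer -> int, link -> Link,
# word list -> frozenset, ordered word list -> tuple.
Value = Union[str, int, Link, frozenset, tuple]


@dataclass
class TNode:
    """One node of a hierarchical T-structure (illustrative only)."""
    name: str
    value: Optional[Value] = None
    children: list["TNode"] = field(default_factory=list)

    def add(self, child: "TNode") -> "TNode":
        """Attach a child branch and return it, so calls can be chained."""
        self.children.append(child)
        return child


def to_triples(subject: str, node: TNode, path: tuple = ()) -> Iterator[tuple]:
    """Yield one (subject, predicate, object) triple per leaf value.

    The predicate is the slash-joined path of branch names down to the leaf.
    """
    path = path + (node.name,)
    if node.value is not None:
        yield (subject, "/".join(path), node.value)
    for child in node.children:
        yield from to_triples(subject, child, path)


def entry_triples(entry: TNode) -> Iterator[tuple]:
    """Flatten a whole entry; the root name serves as the triple subject."""
    for branch in entry.children:
        yield from to_triples(entry.name, branch)


# Hypothetical entry for the Croatian noun "kuća" (house).
root = TNode("kuća")
morph = root.add(TNode("morphology"))
morph.add(TNode("pos", "noun"))
morph.add(TNode("syllables", 2))
sem = root.add(TNode("semantics"))
sem.add(TNode("hypernym", Link("zgrada")))
sem.add(TNode("synonyms", frozenset({"dom"})))

triples = list(entry_triples(root))
```

In this sketch a `Link` value is what would connect one entry to another; a real deployment would presumably use IRIs and an RDF serialization rather than plain tuples, which the abstract does not specify.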