Title: Evaluation of Dictionary Creating Methods for Finno-Ugric Minority Languages
Authors: Ferenczi, Zsanett/Mittelholcz, Iván/Váradi, Tamás
Year: 2018
Type: Article
Publisher: European Language Resources Association (ELRA)
Place: Miyazaki
In: Calzolari, Nicoletta/Choukri, Khalid/Cieri, Christopher/Declerck, Thierry/Goggi, Sara/Hasida, Koiti/Isahara, Hitoshi/Maegaard, Bente/Mariani, Joseph/Mazo, Hélène/Moreno, Asunción/Odijk, Jan/Piperidis, Stelios/Tokunaga, Takenobu (eds.): Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, 7 - 12 May 2018
Pages: 1989-1994
Languages studied: Latin - Minority Languages - Russian - Hungarian
Keywords: under-resourced languages/low-resource languages
lexicographic process
bilingual/multilingual lexicography
Medium: Online
URI: http://www.lrec-conf.org/proceedings/lrec2018/papers.html
Last accessed: 19.11.2018
Abstract: In this paper, we present the evaluation of several bilingual dictionary building methods applied to {Komi-Permyak, Komi-Zyrian, Hill Mari, Meadow Mari, Northern Saami, Udmurt}–{English, Finnish, Hungarian, Russian} language pairs. Since these Finno-Ugric minority languages are under-resourced and standard dictionary building methods require a large amount of pre-processed data, we had to find alternative methods. In a thorough evaluation, we compare the results for each method; the comparison confirmed our expectation that the precision of standard lexicon building methods is quite low for under-resourced languages. However, Wikipedia title pairs extracted via inter-language links and Wiktionary-based methods provided useful results. The newly created word pairs, enriched with several kinds of linguistic information, are to be deployed on the web in the framework of Wiktionary. With our dictionaries, the number of Wiktionary entries in the above-mentioned Finno-Ugric minority languages can be multiplied.