Title: A Limburgish Corpus Dictionary: Digital Solutions for the Lexicography of a Non-standardized Regional Language
Authors: Michielsen-Tallman, Yuri/Lugli, Ligeia/Schuler, Michael
Year: 2017
Type: Article
Publisher: Lexical Computing CZ s.r.o.
Place: Brno, Czech Republic
In: Kosem, Iztok/Tiberius, Carole/Jakubíček, Miloš/Kallas, Jelena/Krek, Simon/Baisa, Vít (eds.): Electronic lexicography in the 21st century. Lexicography from scratch. Proceedings of eLex 2017 conference, 19-21 September 2017, Leiden, the Netherlands
Pages: 355-376
Languages studied: English - Dutch
Keywords: internet lexicography/online lexicography
corpus-based lexicography
lemmatisation
microstructure
orthography/spelling information in dictionaries
Medium: Online
URI: https://elex.link/elex2017/proceedings-download/
Last accessed: 22.10.2018
Abstract: This paper presents the Limburgish Corpus Dictionary (LCD), a newly-started project at Maastricht University that aims to create an online corpus and dictionary of Limburgish from scratch. Limburgish comprises a set of West Germanic dialects spoken in the Dutch and Belgian provinces of Limburg. Due to a variety of factors, including its history and geographic spread, Limburgish exhibits an extremely high degree of spelling variation. In conformity with current policies, our dictionary strives to give equal visibility to all local dialects and variant spellings, with a view to enabling users to search for and retrieve lexical entries using their preferred spelling of a lemma. After a brief outline of the Limburgish language, the history of writing in Limburgish, and Limburgish lexicography, this paper presents the dynamic and multi-layered entry structure that we have devised to represent information about spelling variation. Subsequently, it discusses how our lexicographic model impacts the way we prepare our corpus for analysis. It concludes with a description of our tentative corpus-processing pipeline and the results of some initial NLP software testing.