Title: |
A Limburgish Corpus Dictionary: Digital Solutions for the Lexicography of a Non-standardized Regional Language |
Authors: | Michielsen-Tallman, Yuri/Lugli, Ligeia/Schuler, Michael |
Year: |
2017 |
Type: |
Article |
Publisher: |
Lexical Computing CZ s.r.o. |
Place of publication: |
Brno, Czech Republic |
In: |
Kosem, Iztok/Tiberius, Carole/Jakubíček, Miloš/Kallas, Jelena/Krek, Simon/Baisa, Vít (eds.): Electronic lexicography in the 21st century. Lexicography from scratch. Proceedings of eLex 2017 conference, 19-21 September 2017, Leiden, the Netherlands |
Pages: |
355-376 |
Languages studied: |
English - Dutch |
Keywords: |
internet lexicography/online lexicography
corpus-based lexicography
lemmatisation
microstructure
orthography/spelling information in dictionaries
|
Medium: |
Online |
URI: |
https://elex.link/elex2017/proceedings-download/ |
Last accessed: |
22.10.2018 |
Abstract: |
This paper presents the Limburgish Corpus Dictionary (LCD), a newly started project at
Maastricht University that aims to create an online corpus and dictionary of Limburgish
from scratch.
Limburgish comprises a set of West Germanic dialects spoken in the Dutch and Belgian
provinces of Limburg. Due to a variety of factors, including its history and geographic
spread, Limburgish exhibits an extremely high degree of spelling variation. In conformity
with current policies, our dictionary strives to give equal visibility to all local dialects and
variant spellings, with a view to enabling users to search for and retrieve lexical entries using
their preferred spelling of a lemma.
After a brief outline of the Limburgish language, the history of writing in Limburgish, and
Limburgish lexicography, this paper presents the dynamic and multi-layered entry structure
that we have devised to represent information about spelling variation. Subsequently, it
discusses how our lexicographic model impacts the way we prepare our corpus for analysis. It
concludes with a description of our tentative corpus-processing pipeline and the results of
some initial NLP software testing. |