Title: From Printed Materials to Electronic Demonstrative Dictionary - the Story of the National Photocorpus of Polish and its Korean and Vietnamese Descendants
Authors: Borchmann, Lukasz/Dzienisiewicz, Daniel/Wierzchoń, Piotr
Year: 2017
Type: Article
Publisher: Lexical Computing CZ s.r.o.
Place: Brno, Czech Republic
In: Kosem, Iztok/Tiberius, Carole/Jakubíček, Miloš/Kallas, Jelena/Krek, Simon/Baisa, Vít (eds.): Electronic lexicography in the 21st century. Lexicography from scratch. Proceedings of eLex 2017 conference, 19-21 September 2017, Leiden, the Netherlands
Pages: 680-702
Languages studied: Korean - Polish - Russian
Keywords: data modelling
corpus-based lexicography
lexicographic process
machine readability
form of publication
Medium: Online
URI: https://elex.link/elex2017/proceedings-download/
Last accessed: 22.10.2018
Abstract: The most popular form of lexicographic exemplification is the plain-text transcript. Despite the undoubted advantages of this quotation method, it may be perceived as a trade-off between readability, accessibility, simplicity, accuracy, and even the logistics of a documentation project. Another approach is to gather and present excerpts in the form in which they were originally published, that is, as clippings from publications (referred to as photodocumentation). The photodocumentary technique is a distinctive feature of both the National Photocorpus of Polish and its Korean and Vietnamese descendants. The main goal of the first of these projects was to describe around 250,000 lexical units, which would be enough to outperform all of the 20th-century dictionaries of Polish. More importantly, the process was entirely corpus-driven: all of the principal lexicographic works preceding the project were intentionally ignored. As a result, the material largely contains words of which linguists were unaware or which were perceived as later neologisms under the leading derivational models of Polish. This article describes the projects from their early stages, namely the acquisition of printed materials, to the final level of development, where an electronic lexicographic tool is made available to both amateur and professional users. Also described is the struggle to avoid unthinking imitation of p-lexicographic techniques. The methodology had to be adapted to meet modern web usability standards.