Title: Discovering Automated Lexicography: The Case of the Slovene Lexical Database
Authors: Gantar, Polona/Kosem, Iztok/Krek, Simon
Year: 2016
Type: Article
Journal: International Journal of Lexicography
Pages: 200-225
Volume: 29
Issue: 2
Languages studied: Slovenian
Keywords: database
data modelling
corpus-based lexicography
lexicographic process
lexicographic editor
Abstract: In this paper, we describe the compilation of the Slovene Lexical Database, with the main focus on developing a methodology to improve the tools used for lexicographic analysis and to introduce automatic data extraction into the lexicographic process. The semi-automated approach, which was devised in the last stages of database compilation, involved extracting corpus data (grammatical relations, collocations, examples, and grammatical labels) and conducting lexicographic analysis in the dictionary-writing system rather than in the corpus tool. An evaluation comparing the manual and semi-automated approaches showed that the semi-automated approach is much quicker and presents lexicographers with almost all the information they identified as relevant during manual analysis, as well as additional potentially relevant information for the dictionary entry. The final section of the paper proposes a few avenues for improving the semi-automated approach, including the implementation of crowdsourcing and additional post-processing of automatically extracted data.