Title: Evaluating a 12-million-Word Corpus as a Source of Dictionary Data
Author: Wójtowicz, Beata
Year: 2018
Type: Article
Journal: International Journal of Lexicography
Pages: 327-341
Volume: 31
Issue: 3
Languages studied: African languages - English - Polish
Keywords: frequency
internet lexicography/online lexicography
corpus-based lexicography
lemmatisation
Abstract: In this paper, we aim to evaluate the 12-million-word Helsinki Corpus of Swahili as a source of dictionary data used, among other things, to create the lemma list for a new Swahili-Polish dictionary. We analyse the dictionary log files in order to answer a question already asked by De Schryver et al. (2006), Koplenig et al. (2014) and Trap-Jensen (2014): whether dictionary users actually look up frequent words. However, the issue of utmost importance to us is whether a ten-thousand-item frequency list derived from a 12-million-word corpus meets the needs of a Swahili-Polish dictionary user.