Title: |
Compatible Sketch Grammars for Comparable Corpora |
Authors: | Benko, Vladimír |
Year: |
2014 |
Type: |
Article |
Publisher: |
Institute for Specialised Communication and Multilingualism |
Place: |
Bolzano/Bozen |
In: |
Abel, Andrea/Vettori, Chiara/Ralli, Natascia: Proceedings of the 16th EURALEX International Congress: The User in Focus, Bolzano/Bozen, Italy, 15-19 July 2014 |
Pages: |
417-430 |
Languages studied: |
various |
Keywords: |
grammar in dictionaries
internet lexicography/online lexicography
corpus-based lexicography
bilingual/multilingual lexicography
|
Medium: |
Online |
URI: |
http://euralex.org/category/publications/euralex-2014/ |
Last accessed: |
22.10.2018 |
Abstract: |
Our paper describes an ongoing experiment aimed at creating a family of billion-token web corpora that could, to a large extent,
deserve the designation "comparable": the corpora are of the same size, with data gathered by crawling the web at (approximately) the same
time, containing similar web-specific domains, genres and registers, then pre-processed, filtered and deduplicated by the same tools,
morphologically annotated by (possibly) the same tagger and made available via Sketch Engine. To overcome the problem of great
differences in the existing sketch grammars for the respective languages, a set of "compatible" sketch grammars has been written that
will aid contrastive linguistic research and bilingual lexicographic projects. The sketch grammars use a uniform set of rules for all word
categories (parts of speech) and the resulting set of tables is displayed in a fixed order in all languages. |