Title: |
Extracting an Etymological Database from Wiktionary |
Authors: | Sagot, Benoît |
Year: |
2017 |
Type: |
Article |
Publisher: |
Lexical Computing CZ s.r.o. |
Place: |
Brno, Czech Republic |
In: |
Kosem, Iztok/Tiberius, Carole/Jakubíček, Miloš/Kallas, Jelena/Krek, Simon/Baisa, Vít (eds.): Electronic lexicography in the 21st century. Lexicography from scratch. Proceedings of eLex 2017 conference, 19-21 September 2017, Leiden, the Netherlands |
Pages: |
716-728 |
Languages studied: |
English - French |
Keywords: |
database
data modelling
etymology in dictionaries
historical lexicography
XML/SGML
|
Medium: |
Online |
URI: |
https://elex.link/elex2017/proceedings-download/ |
Last accessed: |
22.10.2018 |
Abstract: |
Electronic lexical resources almost never contain etymological information. The availability of such information,
if properly formalised, could open up the possibility of developing automatic tools targeted towards historical
and comparative linguistics, as well as significantly improving the automatic processing of ancient languages. We
describe here the process we implemented for extracting etymological data from the etymological notices found
in Wiktionary. We have produced a multilingual database of nearly one million lexemes and a database of more
than half a million etymological relations between lexemes. |