Title: |
Candidate Ranking for Maintenance of an Online Dictionary |
Authors: | Broad, Claire/Langone, Helen/Brizan, David G. |
Year: |
2018 |
Type: |
Article |
Publisher: |
European Language Resources Association (ELRA) |
Place of publication: |
Miyazaki |
In: |
Calzolari, Nicoletta/Choukri, Khalid/Cieri, Christopher/Declerck, Thierry/Goggi, Sara/Hasida, Koiti/Isahara, Hitoshi/Maegaard, Bente/Mariani, Joseph/Mazo, Hélène/Moreno, Asunción/Odijk, Jan/Piperidis, Stelios/Tokunaga, Takenobu (eds.): Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, 7-12 May 2018 |
Pages: |
839-843 |
Languages studied: |
English |
Keywords: |
automatic speech processing
neologisms
orthography/spelling information in dictionaries
|
Medium: |
Online |
URI: |
http://www.lrec-conf.org/proceedings/lrec2018/papers.html |
Last accessed: |
19 November 2018 |
Abstract: |
Traditionally, the process whereby a lexicographer identifies a lexical item to add to a dictionary -- a database of lexical items -- has been time-consuming and subjective. In the modern age of online dictionaries, all queries for lexical entries not currently in the database are indistinguishable from a larger list of misspellings, meaning that potential new or trending entries can get lost easily. In this project, we develop a system that uses machine learning techniques to assign these "misspells" a probability of being a novel or missing entry, incorporating signals from orthography, usage by trusted online sources, and dictionary query patterns. |