Changes

Jump to: navigation, search

2014 Prepared Talk Proposals

7 bytes removed, 12:37, 8 November 2013
Lemmatizer
For the production of lemmas we have implemented an advanced workflow. The (generally non-technical) users create lemmas using MS Word, which is both familiar and easy to use. We have developed a custom software module that carefully migrates the Word documents into deeply structured XML by analyzing the structure and semantics of the lemmas, and falling back on heuristics in ambiguous cases. While having initially envisioned the oXygen XML Author component as the main tool for creating new lemmas, we obtained excellent results with the migrator module, and decided therefore to continue using MS Word as the primary composition tool. The main advantage of this is that the editors are much more familiar with Word than with any other WYSIWYG editor. Lemmas that have been migrated to XML are stored in an XML database and can be further edited using oXygen XML Author.
====Lemmatizer==== 
Greek morphology is complicated. In order to use a dictionary effectively, a rather high level of initial language competence is necessary for the user to be able to relate the word form s/he finds in a text to the correct basic lemma form, where the definition of the word can be found. Using a Greek morphological database we have been able to facilitate the search for lemmas. A ‘lemmatizer’ module gives the possible parsings of the word forms and the lemmas they can be derived from. This enables the user to type in the word as found in the text and be redirected to the correct lemma.

Navigation menu