Differences
This shows you the differences between two versions of the page.
en:pojmy:lemma [2016/12/09 21:43] – [Problems with lemmatization] veronikapojarova | en:pojmy:lemma [2016/12/09 21:50] – [The lemmatization process] veronikapojarova |
---|
==== Problems with lemmatization ==== | ==== Problems with lemmatization ==== |
| |
One of the biggest linguistic and computational problems is the lemmatization of multiword expressions. Another problem of automatic lemmatization which remains unsolved is the lemmatization of all forms under one lemma even in cases where it is not appropriate e.g. //Cheers!//, when no registered meaning of the word //cheeer// corresponds with the pragmatic meaning, because it does not fall under strictly morphological lemmatization. | One of the biggest linguistic and computational problems is the lemmatization of multiword expressions. Another problem of automatic lemmatization which remains unsolved is the lemmatization of all forms under one lemma even in cases where it is not appropriate e.g. //Cheers!//, when no registered meaning of the word //cheer// corresponds with the pragmatic meaning, because it does not fall under strictly morphological lemmatization. |
| |
==== The lemmatization process ==== | ==== The lemmatization process ==== |
| |
Automatickou lemmatizaci provádí počítačový program zvaný //lemmatizátor//, který bývá součástí morfologického [[pojmy:tag|taggeru]], provádějícího morfologickou [[pojmy:desambiguace|desambiguaci]] textu. Smyslem lemmatizace je jednak identifikovat v daném kontextu náležitý lexém u homonymních slovních tvarů, jednak umožnit uživateli pracovat nikoli jen se slovními tvary, nýbrž i s lemmaty jakožto reprezentanty příslušných lexémů a jejich paradigmat, což mu podstatně usnadňuje práci s korpusem. | Automatic lemmatization is done by a computer program called a //lemmatizátor//, which is often part of a morphological [[en:pojmy:tag|tagger]] carrying out the [[en:pojmy:desambiguace|disambiguation]] of the text. The purpose of lemmatization is firstly to identify in a given context the appropriate lexeme among homonymous word forms, and secondly to enable the user to work not only with word forms, but also lemmas as representations of the given lexemes and their paradigms, all of which facilitates work with the corpus. |
| |
==== Related links ==== | ==== Related links ==== |