Differences
en:pojmy:lemma [2022/04/13 14:51] – jankrivan | en:pojmy:lemma [2022/04/13 14:53] – [Sublemma] jankrivan | ||
---|---|---|---|
Line 14: | Line 14: | ||
Starting with the SYN2020 corpus, lemmatization in Czech corpora is two-tiered: each form is given a sublemma attribute in addition to the lemma attribute. While a lemma can associate multiple variants of a single word (e.g. the lemma //filozof// represents all forms with both //filozof// and //filosof// stems), sublemmata delimit subgroups of forms according to this alternation (the sublemma //filozof// represents only forms with the stem // | Starting with the SYN2020 corpus, lemmatization in Czech corpora is two-tiered: each form is given a sublemma attribute in addition to the lemma attribute. While a lemma can associate multiple variants of a single word (e.g. the lemma //filozof// represents all forms with both //filozof// and //filosof// stems), sublemmata delimit subgroups of forms according to this alternation (the sublemma //filozof// represents only forms with the stem // | ||
- | Different types of variants are handled as sublemmata (e.g. // | + | Different types of variants are handled as sublemmata (e.g. // |
===== The link between a lemma and lexeme ===== | ===== The link between a lemma and lexeme ===== |