Differences
This shows you the differences between two versions of the page.
en:pojmy:lemma [2022/04/13 14:53] – [Sublemma] jankrivan | en:pojmy:lemma [2022/04/13 15:08] – [Sublemma] jankrivan | ||
---|---|---|---|
Line 10: | Line 10: | ||
The lemma as a unit originates from an abstraction of a [[en: | The lemma as a unit originates from an abstraction of a [[en: | ||
- | ====== Sublemma | + | ===== Sublemma ===== |
Starting with the SYN2020 corpus, lemmatization in Czech corpora is two-tiered: each form is given a sublemma attribute in addition to the lemma attribute. While a lemma can associate multiple variants of a single word (e.g. the lemma //filozof// represents all forms with both //filozof// and //filosof// stems), sublemmata delimit subgroups of forms according to this alternation (the sublemma //filozof// represents only forms with the stem // | Starting with the SYN2020 corpus, lemmatization in Czech corpora is two-tiered: each form is given a sublemma attribute in addition to the lemma attribute. While a lemma can associate multiple variants of a single word (e.g. the lemma //filozof// represents all forms with both //filozof// and //filosof// stems), sublemmata delimit subgroups of forms according to this alternation (the sublemma //filozof// represents only forms with the stem // |
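
The two-tiered relationship described above can be illustrated with a minimal sketch. The token records, the attribute layout and the function name `group_forms` are hypothetical and are not taken from the corpus tooling. The inflected forms and the second sublemma value //filosof// are assumptions made for the example only.

```python
from collections import defaultdict

# Hypothetical token records: (word form, lemma, sublemma).
# The values are illustrative assumptions, not real corpus output.
TOKENS = [
    ("filozof", "filozof", "filozof"),
    ("filozofa", "filozof", "filozof"),
    ("filosof", "filozof", "filosof"),
    ("filosofa", "filozof", "filosof"),
]


def group_forms(tokens):
    """Group word forms by lemma and, separately, by sublemma."""
    by_lemma = defaultdict(set)
    by_sublemma = defaultdict(set)
    for form, lemma, sublemma in tokens:
        by_lemma[lemma].add(form)        # the lemma covers every spelling variant
        by_sublemma[sublemma].add(form)  # the sublemma covers one variant only
    return dict(by_lemma), dict(by_sublemma)


by_lemma, by_sublemma = group_forms(TOKENS)
print(sorted(by_lemma["filozof"]))
print(sorted(by_sublemma["filosof"]))
```

In this sketch a query on the lemma returns the forms of both spelling variants, while a query on a sublemma returns only the forms that share that variant's stem.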