Differences
This shows you the differences between two versions of the page.
| Both sides previous revision · Previous revision | |
| en:cnk:orator [2026/01/21 15:32] – [ORATOR v3 (2025)] michalkren | en:cnk:orator [2026/01/23 11:48] (current) – [Morphological tagging of the ORATOR corpus] krivan |
|---|
| ===== Morphological tagging of the ORATOR corpus ===== | ===== Morphological tagging of the ORATOR corpus ===== |
| |
| The ORATOR v3 corpus is automatically [[en:pojmy:tag|annotated]] with [[en:cnk:syn2020#morphological_tagging|a new morphological tag]] according to the SYN2020 standard. It recognizes [[en:cnk:syn2020#multiple_lemmatization_and_tagging_aggregate|aggregates]] (e.g., //vidělas//, //zač//), uses [[en:cnk:syn2020|double-level lemmatization]], and has a verb tag ([[en:cnk:syn2020#verb_tagging_verbtag|verbtag]]). | The ORATOR v3 corpus is automatically [[en:pojmy:tag|annotated]] with [[en:cnk:syn2020#morphological_tagging|a new morphological tag]] according to the [[en:cnk:anotacni_standard_cnk|unified CNC annotation scheme]]. It recognizes [[en:cnk:syn2020#multiple_lemmatization_and_tagging_aggregate|aggregates]] (e.g., //vidělas//, //zač//), uses [[en:cnk:syn2020|double-level lemmatization]], and has a verb tag ([[en:cnk:syn2020#verb_tagging_verbtag|verbtag]]). |
| |
| Substandard variants and forms typical of dialects and spontaneous speech are also tagged in the corpus (according to the ORTOFON corpus, see [[en:cnk:ortofon#morphological_tagging_of_the_ortofon_corpus|Morphological tagging of the ORTOFON corpus]]). | Substandard variants and forms typical of dialects and spontaneous speech are also tagged in the corpus (according to the ORTOFON corpus, see [[en:cnk:ortofon#morphological_tagging_of_the_ortofon_corpus|Morphological tagging of the ORTOFON corpus]]). |