Differences
This shows you the differences between two versions of the page.
en:cnk:lemtag_mluv [2017/07/18 15:12] – [Lemmatization and tagging in spoken corpora] michalkren | en:cnk:lemtag_mluv [2017/07/18 15:12] (current) – [Lemmatization and tagging in spoken corpora] michalkren |
---|
**Tagging method** | **Tagging method** |
| |
[[seznamy:tagy#pozice_1_-_slovni_druh|The morphological tagging system]] is the same as for written corpora; however, some tags for associated categories are retained (e.g. X for any gender, Y for masculine animate or inanimate, etc.), just as they appear in the morphological dictionary MorfFlex CZ (Hajič–Hlaváčová, 2013). This dictionary was manually and semi-automatically supplemented with frequently unrecognised forms (e.g. dialectal suffixes, forms with varying quantity, prothetic v). The stochastic tagging system MorphoDiTa (Straka et al., 2014) was used for the tagging itself. | [[seznamy:tagy#pozice_1_-_slovni_druh|The morphological tagging system]] (the description is in Czech only) is the same as for written corpora; however, some tags for associated categories are retained (e.g. X for any gender, Y for masculine animate or inanimate, etc.), just as they appear in the morphological dictionary MorfFlex CZ (Hajič–Hlaváčová, 2013). This dictionary was manually and semi-automatically supplemented with frequently unrecognised forms (e.g. dialectal suffixes, forms with varying quantity, prothetic v). The stochastic tagging system MorphoDiTa (Straka et al., 2014) was used for the tagging itself. |
| |
===== Modifications to the morphological dictionary ===== | ===== Modifications to the morphological dictionary ===== |