Differences

This shows you the differences between two versions of the page.

en:cnk:lemtag_mluv [2017/07/18 15:12]
Michal Křen [Lemmatization and tagging in spoken corpora]
en:cnk:lemtag_mluv [2017/07/18 15:12] (current)
Michal Křen [Lemmatization and tagging in spoken corpora]
Line 14:
 **Tagging method**
 
-[[seznamy:tagy#pozice_1_-_slovni_druh|The morphological tagging system]] is the same as for written corpora, however, some tags for associated categories are retained (e.g. X for any gender, Y for masculine animate or inanimate etc.) just as they are contained in the morphological dictionary MorfFlex CZ (Hajič–Hlaváčová, 2013). This dictionary was manually and semiautomatically supplemented by frequently unrecognised forms (e.g. dialectal suffixes, forms with varying quantity, prothetic v). The stochastic tagging system MorphoDiTa (Straka a kol., 2014) was used for the tagging itself.
+[[seznamy:tagy#pozice_1_-_slovni_druh|The morphological tagging system]] (the description is in Czech only) is the same as for written corpora, however, some tags for associated categories are retained (e.g. X for any gender, Y for masculine animate or inanimate etc.) just as they are contained in the morphological dictionary MorfFlex CZ (Hajič–Hlaváčová, 2013). This dictionary was manually and semiautomatically supplemented by frequently unrecognised forms (e.g. dialectal suffixes, forms with varying quantity, prothetic v). The stochastic tagging system MorphoDiTa (Straka a kol., 2014) was used for the tagging itself.
 
 ===== Modifications to the morphological dictionary =====