Differences

This shows you the differences between two versions of the page.

en:cnk:lemtag_mluv [2017/07/18 15:02] – [Related links] michalkren
en:cnk:lemtag_mluv [2017/07/18 15:12] (current) – [Lemmatization and tagging in spoken corpora] michalkren
Line 14:
 **Tagging method**
 
-[[en:seznamy:tagy#pozice_1_-_slovni_druh|The morphological tagging system]] is the same as for written corpora, however, some tags for associated categories are retained (e.g. X for any gender, Y for masculine animate or inanimate etc.) just as they are contained in the morphological dictionary MorfFlex CZ (Hajič–Hlaváčová, 2013). This dictionary was manually and semiautomatically supplemented by frequently unrecognised forms (e.g. dialectal suffixes, forms with varying quantity, prothetic v). The stochastic tagging system MorphoDiTa (Straka a kol., 2014) was used for the tagging itself.
+[[seznamy:tagy#pozice_1_-_slovni_druh|The morphological tagging system]] (the description is in Czech only) is the same as for written corpora, however, some tags for associated categories are retained (e.g. X for any gender, Y for masculine animate or inanimate etc.) just as they are contained in the morphological dictionary MorfFlex CZ (Hajič–Hlaváčová, 2013). This dictionary was manually and semiautomatically supplemented by frequently unrecognised forms (e.g. dialectal suffixes, forms with varying quantity, prothetic v). The stochastic tagging system MorphoDiTa (Straka a kol., 2014) was used for the tagging itself.
 
 ===== Modifications to the morphological dictionary =====
Line 51:
 ===== Tag forms=====
 
-The form of the tags corresponds to that of the [[en:seznamy:tagy#pozice_1_-_slovni_druh|morphological tags]] used in the [[en:cnk:syn|SYN]] series written corpora before the simplification of the tagging system and does not include aspect in the 16th position.
+The form of the tags corresponds to that of the [[seznamy:tagy#pozice_1_-_slovni_druh|morphological tags]] (Czech only) used in the [[en:cnk:syn|SYN]] series written corpora before the simplification of the tagging system and does not include aspect in the 16th position.
 Apart from these tags, the first position for the word class and the POS attribute can have the following values: