Differences

This shows you the differences between two versions of the page.

en:pojmy:tag [2016/12/08 14:53] veronikapojarova
en:pojmy:tag [2026/01/16 12:04] (current) – [Morphological tags] krivan
Line 3: Line 3:
 A morphological tag (commonly called **tag**) is a summary of the grammatical information about a specific word ([[en:pojmy:pozice|position]]) in the given context. A tag is usually automatically generated based on a [[en:pojmy:morfologicka_analyza|morphological analysis]] and a subsequent [[en:pojmy:desambiguace|disambiguation]].
  
-Tags are [[en:pojmy:atributy_pozicni|positional attributes]]. A morphological tag in the Czech CNC corpora consists of a sequence of symbols (letters and numbers) which have a specific meaning based on the position which they occupy in the code. In the Czech sentence //Po promoci na londýnské universitě odjel jsem roku 1878 do Netley na školení vojenských chirurgů.// the word form //promoci// (although this form is potentially morphologically ambiguous) has a morphological tag ''NNFS6-----A-----'', which indicates that it is a:
+Tags are [[en:pojmy:atributy_pozicni|positional attributes]]. A morphological tag in the Czech CNC corpora consists of a sequence of symbols (letters and numbers) which have a specific meaning based on the position which they occupy in the code. In the Czech sentence //Po promoci na londýnské universitě odjel jsem roku 1878 do Netley na školení vojenských chirurgů.// the word form //promoci// (although this form is potentially morphologically ambiguous) has a morphological tag ''<nowiki>NNFS6-----A----</nowiki>'', which indicates that it is a:
   * noun (=N)
   * common noun (=N)
Line 12: Line 12:
 ===== Tagset =====
  
-A set of rules and values which can occur in a tag is called a tagset. The positional [[en:seznamy:tagy#popis_jednotlivych_pozic_znacky|tagset used in the Czech CNC corpora]] has 16 positions, each of which carries some information about a specific grammatical category:
+A set of rules and values which can occur in a tag is called a tagset. The positional [[en:seznamy:tagy#popis_jednotlivych_pozic_znacky|tagset used in the Czech CNC corpora]] has 15 positions (starting from SYN2020), each of which carries some information about a specific grammatical category:
  
   - Word class
Line 19: Line 19:
   - Number
   - Case
-  - Possessive case
+  - Possessive gender
   - Possessive number
   - Person
Line 26: Line 26:
   - Negation
   - Active/passive
-  - //not used// 
-  - //not used// 
-  - Variant, stylistic marking etc.. 
   - Aspect
+  - //not used//
+  - Variant (stylistic marking etc.)
+
+Previously, a modified tagset with 16 positions was used (with Position 13 not used and Position 16 marking Aspect).
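The positional principle described above can be illustrated with a minimal Python sketch. It is not part of any CNC tool: `decode_tag`, `POSITIONS_15` and `POSITIONS_16` are hypothetical names, the position names marked "assumed" are not visible in this comparison and follow the usual layout of Czech positional tagsets, and treating ''-'' as "no value" is likewise an assumption.

```python
# Position names of the 15-position layout (SYN2020 onwards).
# Entries marked "assumed" do not appear in the text above.
POSITIONS_15 = [
    "Word class",
    "Detailed word class",  # assumed
    "Gender",               # assumed
    "Number",
    "Case",
    "Possessive gender",
    "Possessive number",
    "Person",
    "Tense",                # assumed
    "Degree",               # assumed
    "Negation",
    "Active/passive",
    "Aspect",
    "not used",
    "Variant",
]

# Older 16-position layout: positions 13 and 14 unused,
# position 15 Variant, position 16 Aspect.
POSITIONS_16 = POSITIONS_15[:12] + ["not used", "not used", "Variant", "Aspect"]


def decode_tag(tag):
    """Return (position, name, symbol) for every filled position of a tag.

    The layout is chosen by tag length; "-" is assumed to mean "no value".
    """
    layouts = {15: POSITIONS_15, 16: POSITIONS_16}
    if len(tag) not in layouts:
        raise ValueError(f"expected 15 or 16 symbols, got {len(tag)}")
    names = layouts[len(tag)]
    return [
        (position, name, symbol)
        for position, (name, symbol) in enumerate(zip(names, tag), start=1)
        if symbol != "-"
    ]


if __name__ == "__main__":
    for position, name, symbol in decode_tag("NNFS6-----A----"):
        print(position, name, symbol)
```

For the tag of //promoci// the sketch lists positions 1 to 5 and position 11 as filled, under both the 15-position and the 16-position form of the tag. It only names the positions; the meaning of each symbol still has to be looked up in the tagset description.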
+
+===== Tagsets used in the parallel corpus InterCorp =====
+Different languages use different tagsets. Descriptions of these tagsets can be found [[en:cnk:intercorp:verze10#morphosyntactic_annotation|here]]. Some recent versions of the InterCorp parallel corpus have been annotated for morphological categories, syntactic functions and syntactic structure following the [[en:pojmy:ud|UD guidelines]].
+
  
 ==== Relevant links ====