Differences

This shows you the differences between two versions of the page.

en:cnk:syn:verze11 [2022/12/21 12:52] – created michalkren
en:cnk:syn:verze11 [2022/12/21 16:11] (current) vaclavcvrcek
Line 3:

 <WRAP right 35%>
-^ <fs medium>Name</fs> ^^ <fs medium>SYN version 9</fs> ^
+^ <fs medium>Name</fs> ^^ <fs medium>SYN version 11</fs> ^
 ^ [[pojmy:atributy_pozicni|Position]] ^ Number of tokens |  6 067 313 960 |
-^ ::: ^ Number of tokens without punctuation  |  4 719 008 171 |
+^ ::: ^ Number of tokens without punctuation  |  5 031 922 694 |
-^ ::: ^ Number of [[en:pojmy:word|word forms]]  |  10 843 867 |
+^ ::: ^ Number of [[en:pojmy:word|word forms]]  |  11 213 982 |
-^ ::: ^ Number of [[en:pojmy:lemma|lemmas]] |  7 375 002
+^ ::: ^ Number of [[en:pojmy:lemma|lemmas]] |  7 509 752
-[[en:pojmy:atributy_strukturni|Structures]] ^ Number of [[en:pojmy:doc|documents]] |  124 247
+^ Structures ^ Number of documents |  138 186
-^ ::: ^ Number of [[en:pojmy:atributy_strukturni|texts]]|  17 687 333
+^ ::: ^ Number of texts |  18 575 347
-^ ::: ^ Number of sentences |  362 174 692
+^ ::: ^ Number of sentences |  386 045 094
-^ Other information ^ [[en:pojmy:referencni|Referential]] |  YES |
+^ Other information ^ Referential |  YES |
-^ ::: ^ [[en:pojmy:reprezentativnost|Representative]] |  NO (predominantly journalism) |
+^ ::: ^ Representative |  NO (predominantly journalism) |
 ^ ::: ^ Publication year |  2022 |
 </WRAP>
Line 45:
 ====== Structure and annotation of SYN version 11 ======

-Generally speaking, structure and annotation of SYN version 11 are based on that of the SYN2020 corpus. In particular, hierarchy of structural tags for SYN version 11 has been taken over from SYN2020, as well as the [[en:cnk:syn2020#annotation_of_syn2020changes_compared_to_other_corpora_of_the_syn_series|lemmatization and morphological tagging]]. In this respect, SYN version 11 is the same as its predecessor, [[en:cnk:syn:verze10|SYN version 10]]**.
+Generally speaking, structure and annotation of SYN version 11 are based on that of the SYN2020 corpus. In particular, hierarchy of structural tags for SYN version 11 has been taken over from SYN2020, as well as the [[en:cnk:syn2020#annotation_of_syn2020changes_compared_to_other_corpora_of_the_syn_series|lemmatization and morphological tagging]]. In this respect, SYN version 11 is the same as its predecessor, [[en:cnk:syn:verze10|SYN version 10]].

 The correspondence of structure and annotation between SYN version 11 and [[en:cnk:syn2020|SYN2020]] only has the following exceptions:
-
   * introducing the additional attribute ''<doc syn>'' for the [[en:cnk:syn#reference_corpora_as_subcorpora_in_syn|creation of subcorpora corresponding to the original reference corpora]];
   * replacing [[en:pojmy:syntakticka_analyza|syntactic annotation]] in the SYN2020 corpus with a pilot version of **[[en:seznamy:frazemy|phraseme annotation]]**.