| en:cnk:syn:verze10 [2022/06/09 13:36] – jankrivan | en:cnk:syn:verze10 [2026/01/23 11:51] (current) – [Structure and annotation of SYN version 10] krivan |
|---|
| |
| <WRAP right 35%> | <WRAP right 35%> |
| ^ <fs medium>Name</fs> ^^ <fs medium>SYN version 9</fs> ^ | ^ <fs medium>Name</fs> ^^ <fs medium>SYN version 10</fs> ^ |
| ^ [[pojmy:atributy_pozicni|Position]] ^ Number of tokens | 5 887 514 992 | | ^ [[pojmy:atributy_pozicni|Position]] ^ Number of tokens | 5 887 514 992 | |
| ^ ::: ^ Number of tokens without punctuation | 4 881 700 519 | | ^ ::: ^ Number of tokens without punctuation | 4 881 700 519 | |
| ====== Structure and annotation of SYN version 10 ====== | ====== Structure and annotation of SYN version 10 ====== |
| |
| Generally speaking, the structure and annotation of SYN version 10 are based on those of the SYN2020 corpus. In particular, the hierarchy of structural tags for SYN version 10 has been taken over from SYN2020, as has the [[en:cnk:syn2020#annotation_of_syn2020changes_compared_to_other_corpora_of_the_syn_series|lemmatization and morphological tagging]]. SYN version 10 shares all these aspects with its predecessor, [[en:cnk:syn:verze9|SYN version 9]]. | Generally speaking, the structure and annotation of SYN version 10 are based on those of the SYN2020 corpus. The hierarchy of structural tags for SYN version 10 has been taken over from SYN2020. Morphological tagging, lemmatization, and tokenization of the corpus are performed fully automatically according to the [[en:cnk:anotacni_standard_cnk|unified CNC annotation scheme]]. SYN version 10 shares all these aspects with its predecessor, [[en:cnk:syn:verze9|SYN version 9]]. |
| |
| This correspondence of structure and annotation between SYN version 10 and [[en:cnk:syn2020|SYN2020]] only has the following exceptions: | This correspondence of structure and annotation between SYN version 10 and [[en:cnk:syn2020|SYN2020]] only has the following exceptions: |
| |
| <WRAP round tip 70%> | <WRAP round tip 70%> |
| Křen, M. – Cvrček, V. – Henyš, J. – Hnátková, M. – Jelínek, T. – Kocek, J. – Kováříková, D. – Křivan, J. – Milička, J. – Petkevič, V. – Procházka, P. – Skoumalová, H. – Šindlerová, J. – Škrabal, M.: //Corpus SYN, version 10 from 22. 2. 2022//. Ústav Českého národního korpusu FF UK, Praha 2022. Available online: https://www.korpus.cz. | Křen, M. – Cvrček, V. – Hnátková, M. – Jelínek, T. – Kocek, J. – Kováříková, D. – Křivan, J. – Milička, J. – Petkevič, V. – Procházka, P. – Skoumalová, H. – Šindlerová, J. – Škrabal, M.: //Corpus SYN, version 10 from 22. 2. 2022//. Ústav Českého národního korpusu FF UK, Praha 2022. Available online: https://www.korpus.cz. |
| |
| |