Differences

This shows you the differences between two versions of the page.

en:cnk:uvod [2026/01/16 12:27] – [Corpora of the Czech National Corpus project] michalkren
en:cnk:uvod [2026/05/25 16:39] (current) michalkren
Line 7:
 ^ corpus ^ size (word count) ^  lemmas  ^ morphological tags ^  released((For versioned corpora (e.g. [[en:cnk:syn|SYN]] or [[en:cnk:intercorp|InterCorp]]), the year when the first version was released is also stated.))  ^ characteristic features ^
 | **General corpora** ||||||
-| [[en:cnk:syn|SYN]] ([[en:cnk:syn:verze13|version 13]]) |  5.3G |  ✓  |  ✓  |  2010–2024  | versioned corpus, unification of all the SYN-series synchronic written corpora |
+| [[en:cnk:syn|SYN]] ([[en:cnk:syn:verze14|version 14]]) |  5.5G |  ✓  |  ✓  |  2010–2025  | versioned corpus, unification of all the SYN-series synchronic written corpora |
 | [[en:cnk:syn2025|SYN2025]] |  100M |  ✓  |  ✓  |  2025  | reference representative corpus, most of the texts are from 2020--2024 |
 | [[en:cnk:syn2020|SYN2020]] |  100M |  ✓  |  ✓  |  2020  | reference representative corpus, most of the texts are from 2015--2019 |
Line 71:
 | **Parallel corpora** ||||||
 | [[en:cnk:intercorp|InterCorp]] ([[en:cnk:intercorp:verze16|release 16]], [[en:cnk:intercorp:verze16ud|release 16ud]]) |  5.3G |  (✓)  |  (✓)  |  2008–2024  | versioned parallel corpus for 61 languages |
+| [[en:cnk:romcro|RomCro 2.0]] |  19.4M |  ✓  |  ✓  |  2026  | parallel corpus of Romance languages and Croatian |
 | [[en:cnk:psalm77|Psalm 77]] |  10k |  (✓)  |  (✓)  |  2023  | parallel corpus of 11 versions of Psalm 77 in Romanian, Church Slavonic and Greek |
 | **Comparable corpora** ||||||