en:cnk:uvod [2024/08/05 10:34] – [Corpora of the Czech National Corpus project] ORTOFON v3 vhorky | en:cnk:uvod [2024/10/01 10:52] (current) – [Corpora of the Czech National Corpus project] michalkren |
---|
| [[en:cnk:koditex|Koditex]] | 10.8M | ✓ | ✓ | 2018 | corpus for multi-dimensional analysis of Czech registers | | | [[en:cnk:koditex|Koditex]] | 10.8M | ✓ | ✓ | 2018 | corpus for multi-dimensional analysis of Czech registers | |
| [[en:cnk:ksk-dopisy|KSK-DOPISY]] | 800k | ✗ | ✗ | 2006 | transcriptions of handwritten correspondence from 1990--2004 | | | [[en:cnk:ksk-dopisy|KSK-DOPISY]] | 800k | ✗ | ✗ | 2006 | transcriptions of handwritten correspondence from 1990--2004 | |
| [[en:cnk:ksp|KSP]] | 35.5M | ✓ | ✓ | 2022 | corpus of contemporary Czech poetry published in books and on literary servers from 1990--2020 | | | [[en:cnk:ksp|KSP]] (version 2) | 37.5M | ✓ | ✓ | 2022 | corpus of contemporary Czech poetry published in books and on literary servers from 1990--2020 | |
| [[en:cnk:link|LINK]] | 1.8M | ✓ | ✓ | 2010 | non-reference corpus of linguistic texts | | | [[en:cnk:link|LINK]] | 1.8M | ✓ | ✓ | 2010 | non-reference corpus of linguistic texts | |
| [[en:cnk:totalita|Totalita]] | 12.9M | ✓ | ✓ | 2010 | written language of the communist regime | | | [[en:cnk:totalita|Totalita]] | 12.9M | ✓ | ✓ | 2010 | written language of the communist regime | |
^ corpus ^ size (word count) ^ lemmas ^ morphological tags ^ year ^ characteristic features ^ | ^ corpus ^ size (word count) ^ lemmas ^ morphological tags ^ year ^ characteristic features ^ |
| **Parallel corpora** |||||| | | **Parallel corpora** |||||| |
| [[en:cnk:intercorp|InterCorp]] ([[en:cnk:intercorp:verze13ud|release 13ud]], [[en:cnk:intercorp:verze16|release 16]], [[en:cnk:intercorp:verze16ud|release 16ud]]) | 5.3G | (✓) | (✓) | 2008–2023 | versioned parallel corpus for 61 languages | | | [[en:cnk:intercorp|InterCorp]] ([[en:cnk:intercorp:verze16|release 16]], [[en:cnk:intercorp:verze16ud|release 16ud]]) | 5.3G | (✓) | (✓) | 2008–2024 | versioned parallel corpus for 61 languages | |
| [[en:cnk:psalm77|Psalm 77]] | 10k | (✓) | (✓) | 2023 | parallel corpus of 11 versions of Psalm 77 in Romanian, Church Slavonic and Greek | | | [[en:cnk:psalm77|Psalm 77]] | 10k | (✓) | (✓) | 2023 | parallel corpus of 11 versions of Psalm 77 in Romanian, Church Slavonic and Greek | |
| **Comparable corpora** |||||| | | **Comparable corpora** |||||| |