en:cnk:uvod [2024/11/07 14:03] – jankocek | en:cnk:uvod [2024/11/18 15:51] (current) – michalskrabal |
---|
| [[en:cnk:kh-dopisy|KH-DOPISY]] | 500k | ✗ | ✗ | 2017 | corpus of Karel Havlíček's correspondence | | | [[en:cnk:kh-dopisy|KH-DOPISY]] | 500k | ✗ | ✗ | 2017 | corpus of Karel Havlíček's correspondence | |
| [[en:cnk:kh-noviny|KH-NOVINY]] | 1M | ✗ | ✗ | 2021 | corpus of Karel Havlíček's journalism | | | [[en:cnk:kh-noviny|KH-NOVINY]] | 1M | ✗ | ✗ | 2021 | corpus of Karel Havlíček's journalism | |
| | [[en:cnk:klaus|Klaus]] | 1.5M | ✓ | ✓ | 2024 | corpus of Václav Klaus' texts | |
| [[en:cnk:orwell|ORWELL]] | 80k | ✓ | ✓ | 2003 | Orwell's novel [[wp>Nineteen_Eighty-Four|1984]], manually annotated | | | [[en:cnk:orwell|ORWELL]] | 80k | ✓ | ✓ | 2003 | Orwell's novel [[wp>Nineteen_Eighty-Four|1984]], manually annotated | |
| **Specialized corpora** |||||| | | **Specialized corpora** |||||| |
| [[en:cnk:ukwac|ukWaC]] | 1.9G | ✓ | ✓ | 2013 | web corpus of British English | | | [[en:cnk:ukwac|ukWaC]] | 1.9G | ✓ | ✓ | 2013 | web corpus of British English | |
| **Specialized foreign language corpora** |||||| | | **Specialized foreign language corpora** |||||| |
| [[en:cnk:baltischebriefe|baltische_briefe]] | 45k | ✓ | ✓ | 2024 | corpus of newspaper Baltische Briefe | | | [[en:cnk:baltischebriefe|Baltische Briefe]] | 300k | ✓ | ✓ | 2024 | corpus of German historical newspaper Baltische Briefe | |
| [[en:cnk:codit|CODIT]] | 27M | ✗ | ✗ | 2021 | diachronic corpus of Italian covering a period from the 13th century until 1947 | | | [[en:cnk:codit|CODIT]] | 27M | ✗ | ✗ | 2021 | diachronic corpus of Italian covering a period from the 13th century until 1947 | |
| [[en:cnk:dotko|DOTKO]] (version 2) | 15.5M | ✓ | ✗ | 2010 | non-reference corpus of Lower Sorbian | | | [[en:cnk:dotko|DOTKO]] (version 2) | 15.5M | ✓ | ✗ | 2010 | non-reference corpus of Lower Sorbian | |