Differences

This shows you the differences between two versions of the page.

en:cnk:koditex [2018/06/05 11:25]
Petra Poukarová [The Koditex Corpus]
en:cnk:koditex [2018/06/05 11:28] (current)
Petra Poukarová [Chunks]
Line 105:
 The majority of texts (accounting for 76% of tokens) included in the corpus are Czech originals (not translations from other languages). The only exceptions are text classes where translated material is common in Czech in general, listed in the table below (the rest of the classes are 100% Czech originals).
 
-^ Class ^ Translations (words) ^ Originals (words) ^ % Translations ^
+^ Class ^ Translations (words) ^ Originals (words) ^ % translations ^
 | LOV |  210,250 |  30,981 |  87.2% |
 | CRM |  202,921 |  37,677 |  84.3% |