Differences

This shows you the differences between two versions of the page.

en:cnk:koditex [2018/06/05 11:25] – [The Koditex Corpus] Petra Poukarová
en:cnk:koditex [2018/06/05 11:28] – [Chunks] Petra Poukarová
Line 105:
 The majority of texts (accounting for 76% of tokens) included in the corpus are Czech originals (not translations from other languages). The only exceptions are text classes where translated material is common in Czech in general, listed in the table below (the rest of the classes are 100% Czech originals).
  
-^ Class ^ Translations (words) ^ Originals (words) ^ % Translations ^
+^ Class ^ Translations (words) ^ Originals (words) ^ % translations ^
 | LOV |  210,250 |  30,981 |  87.2% |
 | CRM |  202,921 |  37,677 |  84.3% |