Differences

This shows you the differences between two versions of the page.

en:cnk:koditex [2018/06/05 11:23] – [The Koditex Corpus] Petra Poukarová
en:cnk:koditex [2018/06/05 11:25] – [The Koditex Corpus] Petra Poukarová
Line 19:
 </WRAP>
  
-When compiling the corpus, the primary goal was for it to be as diverse and representative as possible, reflecting the variability of Czech in all of its modes and ranges of use (written, spoken, online communication) and featuring rich annotation (the texts were [[en:pojmy:lemma|lemmatized]], [[en:pojmy:tag|morphologically tagged]] using two different systems, and furthermore they were annotated for phrasemes and so-called [[http://ufal.mff.cuni.cz/nametag|named entities]]). As far as writtenness and spokeness are concerned, the Koditex is a mixed corpus.
+When compiling the corpus, the primary goal was for it to be as diverse and representative as possible, reflecting the variability of Czech in all of its modes and ranges of use (written, spoken, online communication) and featuring rich annotation (the texts were [[en:pojmy:lemma|lemmatized]], [[en:pojmy:tag|morphologically tagged]] using two different systems, and furthermore they were annotated for phrasemes and so-called [[http://ufal.mff.cuni.cz/nametag|named entities]]). As far as writtenness and spokenness are concerned, the Koditex is a mixed corpus.
  
 The name //Koditex// is both an acronym of the Czech version of the phrase //**co**rpus of **di**versified **tex**ts// and a tribute to Vilém Kodýtek, author of a pioneering attempt to apply MDA to Czech based on the work of D. Biber.