Differences

This shows you the differences between two versions of the page.

--- en:manualy:korpusdb [2021/02/15 10:59] jankocek
+++ en:manualy:korpusdb [2021/02/15 11:15] (current) jankocek
Line 1: Line 1:
 ====== KorpusDB: Database of word forms and lemmas attested in the CNC corpora ======
  
-{{ :manualy:korpusdb_logo.png?direct&200|}}
+{{ :manualy:korpusdb_logo.png?nolink&200|}}
  
 The database contains all recognized word forms of all lemmata that actually occur in any of the processed CNC corpora: [[cnk:syn:verze8|SYN v8]] (contemporary written Czech), [[cnk:oral|ORAL v1]] and [[cnk:ortofon|ORTOFON v1]] (contemporary spoken Czech), [[cnk:diakorp|DIAKORP v6]] and an unpublished corpus of 19th century texts. Since their lemmatization and POS-tagging may differ, internal versions of these corpora have been processed, using a common tagging.