Differences

This shows you the differences between two versions of the page.

en:cnk:online:gen2 [2022/12/22 16:13] – [duplicate] vaclavcvrcek
en:cnk:online:gen2 [2026/01/23 11:45] (current) – [Annotation] krivan
Line 2:
 ====== ONLINE2 (2nd generation) ======
 
-ONLINE2_NOW and ONLINE2_ARCHIVE are two corpora which together create a monitor corpus ([[en:cnk:online|ONLINE]]) of the dynamic content of the Czech web, i.e. internet journalism. The span of the corpus is since April 2021 till the present. It has been created at the CNC with the help of the data kindly provided by the [[https://monitora.cz/|Mopnitora]] company.
+ONLINE2_NOW and ONLINE2_ARCHIVE are two corpora which together create a monitor corpus ([[en:cnk:online|ONLINE]]) of the dynamic content of the Czech web, i.e. internet journalism. The span of the corpus is since April 2021 till the present. It has been created at the CNC with the help of the data kindly provided by the [[https://monitora.cz/|Monitora]] company.
 
 Both corpora differ in their extent and periodicity of updates:
Line 75:
 ===== Annotation =====
 
-The corpus is annotated using standard tools for the [[en:pojmy:morfologicka_analyza|morphological analysis]] and [[en:pojmy:lemma|lemmatization]] of the SYN-series corpora. The annotation is thus comparable e.g. with the [[en:cnk:syn2015|SYN2015]] corpus.
+Morphological tagging, lemmatization, and tokenization of the corpus are performed fully automatically according to the [[en:cnk:anotacni_standard_cnk|unified CNC annotation scheme]].
 
 ====== How to cite ONLINE ======