Differences

This shows you the differences between two versions of the page.

en:cnk:orator [2019/12/20 15:51] michalkren
en:cnk:orator [2021/03/08 13:26] (current) – [How to cite] zuzanakomrskova
Line 2:

 <WRAP right 35%>
-^ <fs medium>Name</fs> | <fs medium>[[cnk:orator|ORATOR]]</fs>
+^ <fs medium>Name</fs> | <fs medium>[[cnk:orator|ORATOR]]•v1</fs> | <fs medium>[[cnk:orator|ORATOR]]•v2</fs> |
-^ Number of [[pojmy:token|positions (tokens)]] | 736 407 |
+^ Number of [[pojmy:token|positions (tokens)]] | 736 407 | 1 535 609 |
-^ Number of [[pojmy:token|positions (tokens)]] without puctuation, hesitations and interjections | 578 398 |
+^ Number of [[pojmy:token|positions (tokens)]] without puctuation, hesitations and interjections | 578 398 | 1 207 255
-^ Number of [[pojmy:word| word forms (word)]] | 60 952 |
+^ Number of [[pojmy:word| word forms (word)]] | 60 952 | 97 816 |
-^ Number of [[pojmy:atributy_strukturni#struktura_korpusu_mluvene_cestiny|conversations recorded]] | 318 |
+^ Number of [[pojmy:atributy_strukturni#struktura_korpusu_mluvene_cestiny|conversations recorded]] | 318 | 489
-^ Number of [[pojmy:atributy_strukturni#struktura_korpusu_mluvene_cestiny|utterances]] | 68 727 |
+^ Number of [[pojmy:atributy_strukturni#struktura_korpusu_mluvene_cestiny|utterances]] | 68 727 | 147 867
-^ Number of unique (different) speakers| 332 |
+^ Number of unique (different) speakers| 332 | 468 |
-^ Length of recordings [hh:mm:ss.ms] | 72:07:47.368 |
+^ Length of recordings [hh:mm:ss.ms] | 72:07:47.368 | 148:51:51.56 |
 </WRAP>
  
Line 15:

 Transcription rules, linking to the corresponding audio track and most metadata follow the [[en:cnk:ortofon|ORTOFON]] and [[en:cnk:oral|ORAL]] corpora, structural attributes used in ORATOR are described [[pojmy:atributy_strukturni|here]] (Czech only). The corpus is [[en:cnk:lemtag_mluv|lemmatized and morphologically tagged]] in the same way as the ORAL and ORTOFON corpora.
+
+An updated version 2 of this corpus was published in 2020, with more than twice as much data and featuring many small improvements in the consistency of the transcription and in the annotation of the corpus.
  
 ===== How to cite =====

 <WRAP round tip 70%>
-Kopřivová, M. – Laubeová, Z.  –  Lukeš, D.  –  Poukarová, P.: //ORATOR: Korpus monologů//. Ústav Českého národního korpusu FF UK, Praha 2019 dostupný z: [[https://www.korpus.cz]].
+Kopřivová, M. – Laubeová, Z. – Lukeš, D. – Poukarová, P.: //ORATOR v2: Korpus monologů//. Ústav Českého národního korpusu FF UK, Praha 2020. Retrieved from [[https://www.korpus.cz]].
+
+Kopřivová, M. – Laubeová, Z. – Lukeš, D. – Poukarová, P.: //ORATOR v1: Korpus monologů//. Ústav Českého národního korpusu FF UK, Praha 2019. Retrieved from [[https://www.korpus.cz]].
 </WRAP>