
Differences

This shows you the differences between two versions of the page.

en:cnk:ortofon [2017/07/18 14:33] – [Corpus composition and data collection] michalkren
Line 20:
  
 ===== Corpus composition and data collection =====
-The ORTOFON corpus is composed of 332 recordings from the years 2012–2017 and contains 1 014 786 orthographic words, i.e. a total of 1 236 508 positions; a total of 624 different speakers appear in the probes. The recordings were acquired in Bohemia, Moravia, and Silesia, and their total length measures almost 103 hours. More quantitative data can be found on the page dedicated to the [[en:cnk:struktura_ortofon|composition of the corpus]] (Czech only).
+The ORTOFON corpus is composed of 332 recordings from the years 2012–2017 and contains 1 014 786 orthographic words, i.e. a total of 1 236 508 positions; a total of 624 different speakers appear in the probes. The recordings were acquired in Bohemia, Moravia, and Silesia, and their total length measures almost 103 hours. More quantitative data can be found on the page dedicated to the [[cnk:struktura_ortofon|composition of the corpus]] (Czech only).
  
 The material was collected in accordance with the [[en:cnk:oral2013#slozeni_korpusu_a_sber_dat|criteria]] concerning the corpora of the ORAL series. Due to the presence of the phonetic transcription tier, a greater emphasis was placed on the sound quality of recordings. The regional origin of the speakers who were included in the corpus is shown in the following map. The borders of the individual dialectal regions have been refined for the ORTOFON and DIALEKT corpora.