Corpus SPEECHES

Name: Speeches
Number of positions (tokens): 248 839
Number of positions (tokens) without punctuation and other marks: 217 314
Number of word forms (words): 30 909
Number of lemmata: 12 522
Number of speeches: 151
Number of sentences: 11 208
Number of unique (different) speakers: 14

The corpus of official presidential speeches was created in cooperation between the CNC and the University of Oslo. It covers Czech presidential speeches given between 1918 and 2015 on the occasion of anniversaries and public holidays (New Year, 28 October, etc.). Speeches is a small, specialized corpus in the written-to-be-spoken mode.

The corpus contains detailed structural markup describing the individual speeches, and it is also lemmatized and tagged.

Citing Speeches

Cvrček, V. – Truneček, P. – Horký, V.: Korpus prezidentských projevů Speeches. Ústav Českého národního korpusu FF UK, Praha 2015. Available on-line: http://www.korpus.cz