Differences
This shows you the differences between two versions of the page.
en:cnk:diakorp [2015/11/04 16:59] – [The List of Texts of the DIACORP Corpus] annazitova | en:cnk:diakorp [2024/02/01 16:14] (current) – [Citing DIAKORP] michalkren | ||
---|---|---|---|
Line 4: | Line 4: | ||
Diakorp is the diachronic section of the Czech National Corpus and aims to cover texts from seven centuries of the development of the Czech language. The first completed version of the corpus (approximately 700 000 word forms) was made accessible to the public in September 2005. Newly processed data continue to be published at a rate of about 250 000 word forms per year. | Diakorp is the diachronic section of the Czech National Corpus and aims to cover texts from seven centuries of the development of the Czech language. The first completed version of the corpus (approximately 700 000 word forms) was made accessible to the public in September 2005. Newly processed data continue to be published at a rate of about 250 000 word forms per year. | ||
- | Because of the length of the time span to be covered and the decision to include whole texts instead of samples, Diakorp was not designed to be a representative or balanced corpus (whether in terms of register variability or period size). These aspects will be addressed in the CNC' | + | Because of the length of the time span to be covered and the decision to include whole texts instead of samples, Diakorp was not designed to be a representative or balanced corpus (whether in terms of register variability or period size). These aspects will be addressed in a new line of CNC diachronic corpora (in preparation). |
- | **The structure of Diakorp version 6 (percentage of tokens per century)** | + | **The structure of Diakorp, version 6 (released in 2015)** |
- | {{:en:cnk:slozeni_diakorpu6eng.png?direct|}} | + | {{:en:cnk:slozeni_diakorpu6eng2.png?nolink|}} |
Line 27: | Line 27: | ||
<WRAP round tip 70%> | <WRAP round tip 70%> | ||
- | Kučera, K. – Stluka, M.: // | + | Kučera, K. – Stluka, M.: // |
+ | |||
+ | Kučera, K. – Řehořková, | ||
+ | |||
+ | Kučera, K. (2014): Diachronní složka Českého národního korpusu a hranice možností korpusového výzkumu vývoje češtiny. //Naše řeč// 97 (4–5), 208–215. http:// | ||
</WRAP> | </WRAP> | ||