| en:cnk:ksp [2024/09/13 11:16] – [How to cite C3P] michalskrabal | en:cnk:ksp [2025/10/22 09:39] (current) – [Corpus of Contemporary Czech Poetry (C3P)] michalskrabal |
|---|
| ====== Corpus of Contemporary Czech Poetry (C3P) ====== | ====== Corpus of Contemporary Czech Poetry (C3P) ====== |
| |
| C3P is a joint project of the [[https://service.ucl.cas.cz/en/|Institute of Czech Literature of CAS]] and the Institute of the Czech National Corpus, dating back to 2015. As the name suggests, it is a corpus of contemporary Czech poetry texts (delimited by the years 1990 and 2020), i.e. a representative sample of Czech poetry over the last three decades. Significantly, this sample includes not only texts officially published in poetry books, and thus having gone through the standardeditorial process, but also amateur works, concentrated mainly on so-called literary forums. This methodological decision is not due to a desire to democratise poetry; we believe that without texts from the Internet, the picture of contemporary Czech poetry would not be complete, covering only one segment of poetry, which is relatively small in proportion. This would not correspond to the reality that literary forums have played a significant role in the Czech literary context((PIORECKÝ, Karel. Česká literatura a nová média. Praha: Academia, 2016.)), among other things as a platform for the publishing beginnings of some now established poets. This basic dichotomy, by the way, opens up the possibility of confronting and comparing the two modes, distinguished in C3P by the ''doc.medium'' attribute (print vs web). | C3P is a joint project of the [[https://service.ucl.cas.cz/en/|Institute of Czech Literature of CAS]] and the Institute of the Czech National Corpus, dating back to 2015. As the name suggests, it is a corpus of contemporary Czech poetry texts (delimited by the years 1990 and 2020), i.e. a representative sample of Czech poetry over the last three decades. Significantly, this sample includes not only texts officially published in poetry books, and thus having gone through the standard editorial process, but also amateur works, concentrated mainly on so-called literary forums. 
This methodological decision is not due to a desire to democratise poetry; we believe that without texts from the Internet, the picture of contemporary Czech poetry would not be complete, covering only one segment of poetry, which is relatively small in proportion. This would not correspond to the reality that literary forums have played a significant role in the Czech literary context((PIORECKÝ, Karel. Česká literatura a nová média. Praha: Academia, 2016.)), among other things, as a platform for the publishing beginnings of some now established poets. This basic dichotomy, by the way, opens up the possibility of confronting and comparing the two modes, distinguished in C3P by the ''doc.medium'' attribute (print vs web). |
| |
| <WRAP right 35%> | <WRAP right 35%> |
| ===== The composition of C3P ===== | ===== The composition of C3P ===== |
| |
| C3P currently contains approximately 37.5 million running words. The print poetry subcorpus includes about 2.7 million words, coming from 20,498 poems printed in 682 poetry collections by 256 authors. The web component of the corpus contains more than 280,000 poems from six literary forums (liter.cz, pismak.cz, totem.cz, libres.cz, psanci.cz, xxvi.cz), comprising over 34 million words. The texts in the print subcorpus were selected with regard to the generational layers of the contemporary poetry scene. Currently, authors of Generations X and Y and baby boomers (i.e. all those born after 1945) are represented here, as we continue to expand the corpus towards older generations. | C3P currently contains approximately 37.5 million running words. The print poetry subcorpus includes about 2.7 million words, coming from 27,675 poems printed in 682 poetry collections by 256 authors. The web component of the corpus contains more than 280,000 poems from six literary forums (liter.cz, pismak.cz, totem.cz, libres.cz, psanci.cz, xxvi.cz), comprising over 34 million words. The texts in the print subcorpus were selected with regard to the generational layers of the contemporary poetry scene. Currently, authors of Generations X and Y and baby boomers (i.e. all those born after 1945) are represented here, as we continue to expand the corpus towards older generations. |
| |
| For details on building C3P, see the studies below. | For details on building C3P, see the studies below. |
| C3P data can be investigated in various ways. In addition to standard concordance work in the [[https://www.korpus.cz/kontext/query?corpname=KSP|KonText interface]], other tools can be used: | C3P data can be investigated in various ways. In addition to standard concordance work in the [[https://www.korpus.cz/kontext/query?corpname=KSP|KonText interface]], other tools can be used: |
| |
| * [[https://trost.korpus.cz/slovo-v-poezii/|Word in Poetry]]: a tool suited to a first introduction to the corpus. Once a search word is entered, it offers previews of the other applications and a range of statistical data. | * [[https://www.korpus.cz/slovo-v-poezii/|Word in Poetry]]: a tool suited to a first introduction to the corpus. Once a search word is entered, it offers previews of the other applications and a range of statistical data. |
| * [[https://versologie.cz/ksp/tool_hex/index.php?lang=cz|Hex]]: an application for finding keywords, i.e. words whose frequency is significantly higher in a given poem than in the whole corpus (which makes it particularly useful for thematic analyses). | * [[https://versologie.cz/ksp/tool_hex/index.php?lang=cz|Hex]]: an application for finding keywords, i.e. words whose frequency is significantly higher in a given poem than in the whole corpus (which makes it particularly useful for thematic analyses). |
| * [[https://versologie.cz/ksp/tool_gunstick/index.php?lang=cz|Gunstick]]: a tool for searching for rhyme pairs that also provides statistics on their frequency. | * [[https://versologie.cz/ksp/tool_gunstick/index.php?lang=cz|Gunstick]]: a tool for searching for rhyme pairs that also provides statistics on their frequency. |
| ===== How to cite C3P ===== | ===== How to cite C3P ===== |
| <WRAP round tip 70%> | <WRAP round tip 70%> |
| Piorecký, K. – Škrabal, M. – Jeziorský, T.: Korpus současné poezie, verze 2.0 z 13. 9. 2024. Ústav Českého národního korpusu FF UK – Ústav pro českou literaturu AV ČR, v. v. i., Praha 2024. Available from WWW: http://www.korpus.cz | Piorecký, K. – Škrabal, M. – Jeziorský, T.: The Corpus of Contemporary Czech Poetry, version 2 from 13. 9. 2024. Ústav Českého národního korpusu FF UK – Ústav pro českou literaturu AV ČR, v. v. i., Praha 2024. Available from WWW: http://www.korpus.cz |
| |
| Škrabal, M. – Piorecký, K. – Procházka, P. – Jeziorský, T.: Korpus současné poezie, verze 1.0 z 29. 6. 2022. Ústav Českého národního korpusu FF UK – Ústav pro českou literaturu AV ČR, v. v. i., Praha 2022. Available from WWW: http://www.korpus.cz | Škrabal, M. – Piorecký, K. – Procházka, P. – Jeziorský, T.: The Corpus of Contemporary Czech Poetry, version 1 from 29. 6. 2022. Ústav Českého národního korpusu FF UK – Ústav pro českou literaturu AV ČR, v. v. i., Praha 2022. Available from WWW: http://www.korpus.cz |
| |
| Piorecký, K. – Škrabal, M.: Vícejazyčnost v současné české poezii. Několik úvodních postřehů z korpusové perspektivy. Slovenská literatura 6/2020, p. 568--583. | Piorecký, K. – Škrabal, M.: Vícejazyčnost v současné české poezii. Několik úvodních postřehů z korpusové perspektivy. Slovenská literatura 6/2020, p. 568--583. |