Next revision | Previous revision |
en:cnk:psalm77 [2023/01/17 11:27] – created jankocek | en:cnk:psalm77 [2023/01/17 12:42] (current) – michalkren |
---|
| |
===About the Project=== | ===About the Project=== |
| |
The Psalm 77 corpus is the result of a pilot project carried out in autumn 2022 which aligns all sixteenth-century Romanian versions of Psalm 77 with the Slavonic and Greek texts of the same psalm. The corpus was compiled by Constanța Burlacu with the technical support of the Institute of the Czech National Corpus ([[https://ucnk.ff.cuni.cz/en/institute/people/david-lukes-2/|David Lukeš]] and [[https://ucnk.ff.cuni.cz/en/institute/people/pavel-vondricka/|Pavel Vondřička]] in particular) and the financial support of the [[https://www.dariah.eu/2021/11/02/cls-infra-tna-fellowship-opportunities/|CLS INFRA Transnational Access Fellowships]] (TNA) scheme. Overall, the project deals with 72 verses of text per source and 10 thousand tokens in total. The project’s aim was to create a corpus that allows multiple sources to be visualised at the same time, be they Romanian, Slavonic or Greek, and to annotate the Romanian material linguistically. Much attention was given to the textual processing of the Romanian materials, and the following decisions were taken: | The Psalm 77 corpus is the result of a pilot project carried out in autumn 2022 which aligns all sixteenth-century Romanian versions of Psalm 77 with the Slavonic and Greek texts of the same psalm. The corpus was compiled by Constanța Burlacu with the technical support of the Institute of the Czech National Corpus ([[https://ucnk.ff.cuni.cz/en/institute/people/david-lukes-2/|David Lukeš]] and [[https://ucnk.ff.cuni.cz/en/institute/people/pavel-vondricka/|Pavel Vondřička]] in particular) and the financial support of the [[https://www.dariah.eu/2021/11/02/cls-infra-tna-fellowship-opportunities/|CLS INFRA Transnational Access Fellowships]] (TNA) scheme. Overall, the project deals with 72 verses of text per source and 10 thousand tokens in total.
The project’s aim was to create a corpus that allows multiple sources to be visualised at the same time, be they Romanian, Slavonic or Greek, and to annotate the Romanian material linguistically. Much attention was given to the textual processing of the Romanian materials, and the following decisions were taken: |
* Each Romanian source underwent a transcription process which captured the linguistic information related to Psalm 77 and rendered it machine-readable while maintaining the text’s original script, that is, Cyrillic; | * Each Romanian source underwent a transcription process which captured the linguistic information related to Psalm 77 and rendered it machine-readable while maintaining the text’s original script, that is, Cyrillic; |
* The linguistic data was further processed by transliterating the Cyrillic script and normalizing the data according to the orthographic rules of modern Romanian; | * The linguistic data was further processed by transliterating the Cyrillic script and normalizing the data according to the orthographic rules of modern Romanian; |
* The linguistic data obtained in this manner was automatically lemmatised and annotated via the UDPipe platform and the UD annotation model for [[https://universaldependencies.org/treebanks/ro_nonstandard/index.html|Nonstandard Romanian]]. | * The linguistic data obtained in this manner was automatically lemmatised and annotated via the UDPipe platform and the UD annotation model for [[https://universaldependencies.org/treebanks/ro_nonstandard/index.html|Nonstandard Romanian]]. |
To explore the corpus, the user can search either by lemma (written in Latin letters and modern Romanian orthography) or by the transcribed, transliterated, or normalized form of a specific word (in the defaul attribute box, labelled word, trans, norm). The search will retun the transcribed version of the text in Cyrillic script. | To explore the corpus, the user can search either by lemma (written in Latin letters and modern Romanian orthography) or by the transcribed, transliterated, or normalized form of a specific word (in the default attribute box, labelled word, trans, norm). The search will return the transcribed version of the text in Cyrillic script. |
| |
===Sources=== | ===Sources=== |
The current version of the Psalm 77 corpus does not offer linguistic annotation or lemmatization of the Slavonic and Greek versions. The reference point for the Slavonic version of Psalm 77 was the Tomić Psalter (Tomic in the corpus), while the Greek text was taken from the critical edition of the [[https://www.academic-bible.com/en/online-bibles/septuagint-lxx/read-the-bible-text/bibel/text/lesen/stelle/19/10001/19999/ch/20e1ddb407868954ab45d3a35ac26749/|Septuagint]] edited by Alfred Rahlfs and Robert Hanhart. Additionally, for each bilingual Church Slavonic-Romanian source (the Voroneț and Ciobanu manuscripts and the printed edition of circa 1588), the Slavonic counterpart was transcribed and used as an additional textual source (the same siglum as for the Romanian source was used, to which ‘-Sl’ was added; see for example PCbSl). Note that the printed edition of the bilingual Psalter of 1577 (PC1) was not included among the sources used to compile the corpus. This edition is fairly similar to that of 1570 (PC) and the Ciobanu Psalter. | |
| The current version of the Psalm 77 corpus does not offer linguistic annotation or lemmatization of the Slavonic and Greek versions. The reference point for the Slavonic version of Psalm 77 was the Tomić Psalter (''Tomic'' in the corpus), while the Greek text (''Gr'' in the corpus) was taken from the critical edition of the [[https://www.academic-bible.com/en/online-bibles/septuagint-lxx/read-the-bible-text/bibel/text/lesen/stelle/19/10001/19999/ch/20e1ddb407868954ab45d3a35ac26749/|Septuagint]] edited by Alfred Rahlfs and Robert Hanhart. Additionally, for each bilingual Church Slavonic-Romanian source (the Voroneț and Ciobanu manuscripts and the printed edition of circa 1588), the Slavonic counterpart was transcribed and used as an additional textual source (the same siglum as for the Romanian source was used, to which ''-Sl'' was added; see for example PCbSl). Note that the printed edition of the bilingual Psalter of 1577 (PC1) was not included among the sources used to compile the corpus. This edition is fairly similar to that of 1570 (PC) and the Ciobanu Psalter. |
| |
The Romanian textual sources and their sigla used in the project are the following: | The Romanian textual sources and their sigla used in the project are the following: |
| PS |The Scheia Psalter, mid or later 16th century, MS 449 of the Romanian Academy Library; digital copy available at https://medievalia.com.ro/manuscrise/item/ms-rom-449.| | | PS |The Scheia Psalter, mid or later 16th century, MS 449 of the Romanian Academy Library; digital copy available at https://medievalia.com.ro/manuscrise/item/ms-rom-449.| |
| |
For additional questions, contact Constanța Burlacu at: **constanta.burlacu8@gmail.com**. | Overall, there are 11 sources (6 Romanian, 4 Slavonic, 1 Greek) that make up the Psalm 77 parallel corpus. The texts can be searched either as monolingual or as parallel corpora, with any combination of sources, using the [[https://wiki.korpus.cz/doku.php/en:manualy:kontext:index|KonText]] query interface, e.g. like [[https://www.korpus.cz/kontext/query?corpname=psalm77_PCb&align=psalm77_PCbSl&align=psalm77_Gr|this]]. |
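As an illustration of how such a parallel query link is put together, the following minimal Python sketch rebuilds the example URL above from source sigla. It assumes that every corpus is named ''psalm77_'' followed by the siglum, a pattern inferred only from the three names in that link; the function name ''parallel_query_url'' is hypothetical and not part of KonText.

```python
from urllib.parse import urlencode

# Base address taken from the example link in the text above.
KONTEXT_QUERY_URL = "https://www.korpus.cz/kontext/query"


def parallel_query_url(primary, aligned=()):
    """Build a KonText query-page URL for one primary source and any aligned ones.

    Assumption: corpus names follow the pattern "psalm77_<siglum>", as seen
    in the example link (psalm77_PCb, psalm77_PCbSl, psalm77_Gr).
    """
    # One "corpname" parameter for the primary corpus ...
    params = [("corpname", f"psalm77_{primary}")]
    # ... and one repeated "align" parameter per aligned corpus.
    params += [("align", f"psalm77_{siglum}") for siglum in aligned]
    return f"{KONTEXT_QUERY_URL}?{urlencode(params)}"


# Ciobanu Psalter (Romanian) aligned with its Slavonic counterpart and the Greek text.
print(parallel_query_url("PCb", ["PCbSl", "Gr"]))
```

Calling the function with a single siglum and no aligned sources gives a link to the corresponding monolingual corpus, under the same naming assumption.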
| |
| For additional questions, contact Constanța Burlacu at: **constanta.burlacu8** at **gmail.com**. |
| |