Differences
This shows you the differences between two versions of the page.
| en:cnk:orator [2025/05/28 13:36] – [ORATOR v3 (2025)] martinawaclawicova | en:cnk:orator [2025/06/06 13:40] (current) – [Morphological tagging of the ORATOR corpus] martinawaclawicova | ||
|---|---|---|---|
| Line 25: | Line 25: | ||
| The recordings were made at various locations in the Czech Republic or were downloaded from the internet with the consent of the speaker. Except for the 9 cases mentioned above, the recordings always capture the communication situation in the presence of the audience and in an authentic environment. The corpus is also not balanced by the gender of the speakers, with a predominance of men. | The recordings were made at various locations in the Czech Republic or were downloaded from the internet with the consent of the speaker. Except for the 9 cases mentioned above, the recordings always capture the communication situation in the presence of the audience and in an authentic environment. The corpus is also not balanced by the gender of the speakers, with a predominance of men. | ||
| + | |||
| + | ===== Morphological tagging of the ORATOR corpus ===== | ||
| + | |||
| + | The ORATOR v3 corpus is automatically [[en: | ||
| + | |||
| + | Substandard variants and forms typical of dialects and spontaneous speech are also tagged in the corpus (according to the ORTOFON corpus, see [[en: | ||
| + | |||
| + | The following specific tags are used in the first tag position (word type): | ||
| + | |||
| + | ^ Tag ^ Meaning | ||
| + | | E | fragments (incomplete words) | | ||
| + | | H | nonverbal sounds (e.g. hesitation) | | ||
| + | | M | comments by transcribers (in round brackets) | | ||
| + | | W | anonymised sections (mainly names) | | ||
| + | |||
| + | Note: The anonymised sections are specified on a basic level '' | ||
| + | |||
| + | The ORATOR v2 corpus is tagged with the prior morphological tagset used until 2020. Detailed information on the annotation of these previously published corpora can be found on a [[en: | ||
| ====== ORATOR v1 (2019) ====== | ====== ORATOR v1 (2019) ====== | ||
| Line 40: | Line 58: | ||
| <WRAP round tip 70%> | <WRAP round tip 70%> | ||
| - | Kopřivová, | + | Kopřivová, |
| - | Kopřivová, | + | Kopřivová, |
| - | Kopřivová, | + | Kopřivová, |
| </ | </ | ||
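
The four speech-specific values of the first tag position (E, H, M, W) listed in the tagging section above can be handled with a simple lookup when processing the corpus. The following is only a minimal sketch: the names `SPOKEN_WORD_TYPES` and `spoken_word_type` are hypothetical and not part of any corpus tool, and the tag strings in the usage lines are placeholders, not real tags from the corpus.

```python
from typing import Optional

# Speech-specific values of the first tag position, as listed in the table above.
# The dictionary and function names are illustrative assumptions.
SPOKEN_WORD_TYPES = {
    "E": "fragment (incomplete word)",
    "H": "nonverbal sound (e.g. hesitation)",
    "M": "comment by transcribers (in round brackets)",
    "W": "anonymised section (mainly names)",
}


def spoken_word_type(tag: str) -> Optional[str]:
    """Return the meaning of the first tag position if it is one of the
    four speech-specific values, otherwise None."""
    if not tag:
        return None
    return SPOKEN_WORD_TYPES.get(tag[0])


# Placeholder tag strings: only the first character matters here.
print(spoken_word_type("H-"))  # nonverbal sound (e.g. hesitation)
print(spoken_word_type("X-"))  # None
```

Only the first character of the tag is inspected, so the sketch makes no assumption about the length or content of the remaining tag positions.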