Differences
This shows you the differences between two versions of the page.
en:cnk:totalita [2023/02/23 14:52] – [Totalita: Corpus of totalitarian language] michalkren | en:cnk:totalita [2023/02/23 14:53] (current) – [Totalita: Corpus of totalitarian language] michalkren | ||
---|---|---|---|
Line 4: | Line 4: | ||
The Totalita corpus is a diachronic corpus of written Czech covering the period of the communist regime (1948--1989), | The Totalita corpus is a diachronic corpus of written Czech covering the period of the communist regime (1948--1989), | ||
- | The corpus was taken over from the CD accompanying the dictionary, and neither the metadata nor the lemmatization and morphological mark-up have been changed. This means that the **annotation does not correspond to the contemporary standards** of annotation of the CNC corpora; yet, on the other hand, it made it possible to **preserve the results of demanding manual lemmatization** that has been carried out before the publication of the dictionary. | + | The corpus was taken over from the CD accompanying the dictionary, and neither the metadata nor the lemmatization and morphological mark-up have been changed. This means that the **annotation does not correspond to the contemporary standards** of annotation of the CNC corpora; yet, on the other hand, it made it possible to **preserve the results of demanding manual lemmatization** that has been carried out for the dictionary. |
<WRAP right 40%> | <WRAP right 40%> |