Differences
This shows you the differences between two versions of the page.
en:cnk:codit [2021/03/24 23:33] – michalkren | en:cnk:codit [2021/03/29 14:18] (current) – [CODIT corpus] michalkren | ||
---|---|---|---|
Line 5: | Line 5: | ||
{{ : | {{ : | ||
- | The CODIT corpus is a balanced diachronic corpus of written Italian of around 33 million tokens; it covers a period ranging from the earliest attestations of the Italian language (i.e. the XIII century) to 1947. Its structure recalls that shown by the [[http:// | + | The CODIT corpus is a balanced diachronic corpus of written Italian of around 33 million tokens. The corpus has been compiled by [[https:// |
The corpus is structured into five subcorpora, depending on the chronological period. The periodization follows that adopted for the MIDIA corpus: it is based on important linguistic and social facts of Italian history. Particularly, | The corpus is structured into five subcorpora, depending on the chronological period. The periodization follows that adopted for the MIDIA corpus: it is based on important linguistic and social facts of Italian history. Particularly, | ||
Line 29: | Line 29: | ||
**Table 1**: CODIT structure and size | **Table 1**: CODIT structure and size | ||
- | ===== How to cite ===== | + | ===== How to cite CODIT ===== |
<WRAP round tip 70%> | <WRAP round tip 70%> |