Differences
This shows you the differences between two versions of the page.
en:cnk:codit [2021/03/24 23:07] – [CODIT corpus] michalkren | en:cnk:codit [2021/03/29 14:18] (current) – [CODIT corpus] michalkren | ||
---|---|---|---|
Line 5: | Line 5: | ||
{{ : | {{ : | ||
- | The CODIT corpus is a balanced diachronic corpus of written Italian of around 33 million tokens; it covers a period ranging from the earliest attestations of the Italian language (i.e. the XIII century) to 1947. Its structure recalls that shown by the [[http:// | + | The CODIT corpus is a balanced diachronic corpus of written Italian of around 33 million tokens. The corpus has been compiled by [[https:// |
The corpus is structured into five subcorpora, depending on the chronological period. The periodization follows that adopted for the MIDIA corpus: it is based on important linguistic and social facts of Italian history. Particularly, | The corpus is structured into five subcorpora, depending on the chronological period. The periodization follows that adopted for the MIDIA corpus: it is based on important linguistic and social facts of Italian history. Particularly, | ||
Line 25: | Line 25: | ||
^ scientifici| | ^ scientifici| | ||
^ teatro| | ^ teatro| | ||
- | ^ TOT| 4, | + | ^ TOTAL| 4, |
**Table 1**: CODIT structure and size | **Table 1**: CODIT structure and size | ||
- | ===== How to cite ===== | + | ===== How to cite CODIT ===== |
<WRAP round tip 70%> | <WRAP round tip 70%> |