Corpus FSC2000

The FSC2000 Corpus is a reference source for, and a complement to, the Frequency Dictionary of Czech (FSČ), published by NLN at the end of 2004. It is based on the SYN2000 corpus; its development is described in Czech here. As a consequence of that process, the texts in the FSC2000 corpus are a subset of the texts in the SYN2000 corpus. The FSC2000 corpus contains exactly 95 854 929 word forms (excluding punctuation marks); the figure of 114 363 813 corpus positions reported by the corpus manager counts both word forms and punctuation marks.
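
The relation between the two size figures can be checked with a little arithmetic. The sketch below assumes that every corpus position is either a word form or a punctuation mark, as the description implies; the variable names are illustrative and do not come from any corpus tool.

```python
# Figures quoted in the corpus description.
WORD_FORMS = 95_854_929          # word forms, punctuation excluded
CORPUS_POSITIONS = 114_363_813   # all positions reported by the corpus manager

# Assumption: a position that is not a word form is a punctuation mark.
punctuation_positions = CORPUS_POSITIONS - WORD_FORMS
punctuation_share = punctuation_positions / CORPUS_POSITIONS

print(punctuation_positions)        # 18508884
print(f"{punctuation_share:.1%}")   # share of positions that are punctuation
```

Under that assumption, roughly one corpus position in six is a punctuation mark, which is why the two size figures differ so noticeably.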

