Differences
This shows you the differences between two versions of the page.
en:cnk:syn2010 [2016/12/11 11:35] – [Composition of SYN2010] veronikapojarova | en:cnk:syn2010 [2016/12/11 16:26] – [Corpus SYN2010] veronikapojarova |
---|
SYN2010 is a synchronic representative corpus of written Czech comprising 100 million tokens. It is a sequel to the corpora [[en:cnk:SYN2000]] and [[en:cnk:SYN2005]] and together with them forms a series of synchronic representative corpora that cover three successive periods. | SYN2010 is a synchronic representative corpus of written Czech comprising 100 million tokens. It is a sequel to the corpora [[en:cnk:SYN2000]] and [[en:cnk:SYN2005]] and together with them forms a series of synchronic representative corpora that cover three successive periods. |
**All corpora contain different texts and are therefore disjoint**. The basic characteristic features of the SYN2010 are identical to those of the corpus [[en:SYN2005|SYN2005]], which is predominantly related to the same conception of [[en:pojmy:reprezentativnost|representativeness]] based on the reception of written language and the resulting composition of the corpus. The SYN2010 corpus is [[en:pojmy:lemma|lemmatized]] and [[en:pojmy:tag|morphologically tagged]]. | **All corpora contain different texts and are therefore disjunctive**. The basic characteristic features of the SYN2010 are identical to those of the corpus [[en:SYN2005|SYN2005]], which is predominantly related to the same conception of [[en:pojmy:reprezentativnost|representativeness]] based on the reception of written language and the resulting composition of the corpus. The SYN2010 corpus is [[en:pojmy:lemma|lemmatized]] and [[en:pojmy:tag|morphologically tagged]]. |