Differences
This shows you the differences between two versions of the page.
| en:cnk:uvod [2025/07/16 21:56] – [Corpora of the Czech National Corpus project] michalkren | en:cnk:uvod [2025/10/03 18:18] (current) – [Corpora of the Czech National Corpus project] michalkren |
|---|
| | [[en:cnk:nkjp|NKJP_1M]] | 1M | ✓ | ✓ | 2018 | manually annotated one-million subcorpus of the National Corpus of Polish | | | [[en:cnk:nkjp|NKJP_1M]] | 1M | ✓ | ✓ | 2018 | manually annotated one-million subcorpus of the National Corpus of Polish | |
| | [[en:cnk:obc|OBC]] | 24M | ✗ | ✓ | 2021 | [[http://fedora.clarin-d.uni-saarland.de/oldbailey/index.html|Old Bailey Corpus]], trial proceedings from 1720--1913 | | | [[en:cnk:obc|OBC]] | 24M | ✗ | ✓ | 2021 | [[http://fedora.clarin-d.uni-saarland.de/oldbailey/index.html|Old Bailey Corpus]], trial proceedings from 1720--1913 | |
| | ^ <fs large>Corpora generated by large language models (LLMs)</fs> ^^^^^^ |
| | ^ corpus ^ size (word count) ^ lemmas ^ morphological tags ^ year ^ characteristic features ^ |
| | | [[en:cnk:aibrown|AI Brown]] | 27M | ✓ | ✓ | 2025 | multi-genre corpus of English texts produced by LLMs | |
| | | [[en:cnk:aikoditex|AI Koditex]] | 21M | ✓ | ✓ | 2025 | multi-genre corpus of Czech texts produced by LLMs | |