Differences
This page shows the differences between the selected revision and the current version of the page.
| cnk:aibrown [2025/06/30 21:33] – vaclavcvrcek | cnk:aibrown [2025/10/13 14:10] (current) – [How to cite AI-Brown] jirimilicka | ||
|---|---|---|---|
| Line 6: | Line 6: | ||
| <WRAP right 40%> | <WRAP right 40%> | ||
| - | ^ <fs medium> | + | ^ <fs medium> |
| ^ Positions ^ Number of positions (tokens) | 27 661 454 | | ^ Positions ^ Number of positions (tokens) | 27 661 454 | | ||
| ^ ::: ^ Number of positions (excl. punctuation) | 23 975 982 | | ^ ::: ^ Number of positions (excl. punctuation) | 23 975 982 | | ||
| Line 22: | Line 22: | ||
| - | The original reference BE21 Corpus was available in vertical format via the Czech National Corpus infrastructure. The preprocessing pipeline | + | The preprocessing pipeline for the original reference BE21 corpus |
| Each BE21 text sample was split into two parts to support controlled generation: | Each BE21 text sample was split into two parts to support controlled generation: | ||
| Line 53: | Line 53: | ||
| - | ==== How to cite AI-Koditex | + | ==== How to cite AI-Brown ==== |
| <WRAP round tip 70%> | <WRAP round tip 70%> | ||
| - | Milička, J. – Marklová, A. – Cvrček, V.// AI-Brown //. Department of Linguistics, | + | Milička, J. – Marklová, A. – Cvrček, V. (2025): //AI Brown and AI Koditex: LLM-Generated Corpora Comparable to Traditional Corpora of English and Czech Texts//. Arxiv preprint: [[https:// |
| + | |||
| + | Milička, J. – Marklová, A. – Cvrček, V.: //AI-Brown, version 1, 1. 7. 2025//. Department of Linguistics, | ||
| </ | </ | ||