Differences

This page shows the differences between the selected revision and the current version of the page.

cnk:aibrown [2025/07/16 15:54] – [How to cite AI-Koditex] michalkren
cnk:aibrown [2025/10/13 14:10] (current) – [How to cite AI-Brown] jirimilicka
Line 6: Line 6:
  
 <WRAP right 40%> <WRAP right 40%>
-^ <fs medium>Name</fs> ^^ <fs medium>AI-Brown</fs> ^
+^ <fs medium>Name</fs> ^^ <fs medium>AI-Brown v1</fs> ^
 ^ Positions ^ Number of positions (tokens) |  27 661 454 |   ^ Positions ^ Number of positions (tokens) |  27 661 454 |  
 ^ ::: ^ Number of positions (excl. punctuation) |  23 975 982 | ^ ::: ^ Number of positions (excl. punctuation) |  23 975 982 |
Line 22: Line 22:
  
  
-The original reference BE21 Corpus was available in vertical format via the Czech National Corpus infrastructure. The preprocessing pipeline included several steps to prepare the data for prompt-based generation. Clean texts and metadata were extracted from the verticals, and structural tags were aligned with the Czech corpus format to ensure cross-linguistic consistency.
+The preprocessing pipeline for the original reference BE21 corpus included several steps to prepare the data for prompt-based generation. Clean texts and metadata were extracted from the verticals, and structural tags were aligned with the Czech corpus format to ensure cross-linguistic consistency.
  
 Each BE21 text sample was split into two parts to support controlled generation: Each BE21 text sample was split into two parts to support controlled generation:
Line 53: Line 53:
  
  
-==== How to cite AI-Koditex ====
+==== How to cite AI-Brown ====
  
 <WRAP round tip 70%> <WRAP round tip 70%>
-Milička, J. – Marklová, A. – Cvrček, V. //AI-Brown//. Department of Linguistics, Faculty of Arts, Charles University, Prague 2025. Available at WWW: www.korpus.cz
+Milička, J. – Marklová, A. – Cvrček, V. (2025): //AI Brown and AI Koditex: LLM-Generated Corpora Comparable to Traditional Corpora of English and Czech Texts//. Arxiv preprint: [[https://arxiv.org/abs/2509.22996]]
+
+Milička, J. – Marklová, A. – Cvrček, V.: //AI-Brown, version 1, 1. 7. 2025//. Department of Linguistics, Faculty of Arts, Charles University, Prague 2025. Available at WWW: www.korpus.cz
 </WRAP> </WRAP>