~~NOTOC~~
====== Welcome to the Czech National Corpus wiki ======
The [[http://www.korpus.cz|Czech National Corpus]] (CNC) project was established at the [[https://www.ff.cuni.cz/home/|Faculty of Arts]], [[https://cuni.cz/UKEN-1.html|Charles University]] in 1994 with the aim of creating general-purpose national language corpora. In 2012, the importance of the CNC resources and services was recognised by the [[https://msmt.gov.cz/?lang=2|Ministry of Education, Youth and Sports]], and the CNC has since been funded as a **[[https://www.vyzkumne-infrastruktury.cz/en/social-sciences-and-humanities/cnc/|Large research infrastructure]]** within the framework of the LM programme, currently as project LM2023044 (2023-2026). This enables the CNC to provide comprehensive user services including continuous data mapping of Czech, application development and many-faceted user support.

===== What information will you find here? =====

This wiki serves CNC users not only as a source of information about the CNC (description of public corpora and their documentation, application manuals), but also as a continuously edited database of corpus linguistic knowledge. The main parts of the wiki are the following:
  * Manuals for CNC applications
  * Overview of corpora available within the CNC
  * Tutorial for working with the EEBO in 8 lessons
  * Index of basic concepts in corpus linguistics (in Czech)
  * List of sources and abbreviations (in Czech)

===== Frequently searched pages =====

==== Manuals for CNC applications ====

==== Useful links ====

  * [[en:kurz:zaciname|How to begin working with the CNC (registration, types of access)]]
  * [[en:cnk:citace|How to cite CNC corpora and tools]]
  * [[en:pojmy:regularni_vyrazy|Regular expressions]]

===== What is a corpus? =====

A language corpus is an extensive collection of **authentic textual data** (written or spoken) converted to **electronic form** in a uniform format, meaning that it can easily be **searched** for various linguistic phenomena -- especially words and phrases (collocations). A corpus differs from a plain text archive or database primarily in that it has been carefully compiled with a research purpose in mind (for example, to represent contemporary spoken language, written language, or a part of it such as journalistic texts). A corpus displays linguistic phenomena in their **natural context**, which allows us to do language research based on actual data on a scale that would previously have been unthinkable.

===== User support =====

The [[https://podpora.korpus.cz/projects/poradna/boards|Helpdesk]] is available to all users, who are invited to post questions concerning work with the CNC (creating queries, corpus specifics etc.). Most questions are answered within one working day. User support also covers reporting errors in CNC applications and submitting suggestions for improvement. A link to the form for such reports, "//Report an error//", can be found at the very bottom of every application.

----

[[en:kurz:zaciname|How to begin working with the CNC]] • [[en:manualy:kontext:index|KonText interface manual]] • [[en:cnk:uvod|CNC corpora]] • [[en:kurz:uvod|Tutorial for working with EEBO in 8 lessons]] • [[en:cnk:citace|How to cite]]