Welcome to the Czech National Corpus wiki
The Czech National Corpus (CNC) project was established at the Faculty of Arts, Charles University in 1994 with the aim of creating general-purpose national language corpora.
In 2012, the importance of the CNC resources and services was recognised by the Ministry of Education, Youth and Sports, and the CNC has since been funded as a Large research infrastructure within the framework of the LM programme, currently as project LM2023044 (2023–2026). This enables the CNC to provide comprehensive user services including continuous data mapping of Czech, application development and many-faceted user support.
This wiki serves CNC users not only as a source of information about the CNC (description of public corpora and their documentation, application manuals), but also as a continuously edited database of corpus linguistic knowledge. The main parts of the wiki consist of the following:
A language corpus is an extensive collection of authentic textual data (written or spoken) converted to electronic form in a uniform format, so that it can easily be searched for various linguistic phenomena, especially words and phrases (collocations). Corpora differ from a plain text archive or database primarily because they have been carefully compiled with a research purpose in mind (a corpus may, for example, be designed to represent contemporary spoken language, written language, or one part of it, e.g. journalistic texts). A corpus displays linguistic phenomena in their natural context, which allows us to do language research based on actual data on a scale that would previously have been unthinkable.
The Helpdesk is available to all users, who are invited to post questions concerning work with the CNC (creating queries, corpus specifics etc.). Most questions are answered within one working day.
The user support centre also handles error reports for CNC applications and suggestions for improvement. A link to the form for such reports, labelled “Report an error”, can be found at the very bottom of every application.