Mapka: a map application for working with corpora of spoken Czech

The Mapka application is a supplement to our corpora of spoken Czech. Its main function is to display the regions represented in the DIALEKT corpus on an interactive map of the Czech Republic. In the next phase, data from other corpora of spoken Czech, such as ORTOFON, will be added.

Mapka is a web application that requires no registration. It is available at: https://korpus.cz/mapka/

Regional division

The background map offers several types of regional division. The most important of these is the delimitation of the dialectal regions of the Czech language, and it is also possible to select a mode showing more detailed categories: dialectal subcategories, sections and types. An additional historical division makes it possible to display the Moravian and Silesian land border or the German language islands in the Czech Republic. If needed, the map can also display the boundaries of the Czech Republic's administrative units, i.e. districts or regions.


Furthermore, the background map can display networks of municipalities connected to the DIALEKT corpus. These can be, for example, the municipalities where the dialect recordings included in the current version of the DIALEKT corpus were made, or alternatively all municipalities where data for this corpus were collected. Another option is to display the network of research localities in the Atlas of the Czech Language.

Overviews of dialectal features

The map includes overviews of the typical dialectal features of the 3 regions (Bohemia, Moravia and Silesia) and the 8 basic dialectal regions, as well as a description of the language situation in 2 border areas. We have focused mainly on phonological and morphological features. Examples of dialectal phenomena were selected primarily from the current version of the DIALEKT corpus, with supplementary examples taken from transcripts that will be published in the upcoming version of the corpus.

Samples of dialects in discourse

Each dialectal region is accompanied by two samples of authentic dialect speech by speakers from the DIALEKT corpus, each with an audio recording of the given segment. The samples were chosen to demonstrate the most typical dialectal features of the given region. For each dialectal region we selected one example from the old data collection set (the period between the 1950s and the 1980s) and one from the new data collection set (from the 1990s until the present day). We will gradually add samples of dialects in discourse for other prominent dialectal sections and types within the relevant dialectal regions.

Searching and creating maps

The application makes it possible to search for municipalities in the Czech Republic and to place them within the system of dialectal regions. Users can then add these points to the map and in doing so create their own map. The resulting maps and layers with user-defined markers can be downloaded, saved and printed.

How to cite Mapka

Goláňová, H. – Waclawičová, M. – Pejcha, J. (2021): Mapka: map application for corpora of spoken Czech. Version 1.1. FF UK, Praha. Available at: <http://korpus.cz/mapka/>.