Mapka: map application for corpora of spoken Czech

The Mapka application is an interactive map of the Czech Republic that supplements our corpora of spoken Czech: ORAL, ORTOFON and DIALEKT. Mapka displays the dialect-based territorial division of the Czech-speaking territory, overviews of typical dialectal features, networks of municipalities connected to the ORAL, ORTOFON and DIALEKT corpora (municipalities where the recordings were made or where the speakers come from) and samples of authentic speech selected from these corpora.

Mapka is a web app that requires no registration. It is available at: https://korpus.cz/mapka/

Regional division

Mapka offers several types of regional division. The most important is the delimitation of the dialectal regions of the Czech language, and a more detailed mode shows finer categories: dialectal subcategories, sections and types. An additional historical division makes it possible to display the Moravian and Silesian land border or the German language islands in the Czech Republic. The map can also display the boundaries of the Czech Republic's administrative units, i.e. districts or regions.


Municipality networks

Mapka can display networks of municipalities connected to the ORAL, ORTOFON and DIALEKT corpora, for example the municipalities where the recordings were made or where the speakers come from. Another option is to display the network of research localities of the Atlas of the Czech Language.

Overviews of dialectal features

The map includes overviews of typical dialectal features for the regions of Bohemia, Moravia and Silesia, for the main dialectal regions and for smaller dialectal areas or types, as well as a description of the language situation in border areas. The descriptions focus on phonological and morphological features.

Samples of spoken language

Each dialectal region comes with a number of samples of authentic speech chosen from the ORAL, ORTOFON and DIALEKT corpora.
Samples chosen from the DIALEKT corpus demonstrate the most typical dialectal features of the given regions. For each dialectal region we selected examples from the old data collection set (the period between the 1950s and the 1980s) and examples from the new data collection set (from the 1990s until the present day). The samples consist of an audio recording and its two transcripts, dialectological and orthographic.
Samples chosen from the ORAL and ORTOFON corpora represent spontaneous speech in everyday informal situations (such as conversation at home, during a meal, at work or in a restaurant). The speakers were selected so that they come from the region where the recording was made. Although they do not speak a dialect, their speech contains regional features.

Searching and creating maps

The application makes it possible to search for municipalities in the Czech Republic and to see where they fall within the division into dialectal regions or administrative units. Additional data can be displayed as well, such as the number of recordings and speakers connected with the municipality. Users can then add these points to the map and so create their own map, which can be downloaded.

How to cite Mapka

Goláňová, H. – Waclawičová, M. – Pejcha, J. – Čapka, T. – Benešová, L. (2023): Mapka: map application for corpora of spoken Czech. Version 2.0. ÚČNK FF UK, Praha. Available at: http://korpus.cz/mapka/.