


NET Corpus

The NET corpus is the first version of a synchronic corpus of Czech semi-official internet communication. It currently consists of two parts: discussion forums and blogs. Its data coverage will be expanded in the future. Because one aim of NET is to map selected areas of internet communication, it tries to capture each selected domain from its beginning. At the same time, future versions of the corpus will include newly published content from these domains, so that NET can capture how they change over time.

Discussion forums

This part of the corpus focuses on discussion forums running on the phpBB platform. For the time being, NET includes neither comments and discussions attached to articles nor social network data. The phpBB forums were sampled at random; the sample size is planned to increase in the future.

Personal blogs

These blogs are mostly a secondary part of news servers or internet magazines (websites with a blog category), so the corpus does not capture corporate or other formally written blogs. The selection consists of the most popular / most frequented representatives of such websites.