Differences
This shows you the differences between two versions of the page.
en:cnk:net [2020/11/12 11:21] – Inserted introductory table jeziorsky | en:cnk:net [2021/02/10 14:15] – [How to cite] michalkren | ||
---|---|---|---|
Line 3: | Line 3: | ||
====== NET Corpus ====== | ====== NET Corpus ====== | ||
- | <WRAP right 35%> | + | <WRAP right 45%> |
- | ^ <fs medium> | + | ^ <fs medium> |
- | ^ [[en: | + | ^ [[en: |
- | ^ ::: ^ Number of [[en: | + | ^ ::: ^ Number of [[en: |
- | ^ ::: ^ Number of [[en: | + | ^ ::: ^ Number of [[en: |
- | ^ [[en: | + | ^ [[en: |
- | ^ ::: ^ Number of [[en: | + | ^ ::: ^ Number of [[en: |
- | ^ ::: ^ Number of paragraphs <p> | 267 026 | | + | ^ ::: ^ Number of paragraphs <p> | 267 026 | 1 817 088 | |
- | ^ ::: ^ Number of sentences <s> | 2 622 636 | | + | ^ ::: ^ Number of sentences <s> | 2 622 636 | 8 905 016 | |
- | ^ Further Information ^ [[en: | + | ^ Further Information ^ [[en: |
- | ^ ::: ^ [[en: | + | ^ ::: ^ [[en: |
- | ^ ::: ^ Year of publication | | + | ^ ::: ^ Year of publication | |
</ | </ | ||
Line 26: | Line 26: | ||
Personal blogs have been downloaded mostly from news servers and web magazines, where they often form a supplementary part of the main website. No corporate or other formal blogs are included in the NET corpus. | Personal blogs have been downloaded mostly from news servers and web magazines, where they often form a supplementary part of the main website. No corporate or other formal blogs are included in the NET corpus. | ||
+ | |||
+ | ===== Version 2 ===== | ||
+ | |||
+ | Version 2 of the NET corpus has been improved mainly in terms of content size. Updated data from 2020 has been added, and the number of scraped blogs and forums has been significantly increased (currently more than 120 domains), which has improved the overall balance of the downloaded data. | ||
===== How to cite ===== | ===== How to cite ===== | ||
<WRAP round tip 70%> | <WRAP round tip 70%> | ||
- | Jeziorský, T.: //NET: korpus polooficiální internetové komunikace// | + | Jeziorský, T.: //NET v1: korpus polooficiální internetové komunikace// |
+ | |||
+ | Jeziorský, T.: //NET v2: korpus polooficiální internetové komunikace// | ||
</ | </ | ||