

en:pojmy:arf [2016/12/12 16:21] – [ARF values] veronikapojarova
en:pojmy:arf [2016/12/12 16:22] (current) – [ARF values] veronikapojarova
Line 29:
 The value of ARF for high frequency expressions with an even distribution of occurrences is approximately a third of their frequency (but specifically only for frequencies over 50 000), however for technical terms occurring only in several documents it can be significantly (10 to 100 times) lower than the frequency. ARF is in comparison to the frequency much less sensitive to the (non-)inclusion of specific texts in the corpus, and therefore corresponds better to the intuitive understanding of "common words".
-ARF became known in the Czech environment thanks to its implementation in the former corpus manager [[en:pojmy:korpusovy_manazer|Manatee/Bonito]] (today in the [[en:manualy:kontext:index|KonText]] interface), and did well in comparison with other commonly used adjusted frequencies and dispersion rates.((Gries, S. T.: //Dispersions and adjusted frequencies in corpora//. In International Journal of Corpus Linguistics 13, 2008, 403–437.)) Apart from this, the ARF was proven to work in practice as the main criterion for determining word commonness in the compilation of both osvědčila jako hlavní kritérium pro stanovení běžnosti slov při sestavování obou nejnovějších frekvenčních slovníků češtiny.
+ARF became known in the Czech environment thanks to its implementation in the former corpus manager [[en:pojmy:korpusovy_manazer|Manatee/Bonito]] (today in the [[en:manualy:kontext:index|KonText]] interface), and did well in comparison with other commonly used adjusted frequencies and dispersion rates.((Gries, S. T.: //Dispersions and adjusted frequencies in corpora//. In International Journal of Corpus Linguistics 13, 2008, 403–437.)) Apart from this, the ARF was proven to work in practice as the main criterion for determining word commonness in the compilation of both the newest frequency dictionaries of Czech.
 --- //M. Křen, V. Cvrček//