
Differences

This shows you the differences between two versions of the page.

Old revision: en:pojmy:frekvence [2016/12/15 14:46] – [The use and significance of frequency] vaclavcvrcek
New revision: en:pojmy:frekvence [2016/12/15 14:47] – [Measured and expected frequency] vaclavcvrcek
Line 34:
  * //N// is the size of the corpus, counted in [[en:pojmy:token|tokens]]
  
-We will never know the exact probability of the phenomenon in a population of all manifestations, but it can be approximated by the relative frequency discovered in previous comparisons using different data (other corpora). In the [[en:cnk:syn2005|SYN2005]] corpus we can therefore determine the probability of the occurrence of the [[en:pojmy:lemma|lemma]] //škola// from its frequency (f = 47872) and from the total size of the corpus (N = 122419382):
+We will never know the exact probability of the phenomenon in a population of all manifestations, but it can be approximated by the relative frequency discovered in previous comparisons using different data (other corpora). In the [[en:cnk:syn2005|SYN2005]] corpus we can therefore determine the probability of the occurrence of the [[en:pojmy:lemma|lemma]] //škola// ('school') from its frequency (f = 47872) and from the total size of the corpus (N = 122419382):
  
 $ p(\text{škola}) = \frac{f(\text{škola})}{N} = \frac{47872}{122419382} = 0.0003910492 = 3.91 \cdot 10^{-4} $
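
The calculation above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name ''relative_frequency'' and the constant names are hypothetical and chosen for illustration, while the two numbers are the SYN2005 values quoted in the text.

```python
def relative_frequency(f: int, n: int) -> float:
    """Estimate the probability of an item as f / N.

    f: absolute frequency of the item in the corpus
    n: corpus size in tokens
    """
    if n <= 0:
        raise ValueError("corpus size must be positive")
    if not 0 <= f <= n:
        raise ValueError("frequency must lie between 0 and the corpus size")
    return f / n


# Values quoted in the text for the lemma "škola" in SYN2005.
F_SKOLA = 47872
N_SYN2005 = 122419382

p_skola = relative_frequency(F_SKOLA, N_SYN2005)

# Print the estimate in plain decimal and in scientific notation.
print(f"p(škola) = {p_skola:.10f}")
print(f"p(škola) = {p_skola:.2e}")
```

The input checks are a design choice of this sketch, not part of the formula: they simply guard against an empty corpus or a frequency larger than the corpus itself.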