Differences

This shows you the differences between two versions of the page.

en:pojmy:arf [2016/12/12 15:45] – created veronikapojarova
en:pojmy:arf [2016/12/12 15:51] – [Reduced frequency and ARF] veronikapojarova
Line 9:
 Its definition is as follows: let //f// be the frequency of a given word in the corpus. We divide the positions of the entire corpus into //f// sections of equal size. If the total number of positions in the corpus is divisible by //f//, all sections are the same size; otherwise their sizes may differ by one position. The reduced frequency is then the number of sections in which the given word occurs at least once.
  
-První slovo z našeho příkladu bude mít redukovanou četnost buď 1, padnou-li všechny jeho výskyty do jednoho úseku, nebo 2, jestliže náhodou bude hranice mezi dvěma úseky uprostřed shluku výskytů. Druhé slovo bude mít redukovanou četnost mnohem vyšší. V krajním případě může být teoreticky redukovaná četnost stejná jako četnost, a to právě tehdy, když každý výskyt daného slova padne do jednoho úseku. Prakticky se toto většinou nestává, alespoň ne pro slova s vyšší četností.
+The first word from our example will have a reduced frequency of either 1 (if all of its occurrences fall within one section) or 2 (if the boundary between two sections happens to lie in the middle of a cluster of occurrences). The second word will have a much higher reduced frequency. In the extreme case the reduced frequency can theoretically equal the frequency, which happens exactly when every occurrence of the given word falls into a section of its own. In practice this rarely happens, at least not for words with higher frequencies.
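
The definition and the cases described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `reduced_frequency`, the use of 0-based token positions, and the particular way of placing section boundaries when the corpus size is not divisible by //f// are choices made here, not something the page prescribes.

```python
from bisect import bisect_right


def reduced_frequency(positions, corpus_size):
    """Count the sections (out of f) that contain at least one occurrence.

    positions   -- distinct 0-based token positions of the word (assumption)
    corpus_size -- total number of positions in the corpus
    """
    f = len(positions)
    if f == 0:
        return 0
    # Start position of each of the f sections; section sizes differ
    # by at most one position when corpus_size is not divisible by f.
    starts = [i * corpus_size // f for i in range(f)]
    # Map every occurrence to the index of the section it falls in.
    occupied = {bisect_right(starts, p) - 1 for p in positions}
    return len(occupied)


# Hypothetical corpus of 100 positions, word frequency f = 4,
# so the sections are 0-24, 25-49, 50-74 and 75-99.
clustered = reduced_frequency([10, 11, 12, 13], 100)   # 1
straddling = reduced_frequency([23, 24, 25, 26], 100)  # 2
spread = reduced_frequency([5, 30, 55, 80], 100)       # 4
```

The three calls correspond to the cases in the text: a cluster inside one section gives 1, a cluster cut by a section boundary gives 2, and occurrences that each land in a section of their own give the full frequency.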
  
 The average reduced frequency (ARF) is derived from the reduced frequency by taking into account all possible compilations of the corpus (that is, all possible orderings of the texts in it): it is calculated as the average of the reduced frequency over all of these compilations.
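
For illustration, ARF is commonly computed with a closed-form expression rather than by enumerating arrangements: with //v// = //N// / //f// (corpus size divided by frequency) and //d_i// the distances between consecutive occurrences, taken cyclically, ARF = (1 / //v//) · Σ min(//d_i//, //v//). The sketch below implements that formula. It assumes the average is taken over all starting points of the section grid on a corpus treated as circular; whether this matches the averaging over compilations described above is an assumption, and the function name and 0-based positions are hypothetical.

```python
def average_reduced_frequency(positions, corpus_size):
    """Closed-form ARF: (1 / v) * sum(min(d_i, v)) with v = N / f.

    positions   -- distinct 0-based token positions of the word (assumption)
    corpus_size -- total number of positions in the corpus (N)
    """
    f = len(positions)
    if f == 0:
        return 0.0
    v = corpus_size / f  # average distance between occurrences
    ordered = sorted(positions)
    # Distances between consecutive occurrences ...
    distances = [ordered[i] - ordered[i - 1] for i in range(1, f)]
    # ... plus the wrap-around distance from the last occurrence to the first.
    distances.append(ordered[0] + corpus_size - ordered[-1])
    return sum(min(d, v) for d in distances) / v


# Same hypothetical corpus of 100 positions as before.
arf_clustered = average_reduced_frequency([10, 11, 12, 13], 100)  # about 1.12
arf_spread = average_reduced_frequency([5, 30, 55, 80], 100)      # 4.0
```

A tightly clustered word ends up with an ARF close to 1, while a perfectly evenly spread word keeps an ARF equal to its plain frequency.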