====== ipm ======

The abbreviations **ipm** (instances per million) and **ppm** (parts per million) are measures of relative [[en:pojmy:frekvence|frequency]]. They express the average number of occurrences of a unit or word in a hypothetical text or corpus of 1 million words. The value is obtained by dividing the absolute frequency by the corpus size and multiplying the result by one million.

E.g. the [[en:pojmy:word|node form]] //běžeckých// occurs 208 times in the hundred-million-word corpus [[en:cnk:syn2010|SYN2010]], which is equivalent to 1.72 ipm, i.e. 1.72 occurrences per million words.

===== Use of ipm/ppm =====

The main advantage of relativizing frequency by corpus size is that it lets us compare figures from corpora of different sizes. When corpora differ in size, absolute frequencies can be misleading. The [[en:pojmy:word|node form]] //stromek// reaches the following values in the corpora [[en:cnk:syn2010|SYN2010]] and [[en:cnk:oral2008|ORAL2008]]:

^ ^ SYN2010 ^ ORAL2008 ^
| Abs. frequency | 440 | 6 |
| Rel. frequency (in ipm) | 3.62 | 4.45 |

Although the absolute frequency is far higher in SYN2010, once the size of the corpora is taken into account (SYN2010 has 122 million positions, whereas ORAL2008 has only 1.35 million), we find that the word //stromek// is relatively more frequent in the ORAL2008 corpus.

==== Relevant links ====

[[en:pojmy:arf|ARF]] • [[en:pojmy:frekvence|Frequency]] • [[en:kurz:chvala_korpusu#sociolingvisticke_promenneucitele_zaci_a_vek|Example of ipm comparison]]
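
The comparison of //stromek// in SYN2010 and ORAL2008 described in the section on the use of ipm/ppm can be reproduced with a few lines of Python. This is only a minimal sketch: the function name ''ipm'' and the constant names are hypothetical, and the corpus sizes are the rounded figures quoted in the text (122 million and 1.35 million positions), so the results may differ in the second decimal place from the values in the table.

```python
def ipm(abs_freq: int, corpus_size: int) -> float:
    """Return the relative frequency in instances per million (ipm)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    # Scale the absolute frequency to a hypothetical corpus of 1 million positions.
    return abs_freq / corpus_size * 1_000_000


# Assumption: rounded corpus sizes as quoted in the text, not exact position counts.
SYN2010_SIZE = 122_000_000
ORAL2008_SIZE = 1_350_000

stromek_syn = ipm(440, SYN2010_SIZE)
stromek_oral = ipm(6, ORAL2008_SIZE)

print(f"SYN2010:  {stromek_syn:.2f} ipm")
print(f"ORAL2008: {stromek_oral:.2f} ipm")
print("More frequent in:", "ORAL2008" if stromek_oral > stromek_syn else "SYN2010")
```

Even though the absolute count in SYN2010 is more than seventy times higher, the normalized value comes out higher for ORAL2008, which is the point the table makes.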