Differences
This shows you the differences between two versions of the page.
en:pojmy:lexikalni_bohatost [2024/05/24 21:35] – alexandrrosen | en:pojmy:lexikalni_bohatost [2024/09/08 14:25] (current) – alexandrrosen | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Lexical Diversity ====== | ====== Lexical Diversity ====== | ||
- | InterCorp release 16ud is annotated by two measures of lexical diversity. They are specified as metadata for each text of sufficient length, for each linguistically annotated language. In KonText, they can be displayed and queried like any other metadata items, such as author or sentence | + | InterCorp release 16ud is annotated by two measures of lexical diversity. They are specified as metadata for each text of sufficient length, for each linguistically annotated language. |
+ | |||
+ | |||
+ | In KonText, they can be displayed and queried like any other metadata items about a text, such as author or text ID. | ||
Line 7: | Line 10: | ||
* lexDivLemma: | * lexDivLemma: | ||
- | The measures are based on the type-token ratio measure. They show the average number of different types (word forms or lemmas) in a moving window of 1000 tokens. If the text has less than 1000 tokens, the measures are not defined. | + | The measures are based on the type-token ratio. They show the average number of different types (word forms or lemmas) in a moving window of 1000 tokens. If the text has less than 1000 tokens, the measures are not defined |
| | ||
+ | ===== References ===== | ||
+ | |||
+ | [[https:// | ||
+ | |||
+ | [[https:// | ||
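
The moving-window measure described above (the average number of different types in a window of 1000 tokens) can be illustrated with a minimal Python sketch. This is not the InterCorp implementation: the function name `moving_window_types`, the one-token step between windows, and the use of `None` for texts shorter than the window are assumptions made for illustration only. The input is a list of word forms (as for a form-based measure) or lemmas (as for lexDivLemma).

```python
from collections import Counter
from typing import Optional, Sequence


def moving_window_types(tokens: Sequence[str], window: int = 1000) -> Optional[float]:
    """Average number of distinct types over all windows of `window` tokens.

    Hypothetical helper: assumes the window advances one token at a time.
    Returns None when the text is shorter than the window, mirroring the
    statement that the measures are not defined for such texts.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    n = len(tokens)
    if n < window:
        return None

    # Count the types in the first window.
    counts = Counter(tokens[:window])
    distinct = len(counts)
    total = distinct

    # Slide the window: drop the leftmost token, add the next one.
    for i in range(window, n):
        outgoing = tokens[i - window]
        counts[outgoing] -= 1
        if counts[outgoing] == 0:
            del counts[outgoing]
            distinct -= 1

        incoming = tokens[i]
        if counts[incoming] == 0:  # a Counter returns 0 for a missing key
            distinct += 1
        counts[incoming] += 1

        total += distinct

    # Number of windows is n - window + 1.
    return total / (n - window + 1)


if __name__ == "__main__":
    sample = "a b a c b a".split()
    print(moving_window_types(sample, window=3))
    print(moving_window_types(["a", "b"], window=3))
```

Dividing the result by the window size gives the corresponding moving-average type-token ratio. Updating the counts incrementally keeps the cost linear in the length of the text instead of recounting every window from scratch.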