Differences
This shows you the differences between two versions of the page.
en:pojmy:syntakticka_komplexita [2024/05/24 20:28] – created alexandrrosen | en:pojmy:syntakticka_komplexita [2024/06/21 23:27] (current) – [Measures for sentences] alexandrrosen | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Syntactic Complexity ====== | ====== Syntactic Complexity ====== | ||
- | InterCorp release 16ud is annotated by several measures of syntactic complexity. They are specified as metadata for each sentence and each text, for each linguistically annotated language. | + | InterCorp release 16ud is annotated by several |
+ | ===== Measures for sentences ===== | ||
- | | + | |
- | * maxNPDepth | + | Two measures (maxNPLength and maxNPDepth) concern noun phrases, defined as subtrees headed by words whose upos is NOUN, PROPN or PRON. |
- | * sLength | + | |
- | * subRatio | + | Except for the mdd measure, punctuation and coordination are excluded. |
- | * maxTreeDepth | + | |
- | * mdd | + | |
- | + | * maxNPDepth: number of embeddings in the noun phrase with the longest chain of embeddings | |
+ | * sLength: sentence length = no. of words in the sentence (punctuation excluded) | ||
+ | * subRatio: subordination ratio = (no. of T-units + no. of clauses) / no. of T-units((T-unit is a main clause including all its embedded/ | ||
+ | * maxTreeDepth: maximum number of clause embeddings (coordination does not count) | ||
+ | * mdd: mean dependency distance: average number of word boundaries between words and their heads | ||
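The sentence-level definitions above can be illustrated with a minimal sketch. It assumes a hypothetical token representation `(id, upos, head)` with 1-based ids and head 0 for the root, which is not taken from the InterCorp documentation. It also assumes that the root token is skipped when computing mdd, and it takes the T-unit and clause counts as given inputs rather than deriving them from the tree. The function names are illustrative only.

```python
from typing import List, Tuple

# Hypothetical token representation (an assumption, not the InterCorp format):
# (id, upos, head), with 1-based ids and head == 0 marking the root.
Token = Tuple[int, str, int]


def s_length(tokens: List[Token]) -> int:
    """Sentence length: number of words, punctuation excluded."""
    return sum(1 for _, upos, _ in tokens if upos != "PUNCT")


def mdd(tokens: List[Token]) -> float:
    """Mean dependency distance: average |id - head| over all tokens.

    Punctuation is kept, as stated for mdd above. Skipping the root
    (which has no head) is an assumption of this sketch.
    """
    distances = [abs(i - head) for i, _, head in tokens if head != 0]
    return sum(distances) / len(distances) if distances else 0.0


def sub_ratio(t_units: int, clauses: int) -> float:
    """Subordination ratio exactly as written in the list above."""
    return (t_units + clauses) / t_units


# Toy parse of "The dog barked loudly ."
sentence = [
    (1, "DET", 2),
    (2, "NOUN", 3),
    (3, "VERB", 0),
    (4, "ADV", 3),
    (5, "PUNCT", 3),
]

print(s_length(sentence))  # 4
print(mdd(sentence))       # 1.25
```

In the toy sentence the four non-root tokens lie 1, 1, 1 and 2 positions from their heads, which gives 5 / 4 = 1.25.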
+ | |||
+ | ===== Measures for texts ===== | ||
+ | |||
+ | The following measures are average values based on the measures for sentences. The mdd value is computed as the average over all words in the text. | ||
+ | |||
+ | * maxNPLengthAvg: | ||
+ | * maxNPDepthAvg: | ||
+ | * sLengthAvg: average sentence length = no. of words in the sentence (punctuation excluded) | ||
+ | * subRatioAvg: | ||
+ | * maxTreeDepthAvg: | ||
+ | * mdd: mean dependency distance: average number of word boundaries between words and their heads | ||
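The difference between averaging per sentence and averaging over all words can be sketched as follows. The sketch assumes a hypothetical reduced token representation `(id, head)` with head 0 for the root, and again assumes the root is skipped. It contrasts the pooled text-level mdd described above with a plain mean of the sentence-level mdd values, which generally gives a different number.

```python
from statistics import mean
from typing import List, Tuple

# Hypothetical reduced token representation (an assumption of this sketch):
# (id, head), with 1-based ids and head == 0 marking the root.
Arc = Tuple[int, int]


def distances(sentence: List[Arc]) -> List[int]:
    """Dependency distances of all non-root tokens in one sentence."""
    return [abs(i - head) for i, head in sentence if head != 0]


def text_mdd(sentences: List[List[Arc]]) -> float:
    """Text-level mdd: one average over the words of the whole text."""
    pooled = [d for sentence in sentences for d in distances(sentence)]
    return mean(pooled)


def mean_of_sentence_mdds(sentences: List[List[Arc]]) -> float:
    """For contrast only: the mean of the per-sentence mdd values."""
    return mean(mean(distances(sentence)) for sentence in sentences)


def s_length_avg(sentence_lengths: List[int]) -> float:
    """Average sentence length from precomputed sLength values."""
    return mean(sentence_lengths)


text = [
    [(1, 2), (2, 0)],                  # distances: 1
    [(1, 4), (2, 4), (3, 4), (4, 0)],  # distances: 3, 2, 1
]

print(text_mdd(text))               # 1.75
print(mean_of_sentence_mdds(text))  # 1.5
```

Pooling weights every word equally, so longer sentences contribute more to the text-level mdd than they would in a mean of sentence means.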
+ | |||
+ | In addition to the syntactic complexity measures, each text of sufficient length also includes two measures of [[en: