Differences
This shows you the differences between two versions of the page.
en:pojmy:syntakticka_komplexita [2024/05/24 21:02] – [Measures for sentences] alexandrrosen | en:pojmy:syntakticka_komplexita [2024/06/21 23:27] (current) – [Measures for sentences] alexandrrosen | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Syntactic Complexity ====== | ====== Syntactic Complexity ====== | ||
- | InterCorp release 16ud is annotated by several measures of syntactic complexity. They are specified as metadata for each sentence and each text, for each linguistically annotated language. | + | InterCorp release 16ud is annotated by several measures of syntactic complexity. They are specified as metadata for each sentence and each text, for each linguistically annotated language. In KonText, they can be displayed and queried like any other metadata items, such as author or sentence ID. |
===== Measures for sentences ===== | ===== Measures for sentences ===== | ||
+ | |||
+ | |||
+ | Two measures (maxNPLength and maxNPDepth) concern noun phrases, defined as subtrees headed by words whose upos is NOUN, PROPN or PRON. | ||
+ | |||
+ | Except for the mdd measure, punctuation and coordination are excluded. | ||
* maxNPLength: | * maxNPLength: | ||
* maxNPDepth: number of embeddings in the noun phrase with the longest chain of embeddings | * maxNPDepth: number of embeddings in the noun phrase with the longest chain of embeddings | ||
* sLength: sentence length = no. of words in the sentence (punctuation excluded) | * sLength: sentence length = no. of words in the sentence (punctuation excluded) | ||
- | * subRatio: subordination ratio = (no. of T-units + no. of clauses) / no. of T-units | + | * subRatio: subordination ratio = (no. of T-units + no. of clauses) / no. of T-units((T-unit is a main clause including all its embedded/ |
* maxTreeDepth: | * maxTreeDepth: | ||
* mdd: mean dependency distance: average number of word boundaries between words and their heads | * mdd: mean dependency distance: average number of word boundaries between words and their heads | ||
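The sLength and mdd definitions above can be illustrated with a minimal sketch. It is not the InterCorp implementation: the `(id, head, upos)` token tuples, the function names, and the decision to skip the root word (which has no head) are assumptions made for this example. Punctuation is kept for mdd because the page excludes it only for the other measures.

```python
from typing import List, Tuple

# Hypothetical token representation: (id, head, upos), with 1-based ids
# and head == 0 for the root, following the CoNLL-U convention.
Token = Tuple[int, int, str]


def s_length(tokens: List[Token]) -> int:
    """Sentence length: number of words, punctuation (upos PUNCT) excluded."""
    return sum(1 for _, _, upos in tokens if upos != "PUNCT")


def mdd(tokens: List[Token]) -> float:
    """Mean dependency distance: average |id - head| over words with a head.

    Assumption: the root (head == 0) is skipped because it has no head.
    """
    distances = [abs(tid - head) for tid, head, _ in tokens if head != 0]
    return sum(distances) / len(distances) if distances else 0.0


# "The dog barked ." with "barked" as the root
sentence = [(1, 2, "DET"), (2, 3, "NOUN"), (3, 0, "VERB"), (4, 3, "PUNCT")]

print(s_length(sentence))  # 3
print(mdd(sentence))       # 1.0, i.e. (1 + 1 + 1) / 3
```

Here the distance between a word and its head is taken to be the difference of their positions, which equals the number of word boundaries between them.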
===== Measures for texts ===== | ===== Measures for texts ===== | ||
+ | |||
+ | The following measures are average values based on the measures for sentences. The mdd value is computed as the average over all words in the text. | ||
+ | |||
+ | * maxNPLengthAvg: | ||
+ | * maxNPDepthAvg: | ||
+ | * sLengthAvg: average sentence length = no. of words in the sentence (punctuation excluded) | ||
+ | * subRatioAvg: | ||
+ | * maxTreeDepthAvg: | ||
+ | * mdd: mean dependency distance: average number of word boundaries between words and their heads | ||
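The difference between averaging per-sentence values and averaging over all words in the text can be shown with a small sketch. The token tuples and function names are hypothetical, and skipping the root word is an assumption of this example, not something the page states.

```python
from typing import List, Tuple

# Hypothetical token representation: (id, head, upos); head == 0 marks the root.
Token = Tuple[int, int, str]
Sentence = List[Token]


def distances(sentence: Sentence) -> List[int]:
    """Dependency distances |id - head| for every word that has a head."""
    return [abs(tid - head) for tid, head, _ in sentence if head != 0]


def s_length_avg(text: List[Sentence]) -> float:
    """sLengthAvg: mean of the per-sentence word counts (PUNCT excluded)."""
    lengths = [sum(1 for _, _, upos in s if upos != "PUNCT") for s in text]
    return sum(lengths) / len(lengths) if lengths else 0.0


def text_mdd(text: List[Sentence]) -> float:
    """Text-level mdd: one average over all words, not a mean of sentence means."""
    pooled = [d for s in text for d in distances(s)]
    return sum(pooled) / len(pooled) if pooled else 0.0


text = [
    [(1, 2, "DET"), (2, 3, "NOUN"), (3, 0, "VERB"), (4, 3, "PUNCT")],
    [(1, 3, "PRON"), (2, 3, "AUX"), (3, 0, "VERB")],
]

print(s_length_avg(text))  # 3.0
print(text_mdd(text))      # pooled distances 1, 1, 1, 2, 1 over 5 words
```

In this toy text the pooled value is 6 / 5, whereas the mean of the two sentence-level values (1.0 and 1.5) would be 1.25, so the two ways of averaging need not agree.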
+ | |||
+ | In addition to the syntactic complexity measures, each text of sufficient length also includes two measures of [[en: