Differences
This shows you the differences between two versions of the page.
en:cnk:fictree [2017/12/15 15:19] – tomasjelinek | en:cnk:fictree [2017/12/18 09:35] – [The composition of the FicTree treebank] luciechlumska | ||
---|---|---|---|
Line 14: | Line 14: | ||
The FicTree treebank consists of eight literary works published in the Czech Republic between 1991 and 2007. The texts in the treebank include six fiction titles, a children’s fiction | The FicTree treebank consists of eight literary works published in the Czech Republic between 1991 and 2007. The texts in the treebank include six fiction titles, a children’s fiction | ||
- | Most of the texts were first published between 1991 and 2007 except one text, published in 1969. | + | Most of the texts were first published between 1991 and 2007 except |
Five texts (80% of all tokens) are original Czech texts, the other three are translations (from German and Slovak). | Five texts (80% of all tokens) are original Czech texts, the other three are translations (from German and Slovak). | ||
Line 32: | Line 32: | ||
The FicTree corpus is available in the same way as other CNC corpora through the [[en: | The FicTree corpus is available in the same way as other CNC corpora through the [[en: | ||
- | The corpus annotation is accessible through a wide range of attributes of each token. The morphologic and annotation and lemmatization is available using the attributes [[seznamy: | + | The corpus annotation is accessible through a wide range of attributes of each token. The morphological |
The syntactic annotation of FicTree can be accessed using several positional attributes (the same as in the corpus SYN2015): | The syntactic annotation of FicTree can be accessed using several positional attributes (the same as in the corpus SYN2015): | ||
Line 39: | Line 39: | ||
* eparent – relative position of the nearest governing content word | * eparent – relative position of the nearest governing content word | ||
* prep – lemma of a preposition governing the token (if any) | * prep – lemma of a preposition governing the token (if any) | ||
- | * p_lemma, p_tag, ep_lemma, ep_tag – tag a lemma of the governing token | + | * p_lemma, p_tag, ep_lemma, ep_tag – tag and lemma of the governing token |
* p_pos, p_case, ep_pos, ep_case – POS and case of the governing token | * p_pos, p_case, ep_pos, ep_case – POS and case of the governing token | ||
* p_afun, ep_afun – syntactic function of the governing token | * p_afun, ep_afun – syntactic function of the governing token |