en:cnk:intercorp:verze11 [2018/10/02 21:05] – [Structural attributes] alexandrrosen | en:cnk:intercorp:verze11 [2019/09/16 21:08] – [Morphosyntactic annotation] alexandrrosen |
---|
^ ::: ^ Number of word forms | 106,898,538 | 88,872,779 | 283,075,338 | 1,225,361,750 | | ^ ::: ^ Number of word forms | 106,898,538 | 88,872,779 | 283,075,338 | 1,225,361,750 | |
^ Structural attributes ^ Number of documents | 1,564 | 28 | 3,494 | 261 | | ^ Structural attributes ^ Number of documents | 1,564 | 28 | 3,494 | 261 | |
^ ::: ^ Number of text | 1,507 | 111,672 | 3,232 | 1,841,341 | | ^ ::: ^ Number of texts | 1,507 | 111,672 | 3,232 | 1,841,341 | |
^ ::: ^ Number of sentences | 9,193,433 | 13,556,382 | 21,000,997 | 142,734,659 | | ^ ::: ^ Number of sentences | 9,193,433 | 13,556,382 | 21,000,997 | 142,734,659 | |
^ Further information ^ reference | YES ^^^^ | ^ Further information ^ reference | YES ^^^^ |
When citing a specific part of InterCorp please use the reference displayed in KonText in the corpus description, e.g. as: | When citing a specific part of InterCorp please use the reference displayed in KonText in the corpus description, e.g. as: |
| |
Rosen, A., Vavřín, M., Zasina, A. J. (2018) //The InterCorp Corpus – Czech((Insert actually used languages.)), version 11 of 5 October 2018//. Institute of the Czech National Corpus, Charles University, Prague 2018. Available on-line: http://www.korpus.cz | Rosen, A., Vavřín, M., Zasina, A. J. (2018). //The InterCorp Corpus – Czech((Insert actually used languages.)), version 11 of 19 October 2018//. Institute of the Czech National Corpus, Charles University, Prague 2018. Available on-line: http://www.korpus.cz |
| |
</WRAP> | </WRAP> |
^ Language ^ Tags ^ Lemmas ^ Brief description ^ Detailed description ^ Tool ^ | ^ Language ^ Tags ^ Lemmas ^ Brief description ^ Detailed description ^ Tool ^ |
^ Belarusian | ✔ | ✔ | | [[http://universaldependencies.org/docs/u/pos/index.html|in English]]%%****%%) | [[https://web.archive.org/web/20170122231904/http://lindat.mff.cuni.cz/services/udpipe/api-reference.php|UDPipe]] | | ^ Belarusian | ✔ | ✔ | | [[http://universaldependencies.org/docs/u/pos/index.html|in English]]%%****%%) | [[https://web.archive.org/web/20170122231904/http://lindat.mff.cuni.cz/services/udpipe/api-reference.php|UDPipe]] | |
^ Bulgarian | ✔ | ✔ | | [[http://www.bultreebank.org/TechRep/BTB-TR03.pdf|in English]] | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]] | | ^ Bulgarian | ✔ | ✔ | | [[http://bultreebank.org/en/resources/short-description-dependency-part-bultreebank-bultreebank-dp/btb-tr03-2/|in English]] | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]] | |
^ Catalan | ✔ | ✔ | [[http://clic.ub.edu/corpus/webfm_send/18|in English]] | | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]] | | ^ Catalan | ✔ | ✔ | [[http://clic.ub.edu/corpus/webfm_send/18|in English]] | | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]] | |
^ Croatian | ✔ | ✔ | [[https://github.com/ffnlp/sethr/blob/master/mte4r-upos.mapping|in English]] | | [[https://github.com/uzh/reldi|ReLDI Tagger]] | | ^ Croatian | ✔ | ✔ | [[https://github.com/ffnlp/sethr/blob/master/mte4r-upos.mapping|in English]] | | [[https://github.com/uzh/reldi|ReLDI Tagger]] | |
^ French | ✔ | ✔ | [[http://www.ims.uni-stuttgart.de/%7Eschmid/french-tagset.html|in English]] | | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]] | | ^ French | ✔ | ✔ | [[http://www.ims.uni-stuttgart.de/%7Eschmid/french-tagset.html|in English]] | | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]] | |
^ German | ✔ | ✔ | [[https://www.sketchengine.co.uk/German-rftagger-part-of-speech-tagset/|in English]]%%**%% | [[http://utkl.ff.cuni.cz/%7Erosen/public/stts_guide.pdf|in German]] | [[http://www.cis.uni-muenchen.de/~schmid/tools/RFTagger/|RFTagger]] | | ^ German | ✔ | ✔ | [[https://www.sketchengine.co.uk/German-rftagger-part-of-speech-tagset/|in English]]%%**%% | [[http://utkl.ff.cuni.cz/%7Erosen/public/stts_guide.pdf|in German]] | [[http://www.cis.uni-muenchen.de/~schmid/tools/RFTagger/|RFTagger]] | |
^ Hungarian | ✔ | | [[http://nl.ijs.si/ME/Vault/V3/msd/html/msd.html#SECTION05400000000000000000|in English]] | [[http://www.cis.uni-muenchen.de/~schmid/tools/RFTagger/|RFTagger]] | | ^ Hungarian | ✔ | | [[http://nl.ijs.si/ME/Vault/V3/msd/html/msd.html#SECTION05400000000000000000|in English]] | |[[http://www.cis.uni-muenchen.de/~schmid/tools/RFTagger/|RFTagger]] | |
^ Icelandic | ✔ | ✔ | [[http://www.malfong.is/files/ot_tagset_files_en.pdf|in English]] | | [[http://www.ling.su.se/english/nlp/tools/stagger/stagger-the-stockholm-tagger-1.98986|IceStagger]] | | ^ Icelandic | ✔ | ✔ | [[http://www.malfong.is/files/ot_tagset_files_en.pdf|in English]] | | [[http://www.ling.su.se/english/nlp/tools/stagger/stagger-the-stockholm-tagger-1.98986|IceStagger]] | |
^ Italian | ✔ | ✔ | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/italian-tagset.txt|in English]] | | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]] | | ^ Italian | ✔ | ✔ | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/italian-tagset.txt|in English]] | | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]] | |
^Structure^Attribute^Description^Values^ | ^Structure^Attribute^Description^Values^ |
|doc|doc.id|document identifier| author's_last_name-shortened_title / _ACQUIS / _EUROPARL / _PRESSEUROP_year / _SUBTITLES / _SYNDICATE_year / _OT / _NT | | |doc|doc.id|document identifier| author's_last_name-shortened_title / _ACQUIS / _EUROPARL / _PRESSEUROP_year / _SUBTITLES / _SYNDICATE_year / _OT / _NT | |
|text|text.id|text identifier|author's_last_name-shortened_title:0 / _ACQUIS:number / _EUROPARL:number / _PRESSEUROP:number / _SUBTITLES:number / _SYNDICATE:name / _OT:book / _NT:book | | |text|text.id|text identifier|author's_last_name-shortened_title:0 / _ACQUIS:number / _EUROPARL:number / _PRESSEUROP:number / _SUBTITLES:number / _SYNDICATE_year:name / _OT:book / _NT:book | |
| |text.author|author|last name, first name| | | |text.author|author|last name, first name| |
| |text.title|full title|text| | | |text.title|full title|text| |
| |text.volume|volume number|number| | | |text.volume|volume number|number| |
| |text.pages|number of pages|number| | | |text.pages|number of pages|number| |
| |text.lang_var|language variety|text| | | |text.lang_var|language variety|de-AT / de-CH / de-DE / en-AU / en-CA / en-GB / en-UM / en-US / es-ES / es-MX / es-PE / fr-BE / fr-FR / it-CH / it-IT / nl-BE / nl-NL / pt-BR / pt-PT / sr-Latn-RS / sr-Cyrl-RS| |
| |text.wordcount|number of words|number| | | |text.wordcount|number of words|number| |
|div|div.id|unique division identifier (used in the Bible)| _NT / _OT:chapter | | |div|div.id|division identifier (Bible)| _NT / _OT:chapter | |
| |div.type|division type|chapter| | | |div.type|division type|chapter| |
|p|p.id|unique paragraph identifier|doc:text:div:par| | |p|p.id|paragraph identifier|doc:text:div:par| |
|s|s.id|unique sentence identifier|doc:text:div:par:sent| | |s|s.id|sentence identifier|doc:text:div:par:sent| |
|hi|hi.rend|typeface|italic / bold / bold italic| | |hi|hi.rend|typeface|italic / bold / bold italic| |
|lb|lb.id|verse beginning identifier (in the Bible)|book:chapter:verse| | |lb|lb.id|verse identifier (Bible)|book:chapter:verse| |
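The identifiers in the table above are hierarchical: a sentence identifier (''s.id'') is listed as ''doc:text:div:par:sent'', and a paragraph identifier (''p.id'') as ''doc:text:div:par''. A minimal Python sketch of splitting such an identifier follows. It assumes that the five components are separated by colons and that no component contains a colon itself; the helper names and the sample identifier are hypothetical and are not taken from the corpus.

```python
from typing import NamedTuple


class SentenceId(NamedTuple):
    """Components of an s.id value, in the order doc:text:div:par:sent."""
    doc: str
    text: str
    div: str
    par: str
    sent: str


def parse_sentence_id(s_id: str) -> SentenceId:
    """Split an s.id value into its five colon-separated components.

    Assumption: no component contains a colon itself.
    """
    parts = s_id.split(":")
    if len(parts) != 5:
        raise ValueError(f"expected 5 colon-separated fields, got {len(parts)}: {s_id!r}")
    return SentenceId(*parts)


def paragraph_id(s_id: str) -> str:
    """Derive the enclosing p.id (doc:text:div:par) from an s.id value."""
    return ":".join(parse_sentence_id(s_id)[:4])


# Hypothetical identifier, made up for illustration only.
sample = "_ACQUIS:123:0:4:2"
print(parse_sentence_id(sample))
print(paragraph_id(sample))
```

Dropping the last component yields the identifier of the enclosing paragraph, which can be handy when grouping sentences from an exported concordance.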
===== Acknowledgements ===== | ===== Acknowledgements ===== |
| |