en:cnk:intercorp:verze12 [2019/12/20 00:28] – [Morphosyntactic annotation] alexandrrosen | en:cnk:intercorp:verze12 [2020/02/12 21:12] (current) – [InterCorp Release 12] alexandrrosen |
---|
^ ::: ^ publication date | 2019 ^^^^ | ^ ::: ^ publication date | 2019 ^^^^ |
^ ::: ^ foreign languages | 40 ^^^^ | ^ ::: ^ foreign languages | 40 ^^^^ |
^ ::: ^ tagged languages | 26 ^^^^ | ^ ::: ^ tagged languages | 27 ^^^^ |
^ ::: ^ lemmatized languages | 25 ^^^^ | ^ ::: ^ lemmatized languages | 25 ^^^^ |
| |
| zh | Chinese | 0 | 240 | 0 | 0 | 0 | 2,247 | 0 | 2,487 | | | zh | Chinese | 0 | 240 | 0 | 0 | 0 | 2,247 | 0 | 2,487 | |
| **Subtotal** | | 303,772 | 27,616 | 24,658 | 406,459 | 263,864 | 489,170 | 11,102 | 1,526,633 | | | **Subtotal** | | 303,772 | 27,616 | 24,658 | 406,459 | 263,864 | 489,170 | 11,102 | 1,526,633 | |
| cs | Czech | 110,573 | 4,351 | 2,310 | 19,085 | 12,908 | 50,604 | 562 | 200,393 | | | cs | Czech | 110,573 | 4,351 | 2,310 | 19,085 | 12,908 | 50,604 | 562 | 200,393 | |
| **TOTAL** | | 414,345 | 31,967 | 26,968 | 425,543 | 276,772 | 539,774 | 11,664 | 1,727,026 | | | **TOTAL** | | 414,345 | 31,967 | 26,968 | 425,543 | 276,772 | 539,774 | 11,664 | 1,727,026 | |
| |
| |text.volume|volume number|number| | | |text.volume|volume number|number| |
| |text.pages|number of pages|number| | | |text.pages|number of pages|number| |
| |text.lang_var|language variety|de-AT / de-CH / de-DE / en-AU / en-CA / en-GB / en-UM / en-US / es-ES / es-MX / es-PE / fr-BE / fr-FR / it-CH / it-IT / nl-BE / nl-NL / pt-BR / pt-PT / sr-Latn-RS / sy-Cyrl-RS| | | |text.lang_var|language variety|de-AT / de-CH / de-DE / en-AU / en-CA / en-GB / en-UM / en-US / es-ES / es-MX / es-PE / fr-BE / fr-FR / it-CH / it-IT / nl-BE / nl-NL / pt-BR / pt-PT / sr-RS | |
| |text.wordcount|number of words|number| | | |text.wordcount|number of words|number| |
|div|div.id|division identifier (Bible)| _NT / _OT:chapter | | |div|div.id|division identifier (Bible)| _NT / _OT:chapter | |
* [[http://ufal.mff.cuni.cz/morfflex|MorfFlex]], [[http://ufal.mff.cuni.cz/morce/index.php|Morče]] and [[https://is.cuni.cz/webapps/zzp/download/140018093/?back_id=10|LanGr]] for Czech | * [[http://ufal.mff.cuni.cz/morfflex|MorfFlex]], [[http://ufal.mff.cuni.cz/morce/index.php|Morče]] and [[https://is.cuni.cz/webapps/zzp/download/140018093/?back_id=10|LanGr]] for Czech |
* [[http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/treetagger.html|TreeTagger]] for Bulgarian, Dutch, English, Estonian (thanks to Helmut Schmid), French, Italian, Portuguese (thanks to Pablo Gamallo), Russian and Spanish | * [[http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/treetagger.html|TreeTagger]] for Bulgarian, Dutch, English, Estonian (thanks to Helmut Schmid), French, Italian, Portuguese (thanks to Pablo Gamallo), Russian and Spanish |
* [[http://sgjp.pl/morfeusz/|Morfeusz]] and [[http://nlp.pwr.wroc.pl/takipi/|TaKIPI]] for Polish | * [[http://sgjp.pl/morfeusz/|Morfeusz]] and [[https://github.com/kwrobel-nlp/krnnt|KRNNT]] for Polish |
* [[http://code.google.com/p/hunpos/|HunPOS]] for Hungarian and other languages | * [[http://code.google.com/p/hunpos/|HunPOS]] for Hungarian and other languages |
* [[http://conference.ui.sav.sk/wikt2010/papers/01_garabik_f.pdf|Tagger for Slovak]] (thanks to Radovan Garabík) | * [[http://conference.ui.sav.sk/wikt2010/papers/01_garabik_f.pdf|Tagger for Slovak]] (thanks to Radovan Garabík) |
* [[https://peteris.rocks/blog/latvian-part-of-speech-tagging/|LVTagger]] for Latvian (thanks to Pēteris Paikens and Michal Škrabal) | * [[https://peteris.rocks/blog/latvian-part-of-speech-tagging/|LVTagger]] for Latvian (thanks to Pēteris Paikens and Michal Škrabal) |
* [[http://ufal.mff.cuni.cz/udpipe|UD Pipe]] for Belarusian and Ukrainian (thanks to Bohdan Moskalevskyi) | * [[http://ufal.mff.cuni.cz/udpipe|UD Pipe]] for Belarusian and Ukrainian (thanks to Bohdan Moskalevskyi) |
* [[https://taku910.github.io/mecab/|MeCab]] and [[https://osdn.net/projects/unidic/|Unidic]] for Japanese | * [[https://taku910.github.io/mecab/|MeCab]] and [[https://osdn.net/projects/unidic/|Unidic]] for Japanese (thanks to Adam Nohejl) |
| * [[https://www.sutd.edu.sg/cmsresource/faculty/yuezhang/zpar.html|ZPar]] for Chinese (thanks to Vlastimil Dobečka) |