en:cnk:intercorp:verze10 [2017/12/01 14:17] – [Morphosyntactic annotation] alexandrrosen | en:cnk:intercorp:verze10 [2017/12/15 21:15] – [InterCorp Release 10] alexandrrosen |
---|
====== InterCorp Release 10 ====== | ====== InterCorp Release 10 ====== |
| |
| |
| |
<WRAP right> | |
^ Name ^^ Czech -- core ^ Czech -- collections ^ other -- core ^ other -- collections ^ | ^ Name ^^ Czech -- core ^ Czech -- collections ^ other -- core ^ other -- collections ^ |
^ Positions ^ Number of tokens | 127,413,531 | 118,069,703 | 311,809,130 | 1,551,411,225 | | ^ Positions ^ Number of tokens | 127,413,531 | 118,069,703 | 311,809,130 | 1,551,411,225 | |
^ ::: ^ tagged languages | 23 ^^^^ | ^ ::: ^ tagged languages | 23 ^^^^ |
^ ::: ^ lemmatized languages | 22 ^^^^ | ^ ::: ^ lemmatized languages | 22 ^^^^ |
</WRAP> | |
| |
| |
===== Access to the texts ===== | ===== Access to the texts ===== |
^ Catalan | ✔ | ✔ | [[http://clic.ub.edu/corpus/webfm_send/18|in English]] | | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]] | | ^ Catalan | ✔ | ✔ | [[http://clic.ub.edu/corpus/webfm_send/18|in English]] | | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]] | |
^ Croatian | ✔ | ✔ | [[https://github.com/ffnlp/sethr/blob/master/mte4r-upos.mapping|in English]] | | [[https://github.com/uzh/reldi|ReLDI Tagger]] | | ^ Croatian | ✔ | ✔ | [[https://github.com/ffnlp/sethr/blob/master/mte4r-upos.mapping|in English]] | | [[https://github.com/uzh/reldi|ReLDI Tagger]] | |
^ Czech | ✔ | ✔ | [[http://wiki.korpus.cz/doku.php/seznamy:tagy|in Czech]] and [[http://ufal.mff.cuni.cz/pdt/Morphology_and_Tagging/Doc/hmptagqr.html|in English]] | [[http://ufal.mff.cuni.cz/pdt/Morphology_and_Tagging/Doc/docc0pos.pdf|in English]] | [[http://ufal.mff.cuni.cz/morce/index.php|Morče]] | | ^ Czech | ✔ | ✔ | [[http://wiki.korpus.cz/doku.php/seznamy:tagy|in Czech]] and [[http://ufal.mff.cuni.cz/pdt/Morphology_and_Tagging/Doc/hmptagqr.html|English]] | [[http://ufal.mff.cuni.cz/pdt/Morphology_and_Tagging/Doc/docc0pos.pdf|in English]] | [[http://ufal.mff.cuni.cz/morce/index.php|Morče]] | |
^ Dutch | ✔ | ✔ | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/dutch-tagset.txt|in English]] | [[http://www.inl.nl/tst-centrale/images/stories/producten/documentatie/ehc_handleiding_nl.pdf|in Dutch]] | [[http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/treetagger.html|TreeTagger]] | | ^ Dutch | ✔ | ✔ | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/dutch-tagset.txt|in English]] | [[http://www.inl.nl/tst-centrale/images/stories/producten/documentatie/ehc_handleiding_nl.pdf|in Dutch]] | [[http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/treetagger.html|TreeTagger]] | |
^ English | ✔ | ✔ | [[https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html|in English]] | [[http://utkl.ff.cuni.cz/%7Erosen/public/Penn-Treebank-Tagset.pdf|in English]] + [[http://utkl.ff.cuni.cz/%7Erosen/public/PennTagAdd.html|additions]] | [[http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/treetagger.html|TreeTagger]] | | ^ English | ✔ | ✔ | [[https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html|in English]] | [[http://utkl.ff.cuni.cz/%7Erosen/public/Penn-Treebank-Tagset.pdf|in English]] + [[http://utkl.ff.cuni.cz/%7Erosen/public/PennTagAdd.html|additions]] | [[http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/treetagger.html|TreeTagger]] | |
^ Estonian | ✔ | ✔ | [[http://www.cl.ut.ee/korpused/morfliides/seletus|in Estonian and English]] | | [[http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/treetagger.html|TreeTagger]] | | ^ Estonian | ✔ | ✔ | [[http://www.cl.ut.ee/korpused/morfliides/seletus|in Estonian and English]] | | [[http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/treetagger.html|TreeTagger]] | |
^ Finnish | ✔ | ✔ | [[https://www.sketchengine.co.uk/finntreebank/|OMorFi]] | [[http://www.ling.helsinki.fi/kieliteknologia/tutkimus/treebank/sources/FinnTreeBankManual.pdf|in English]]%%*%%) | +[[https://code.google.com/archive/p/hunpos/|HunPOS]] | | ^ Finnish | ✔ | ✔ | [[https://www.sketchengine.co.uk/finntreebank/|in English]]%%*%%) | [[http://www.ling.helsinki.fi/kieliteknologia/tutkimus/treebank/sources/FinnTreeBankManual.pdf|in English]]%%*%%) | [[http://www.ling.helsinki.fi/kieliteknologia/tutkimus/omor/omorfi/README.shtml|OMorFi]] +[[https://code.google.com/archive/p/hunpos/|HunPOS]] | |
^ French | ✔ | ✔ | [[http://www.ims.uni-stuttgart.de/%7Eschmid/french-tagset.html|in English]] | | [[http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/treetagger.html|TreeTagger]] | | ^ French | ✔ | ✔ | [[http://www.ims.uni-stuttgart.de/%7Eschmid/french-tagset.html|in English]] | | [[http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/treetagger.html|TreeTagger]] | |
^ German | ✔ | ✔ | [[https://www.sketchengine.co.uk/German-rftagger-part-of-speech-tagset/|in English]]%%**%% | [[http://utkl.ff.cuni.cz/%7Erosen/public/stts_guide.pdf|in German]] | [[http://www.cis.uni-muenchen.de/~schmid/tools/RFTagger/|RFTagger]] | | ^ German | ✔ | ✔ | [[https://www.sketchengine.co.uk/German-rftagger-part-of-speech-tagset/|in English]]%%**%% | [[http://utkl.ff.cuni.cz/%7Erosen/public/stts_guide.pdf|in German]] | [[http://www.cis.uni-muenchen.de/~schmid/tools/RFTagger/|RFTagger]] | |