AplikaceAplikace
Nastavení

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
en:cnk:intercorp:verze16 [2024/04/18 17:20] – [Morphosyntactic annotation] jankoceken:cnk:intercorp:verze16 [2025/01/17 18:20] (current) – [Known bugs] alexandrrosen
Line 173: Line 173:
 ^ Icelandic |  ✔  |  ✔  |  [[http://www.malfong.is/files/ot_tagset_files_en.pdf|in English]]    [[http://nlp.cs.ru.is/pdf/Tagset.pdf|in English]]  |  [[https://www.korpus.cz/kontext/wordlist/result?q=~OSQqSoscsiiG|list]]  | [[http://www.ling.su.se/english/nlp/tools/stagger/stagger-the-stockholm-tagger-1.98986|IceStagger]]  | ^ Icelandic |  ✔  |  ✔  |  [[http://www.malfong.is/files/ot_tagset_files_en.pdf|in English]]    [[http://nlp.cs.ru.is/pdf/Tagset.pdf|in English]]  |  [[https://www.korpus.cz/kontext/wordlist/result?q=~OSQqSoscsiiG|list]]  | [[http://www.ling.su.se/english/nlp/tools/stagger/stagger-the-stockholm-tagger-1.98986|IceStagger]]  |
 ^ Italian |  ✔  |  ✔  |  [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/italian-tagset.txt|in English]]        [[https://www.korpus.cz/kontext/wordlist/result?q=~AG82UCM6swiK|list]]  |[[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]]  | ^ Italian |  ✔  |  ✔  |  [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/italian-tagset.txt|in English]]        [[https://www.korpus.cz/kontext/wordlist/result?q=~AG82UCM6swiK|list]]  |[[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]]  |
-^ Japanese |  ✔  |  ✔  |  [[https://www.sketchengine.eu/tagset-jp-mecab/|in English]]        [[https://www.korpus.cz/kontext/wordlist/result?q=~v8EQwWqiygi|list]]  | [[https://taku910.github.io/mecab/|MeCab]] + [[https://unidic.ninjal.ac.jp|Unidic]]  |+^ Japanese |  ✔  |  ✔  |  [[https://www.sketchengine.eu/tagset-jp-mecab/|in English]]        [[https://www.korpus.cz/kontext/wordlist/result?q=~v8EQwWqiygis|list]]  | [[https://taku910.github.io/mecab/|MeCab]] + [[https://unidic.ninjal.ac.jp|Unidic]]  |
 ^ Latvian |  ✔  |  ✔  |   [[http://www.semti-kamols.lv/doc_upl/TagSet.html|in Latvian]]  |      [[https://www.korpus.cz/kontext/wordlist/result?q=~NiGIW6iec6eq|list]]  | [[https://peteris.rocks/blog/latvian-part-of-speech-tagging|LVTagger]]  | ^ Latvian |  ✔  |  ✔  |   [[http://www.semti-kamols.lv/doc_upl/TagSet.html|in Latvian]]  |      [[https://www.korpus.cz/kontext/wordlist/result?q=~NiGIW6iec6eq|list]]  | [[https://peteris.rocks/blog/latvian-part-of-speech-tagging|LVTagger]]  |
 ^ Norwegian |  ✔  |  ✔  |  [[http://universaldependencies.org/docs/u/pos/index.html|in English]]%%****%%)  |  [[https://universaldependencies.org/no/index.html#morphology|in English]]%%****%%)  |    [[https://www.korpus.cz/kontext/wordlist/result?q=~I6aemQOK8yiU|list]]  | [[https://web.archive.org/web/20170122231904/http://lindat.mff.cuni.cz/services/udpipe/api-reference.php|UDPipe]]  | ^ Norwegian |  ✔  |  ✔  |  [[http://universaldependencies.org/docs/u/pos/index.html|in English]]%%****%%)  |  [[https://universaldependencies.org/no/index.html#morphology|in English]]%%****%%)  |    [[https://www.korpus.cz/kontext/wordlist/result?q=~I6aemQOK8yiU|list]]  | [[https://web.archive.org/web/20170122231904/http://lindat.mff.cuni.cz/services/udpipe/api-reference.php|UDPipe]]  |
 ^ Polish |  ✔  |  ✔  |  [[http://nkjp.pl/poliqarp/help/ense2.html#x3-20002|in English]] and [[http://nkjp.pl/poliqarp/help/plse2.html#x3-20002|Polish]]  |  [[http://nlp.ipipan.waw.pl/%7Eadamp/Papers/2003-eacl-ws12/|in English]]  |  [[https://www.korpus.cz/kontext/wordlist/result?q=~ReKM6qg4Ic8W|list]]  |[[http://sgjp.pl/morfeusz/|Morfeusz]], [[https://github.com/kwrobel-nlp/krnnt|KRNNT]]  | ^ Polish |  ✔  |  ✔  |  [[http://nkjp.pl/poliqarp/help/ense2.html#x3-20002|in English]] and [[http://nkjp.pl/poliqarp/help/plse2.html#x3-20002|Polish]]  |  [[http://nlp.ipipan.waw.pl/%7Eadamp/Papers/2003-eacl-ws12/|in English]]  |  [[https://www.korpus.cz/kontext/wordlist/result?q=~ReKM6qg4Ic8W|list]]  |[[http://sgjp.pl/morfeusz/|Morfeusz]], [[https://github.com/kwrobel-nlp/krnnt|KRNNT]]  |
-^ Portuguese |  ✔  |  ✔  |  [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/Portuguese-Tagset.html|in Spanish]]  |      [[https://www.korpus.cz/kontext/wordlist/result?q=~saGaiAI0uEM|list]]  | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]]  |+^ Portuguese |  ✔  |  ✔  |  [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/Portuguese-Tagset.html|in Spanish]]  |      [[https://www.korpus.cz/kontext/wordlist/result?q=~saGaiAI0uEMo|list]]  | [[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]]  |
 ^ Russian |  ✔  |  ✔  |  [[http://corpus.leeds.ac.uk/mocky/ru-table.tab|in English]]  |  [[http://nl.ijs.si/ME/V4/msd/html/msd-ru.html|in English]] %%***%%)  |  [[https://www.korpus.cz/kontext/wordlist/result?q=~T2sc4y6Uw2WO|list]]  |[[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]]  | ^ Russian |  ✔  |  ✔  |  [[http://corpus.leeds.ac.uk/mocky/ru-table.tab|in English]]  |  [[http://nl.ijs.si/ME/V4/msd/html/msd-ru.html|in English]] %%***%%)  |  [[https://www.korpus.cz/kontext/wordlist/result?q=~T2sc4y6Uw2WO|list]]  |[[http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]]  |
 ^ Slovak |  ✔  |  ✔  |  [[http://korpus.sk/morpho.html/|in Slovak]] and [[https://korpus.sk/morpho_en.html/|English]]  |  [[https://korpus.sk/attachments/morpho_en/tagset-www.pdf|in Slovak]]  |  [[https://www.korpus.cz/kontext/wordlist/result?q=~qkQQs4cq2IyG|list]]  | [[http://conference.ui.sav.sk/wikt2010/papers/01_garabik_f.pdf|Radovan Garabík, Morče]]  | ^ Slovak |  ✔  |  ✔  |  [[http://korpus.sk/morpho.html/|in Slovak]] and [[https://korpus.sk/morpho_en.html/|English]]  |  [[https://korpus.sk/attachments/morpho_en/tagset-www.pdf|in Slovak]]  |  [[https://www.korpus.cz/kontext/wordlist/result?q=~qkQQs4cq2IyG|list]]  | [[http://conference.ui.sav.sk/wikt2010/papers/01_garabik_f.pdf|Radovan Garabík, Morče]]  |
Line 278: Line 278:
   * [[https://taku910.github.io/mecab/|MeCab]] and [[https://osdn.net/projects/unidic/|Unidic]] for Japanese (thanks to Adam Nohejl)   * [[https://taku910.github.io/mecab/|MeCab]] and [[https://osdn.net/projects/unidic/|Unidic]] for Japanese (thanks to Adam Nohejl)
   * [[https://www.sutd.edu.sg/cmsresource/faculty/yuezhang/zpar.html|ZPar]] for Chinese (thanks to Vlastimil Dobečka)   * [[https://www.sutd.edu.sg/cmsresource/faculty/yuezhang/zpar.html|ZPar]] for Chinese (thanks to Vlastimil Dobečka)
 +
 +==== Known bugs ====
 +  * For some Finnish, Polish and Slovak texts in the Core part the value of the attribute **doc.id** is not displayed. This occurs when doc.id should be shown as metadata in references, structures and in document-based statistics. As a workaround, please use the **text.id** attribute instead. In the collections (Subtitles, Acquis, etc.) the doc.id attribute is shown as expected.
  
 ===== How to cite ===== ===== How to cite =====
Line 291: Line 294:
 When citing a specific part of InterCorp please use the reference displayed in KonText in the corpus description, e.g. as: When citing a specific part of InterCorp please use the reference displayed in KonText in the corpus description, e.g. as:
  
-Rosen, A., Vavřín, M., Zasina, A. J. (2022). //The InterCorp Corpus – Czech((Insert languages actually used.)), version 15 of 11 November 2022//. Institute of the Czech National Corpus, Charles University, Prague 2022. Available on-line: https://kontext.korpus.cz/+Rosen, A., Vavřín, M., Zasina, A. J. (2022). //The InterCorp Corpus – Czech((Insert languages actually used.)), version 16 of 11 November 2022//. Institute of the Czech National Corpus, Charles University, Prague 2022. Available on-line: https://kontext.korpus.cz/
  
 </WRAP> </WRAP>
Line 298: Line 301:
  
 <WRAP round box 51%> <WRAP round box 51%>
-[[en:cnk:intercorp|InterCorp]] • [[en:cnk:intercorp:verze15|Version 15]] • [[en:cnk:intercorp:verze14|Version 14]] • [[en:cnk:intercorp:verze13ud|Version 13ud]] • [[en:cnk:intercorp:verze13|Version 13]] • [[en:cnk:intercorp:verze12|Version 12]] • [[en:cnk:intercorp:verze11|Version 11]]  • [[en:cnk:intercorp:verze10|Version 10]] • [[en:cnk:intercorp:verze9|Version 9]] • [[en:cnk:intercorp:verze8|Version 8]] • [[en:cnk:intercorp:verze7|Version 7]] • [[en:cnk:intercorp:verze6|Version 6]] • [[en:cnk:intercorp:verze5|Version 5]] • [[en:cnk:intercorp:verze4|Verze 4]] • [[en:cnk:intercorp:verze3|Version 3]] • [[en:cnk:intercorp:historie|Version history]]+[[en:cnk:intercorp|InterCorp]] • [[en:cnk:intercorp:verze16ud|Version 16ud]] • [[en:cnk:intercorp:verze15|Version 15]] • [[en:cnk:intercorp:verze14|Version 14]] • [[en:cnk:intercorp:verze13ud|Version 13ud]] • [[en:cnk:intercorp:verze13|Version 13]] • [[en:cnk:intercorp:verze12|Version 12]] • [[en:cnk:intercorp:verze11|Version 11]]  • [[en:cnk:intercorp:verze10|Version 10]] • [[en:cnk:intercorp:verze9|Version 9]] • [[en:cnk:intercorp:verze8|Version 8]] • [[en:cnk:intercorp:verze7|Version 7]] • [[en:cnk:intercorp:verze6|Version 6]] • [[en:cnk:intercorp:verze5|Version 5]] • [[en:cnk:intercorp:verze4|Verze 4]] • [[en:cnk:intercorp:verze3|Version 3]] • [[en:cnk:intercorp:historie|Version history]]
  
 See [[https://intercorp.korpus.cz/?lang=en|the original InterCorp site in English]]. See [[https://intercorp.korpus.cz/?lang=en|the original InterCorp site in English]].
 </WRAP> </WRAP>