seznamy:tagery [2015/01/31 15:48] – alexandrrosen | seznamy:tagery [2022/09/29 14:17] (current) – jankrivan |
---|
^ ^[[ http://staffwww.dcs.shef.ac.uk/people/A.Aker/activityNLPProjects.html|Ahmet Aker]] (tool)^[[ http://clcl.unige.ch/btag/|BTagger]] (tool)^[[ http://ufal.mff.cuni.cz/compost/|COMPOST]] (tool)^[[http://nlp.lsi.upc.edu/freeling/|FreeLing]] (tool)^[[ https://tech.yandex.ru/mystem/|MYSTEM]] (tool)^[[ http://www.nltk.org|NLTK]] (tool)^[[ http://rdrpostagger.sourceforge.net|RDRPOSTagger]] (tool)^[[ http://www.cis.uni-muenchen.de/~schmid/tools/RFTagger/|RFTagger]] (tool)^[[ http://nlp.stanford.edu/software/tagger.shtml|Stanford]] (tool)^[[ http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]] (tool)^Other ^ | ====== Taggers & lemmatizers ====== |
^Arabic| | | | | | | | | x | |[[ http://nlp.ldeo.columbia.edu/madamira/|Madamira]] (web service, tool)| | |
^Asturian| | | | x | | | | | | | | | |
^Belarusian| | | | | x | | | | | | | | |
^Bengali| | | | | | x | | | | | | | |
^Bulgarian| | | | | | | x | | | __x__ |[[ http://dcl.bas.bg/dclservices/registration/index.php|DCL]] (tool)| | |
^Catalan| | | | x | | x | | | | | | | |
^Chinese| | | | | | | | | x | | | | |
^Croatian| | x | | | | | | | | |[[ http://nlp.ffzg.hr/resources/models/tagging/|Nikola Ljubešić]] (tool)| | |
^Czech| | x | x | | | | x | x | | |[[ http://ufal.mff.cuni.cz/morphodita|MorphoDiTa]] (tool)| | |
^Danish| | | | | | | x | | | |[[ https://mlnl.net/jg/software/bnl/|CST]] (web, tool)| | |
^Dutch| x | | x | | | x | x | | | __x__ |[[ https://mlnl.net/jg/software/bnl/|Brill-NL]] (web, tool)| | |
^English| x | x | x | x | | | | | | __x__ |[[ http://ufal.mff.cuni.cz/morphodita|MorphoDiTa]] (tool)| | |
^Estonian| | x | | | | | | | | __x__ | | | |
^Finnish| | | | | | | | | | x |[[ https://github.com/TurkuNLP/Finnish-dep-parser|OMorFi]] (tool)| | |
^French| x | x | | x | | | x | | | __x__ | | | |
^Galician| | | | x | | | | | | x | | | |
^German| x | | | | | | x | x | | __x__ | | | |
^Greek| | | | | | | | | | |[[ http://nlp.ilsp.gr/ws/|ILSP]] (web)| | |
^Hebrew| | | | | | | | | | |[[ http://www.mila.cs.technion.ac.il|MILA]] (tool)| | |
^Hindi| | | | | | x | x | | | |[[ http://sivareddy.in/downloads#indian_language_tools|Siva Reddy]] (tool), [[ http://ltrc.iiit.ac.in/analyzer/hindi/|Hindi Shallow Parser]] | | |
^Hungarian| | x | | | | | | x | | |__[[ http://code.google.com/p/hunpos/|hunpos]]__ (tool)| | |
^Icelandic| | | x | | | | | | | |__[[ http://www.ling.su.se/english/nlp/tools/stagger/stagger-the-stockholm-tagger-1.98986|IceStagger]]__ (tool)| | |
^Indonesian| | | | | | x | | | | | | | |
^Italian| x | x | | x | | | x | | | __x__ | | | |
^Japanese| | | | | | x | | | | |[[ https://code.google.com/p/mecab/|mecab]] (tool)| | |
^Lao| | | | | | | x | | | | | | |
^Macedonian| | x | | | | | | | | | | | |
^Maltese| | | | | | | | | | |[[ http://mlrs.research.um.edu.mt|Maltese Language Resource Server]] (web)| | |
^Malay| | | | | | | | | | | | | |
^Marathi| | | | | | x | | | | | | | |
^Mongolian| | | | | | | | | | x | | | |
^Norwegian| | | | | | | | | | |__[[ http://tekstlab.uio.no/obt-ny/index.html|obt]]__ (tool)| | |
^Persian| | | | | | | | | | |[[ https://mlnl.net/jg/software/bnl/|hazm]] (tool)| | |
^Polish| | x | | | | x | | | | x |__[[ http://nlp.pwr.wroc.pl/takipi/|TaKIPI]]__ (tool), [[ http://zil.ipipan.waw.pl/PANTERA|Pantera]] (tool)| | |
^Portuguese| | | | x | | x | x | | | __x__ | | | |
^Romanian| | x | | | | | | | | |[[ http://www.racai.ro/tools/text/|RACAI]] (web)| | |
^Russian| | | | x | x | | | x | | __x__ | | | |
^Serbian| | x | | | | | | | | |[[ http://nlp.ffzg.hr/resources/models/tagging/|Nikola Ljubešić]]| | |
^Slovak| | | | | | | | x | | x |__[[ http://ufal.mff.cuni.cz/morce/index.php|Morče]]__ (tool)| | |
^Slovene| | x | | | | | | x | | |__[[ http://nl.ijs.si/analyse/|ToTaLe]]__ (tool)| | |
^Spanish| x | | | x | | x | x | | | __x__ | | | |
^Swahili| | | | | | | | | | x | | | |
^Swedish| | | x | | | | x | | | |__[[ http://www.ling.su.se/english/nlp/tools/stagger/stagger-the-stockholm-tagger-1.98986|Stagger]]__ (tool)| | |
^Telugu| | | | | | x | | | | | | | |
^Thai| | | | | | | x | | | | | | |
^Turkish| | | | | | | | | | |[[ http://tools.nlp.itu.edu.tr|ITU Turkish Natural Language Processing Pipeline]] (web), [[ http://coltekin.net/cagri/trmorph/|Trmorph]] (tool, MA)| | |
^Ukrainian| | | | | x | | | | | |[[ http://ugtag.sourceforge.net|ugtag]] (tool)| | |
^Vietnamese| | | | | | | x | | | |[[ http://mim.hus.vnu.edu.vn/phuonglh/softwares|vnTagger]] (tool), [[ http://vlsp.vietlp.org:8080/demo/?&lang=en|Vietnamese Language and Speech Processing (VLSP) / VietTagger]]| | |
^Welsh| | | | x | | | | | | | | | |
| |
| Notice: The list is not kept up to date (last update 2/2015). |
| |
N.B.: | ^ ^[[ http://staffwww.dcs.shef.ac.uk/people/A.Aker/activityNLPProjects.html|Ahmet Aker]] (tool)^[[ https://svn.code.sf.net/p/apertium/svn/languages/|Apertium]] (tool)^[[ http://clcl.unige.ch/btag/|BTagger]] (tool)^[[ http://ufal.mff.cuni.cz/compost/|COMPOST]] (tool)^[[http://nlp.lsi.upc.edu/freeling/|FreeLing]] (tool)^[[ https://tech.yandex.ru/mystem/|MYSTEM]] (tool)^[[ http://www.nltk.org|NLTK]] (tool)^[[ http://rdrpostagger.sourceforge.net|RDRPOSTagger]] (tool)^[[ http://www.cis.uni-muenchen.de/~schmid/tools/RFTagger/|RFTagger]] (tool)^[[ http://nlp.stanford.edu/software/tagger.shtml|Stanford]] (tool)^[[ http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]] (tool)^Other ^ |
| ^Arabic| | x | | | | | | | | x | |[[ http://nlp.ldeo.columbia.edu/madamira/|Madamira]] (web, tool)| |
| ^Asturian| | x | | | x | | | | | | | | |
| ^Belarusian| | | | | | x | | | | | | | |
| ^Bengali| | | | | | | x | | | | | | |
| ^Bulgarian| | x | | | | | | x | | | __x__ |[[ http://dcl.bas.bg/dclservices/registration/index.php|DCL]] (tool)| |
| ^Catalan| | x | | | x | | x | | | | | | |
| ^Chinese| | | | | | | | | | x | | | |
| ^Croatian| | x | x | | | | | | | | |[[ http://nlp.ffzg.hr/resources/models/tagging/|Nikola Ljubešić]] (tool)| |
| ^Czech| | | x | x | | | | x | x | | |[[ http://ufal.mff.cuni.cz/morphodita|MorphoDiTa]] (tool)| |
| ^Danish| | x | | | | | | x | | | |[[ https://mlnl.net/jg/software/bnl/|CST]] (web, tool)| |
| ^Dutch| x | x | | x | | | x | x | | | __x__ |[[ https://mlnl.net/jg/software/bnl/|Brill-NL]] (web, tool), [[http://ilk.uvt.nl/frog/|Frog]] (tool)| |
| ^English| x | | x | x | x | | | | | x | __x__ |[[ http://ufal.mff.cuni.cz/morphodita|MorphoDiTa]] (tool)| |
| ^Estonian| | | x | | | | | | | | __x__ | | |
| ^Finnish| | | | | | | | | | | x |[[ https://github.com/TurkuNLP/Finnish-dep-parser|OMorFi]] (tool)| |
| ^French| x | x | x | | x | | | x | | x | __x__ | | |
| ^Galician| | x | | | x | | | | | | x | | |
| ^German| x | | | | | | | x | __x__ | x | x | | |
| ^Greek| | | | | | | | | | | |[[ http://nlp.ilsp.gr/ws/|ILSP]] (web)| |
| ^Hebrew| | x | | | | | | | | | |[[ http://www.mila.cs.technion.ac.il|MILA]] (tool)| |
| ^Hindi| | x | | | | | x | x | | | |[[ http://sivareddy.in/downloads#indian_language_tools|Siva Reddy]] (tool), [[ http://ltrc.iiit.ac.in/analyzer/hindi/|Hindi Shallow Parser]] (web)| |
| ^Hungarian| | | x | | | | | | x | | |__[[ http://code.google.com/p/hunpos/|hunpos]]__ (tool)| |
| ^Icelandic| | x | | x | | | | | | | |__[[ http://www.ling.su.se/english/nlp/tools/stagger/stagger-the-stockholm-tagger-1.98986|IceStagger]]__ (tool)| |
| ^Indonesian| | | | | | | x | | | | | | |
| ^Italian| x | x | x | | x | | | x | | | __x__ | | |
| ^Japanese| | | | | | | x | | | | |[[ https://code.google.com/p/mecab/|mecab]] (tool)| |
| ^Lao| | | | | | | | x | | | | | |
| ^Macedonian| | x | x | | | | | | | | | | |
| ^Maltese| | x | | | | | | | | | |[[ http://mlrs.research.um.edu.mt|Maltese Language Resource Server]] (web)| |
| ^Malay| | | | | | | | | | | | | |
| ^Marathi| | | | | | | x | | | | | | |
| ^Mongolian| | | | | | | | | | | x | | |
| ^Norwegian| | x | | | | | | | | | |__[[ http://tekstlab.uio.no/obt-ny/index.html|obt]]__ (tool)| |
| ^Persian| | | | | | | | | | | |[[ https://mlnl.net/jg/software/bnl/|hazm]] (tool)| |
| ^Polish| | | x | | | | x | | | | x |__[[ http://nlp.pwr.wroc.pl/takipi/|TaKIPI]]__, [[ http://zil.ipipan.waw.pl/PANTERA|Pantera]], [[http://zil.ipipan.waw.pl/Concraft|Concraft]], [[http://nlp.pwr.wroc.pl/redmine/projects/wcrft/wiki|WCRFT]] (tools)((For more tools see [[http://clip.ipipan.waw.pl/LRT|Language Tools and Resources for Polish]].)) | |
| ^Portuguese| | x | | | x | | x | x | | | __x__ | | |
| ^Romanian| | x | x | | | | | | | | |[[ http://www.racai.ro/tools/text/|RACAI]] (web)| |
| ^Russian| | x | | | x | x | | | x | | __x__ | | |
| ^Serbian| | x | x | | | | | | | | |[[ http://nlp.ffzg.hr/resources/models/tagging/|Nikola Ljubešić]] (tool)| |
| ^Slovak| | | | | | | | | x | | x |__[[ http://ufal.mff.cuni.cz/morce/index.php|Morče]]__ (tool)| |
| ^Slovene| | | x | | | | | | x | | |__[[ http://nl.ijs.si/analyse/|ToTaLe]]__ (tool)| |
| ^Spanish| x | x | | | x | | x | x | | | __x__ | | |
| ^Swahili| | | | | | | | | | | x | | |
| ^Swedish| | x | | x | | | | x | | | |__[[ http://www.ling.su.se/english/nlp/tools/stagger/stagger-the-stockholm-tagger-1.98986|Stagger]]__ (tool)| |
| ^Telugu| | | | | | | x | | | | | | |
| ^Thai| | | | | | | | x | | | | | |
| ^Turkish| | x | | | | | | | | | |[[ http://tools.nlp.itu.edu.tr|ITU Turkish Natural Language Processing Pipeline]] (web), [[ http://coltekin.net/cagri/trmorph/|Trmorph]] (tool, MA)| |
| ^Ukrainian| | x | | | | x | | | | | |[[ http://ugtag.sourceforge.net|ugtag]] (tool)| |
| ^Vietnamese| | | | | | | | x | | | |[[ http://mim.hus.vnu.edu.vn/phuonglh/softwares|vnTagger]], [[ http://vlsp.vietlp.org:8080/demo/?&lang=en|Vietnamese Language and Speech Processing (VLSP) / VietTagger]] (tools)| |
| ^Welsh| | x | | | x | | | | | | | | |
| |
For additional resources, see the wiki of the Association for Computational Linguistics: [[http://www.aclweb.org/aclwiki/index.php?title=List_of_resources_by_language|List of resources by language]]. | |
| |
Tools of varied coverage for more languages may be found at [[https://languagetool.org]]. | <fs x-small>Note: The list does not include tools without a disambiguation component, such as morphological analyzers [[http://nlp.fi.muni.cz/projekty/ajka/ajkacz.htm|Ajka]] or [[http://nlp.fi.muni.cz/czech-morphology-analyser/|Majka]].</fs> |
| |
The list does not include tools without a disambiguation component, such as morphological analyzers [[http://nlp.fi.muni.cz/projekty/ajka/ajkacz.htm|Ajka]] or [[http://nlp.fi.muni.cz/czech-morphology-analyser/|Majka]]. | |
| |
Tools currently used in [[ http://ucnk.ff.cuni.cz/intercorp/?req=page:info|InterCorp ]], the parallel section of the Czech National Corpus, are underlined. | <WRAP round info 75%> |
| For additional resources, see the wiki of the Association for Computational Linguistics ([[http://www.aclweb.org/aclwiki/index.php?title=List_of_resources_by_language|List of resources by language]]) and the [[https://languagetool.org|list]] of tools with varied coverage for more languages. |
| |
| Tools currently used in [[cnk:intercorp|InterCorp]], the parallel section of the Czech National Corpus, are underlined. |
| </WRAP> |
| |
| --- //Alexandr Rosen & corpora@uib.no subscribers// |
| |
| |