This is an older version of the document!


Taggers & lemmatizers

Language | Ahmet Akker (tool) | BTTtagger (tool) | COMPOST (tool) | Freeling (tool) | MYSTEM (tool) | NLTK (tool) | RDRPOSTagger (tool) | RFTagger (tool) | Stanford (tool) | Treetagger (tool) | Other
Arabic x Madamira (web service, tool)
Asturian x
Belarusian x
Bengali x
Bulgarian x x DCL (tool)
Catalan x x
Chinese x
Croatian x Nikola Ljubešić (tool)
Czech x x x x MorphoDiTa (tool)
Danish x CST (web, tool)
Dutch x x x x x Brill-NL (web, tool)
English x x x x x MorphoDiTa (tool)
Estonian x x
Finnish x OMorFi (tool)
French x x x x x
Galician x x
German x x x x
Greek ILSP (web)
Hebrew MILA (tool)
Hindi x x Siva Reddy (tool), Hindi Shallow Parser
Hungarian x x hunpos (tool)
Icelandic x IceStagger (tool)
Indonesian x
Italian x x x x x
Japanese x mecab (tool)
Lao x
Macedonian x
Maltese Maltese Language Resource Server (web)
Malay
Marathi x
Mongolian x
Norwegian obt (tool)
Persian hazm (tool)
Polish x x x TaKIPI, Pantera, Concraft, WCRFT (tools)
Portuguese x x x x
Romanian x RACAI (web)
Russian x x x x
Serbian x Nikola Ljubešić
Slovak x x Morče (tool)
Slovene x x ToTaLe (tool)
Spanish x x x x x
Swahili x
Swedish x x Stagger (tool)
Telugu x
Thai x
Turkish ITU Turkish Natural Language Processing Pipeline (web), Trmorph (tool, MA)
Ukrainian x ugtag (tool)
Vietnamese x vnTagger (tool), Vietnamese Language and Speech Processing (VLSP) / VietTagger
Welsh x

For additional resources, see the Wiki of the Association for Computational Linguistics: List of resources by language.

Tools of varied coverage for more languages may be found at https://languagetool.org.

The list does not include tools without a disambiguation component, such as morphological analyzers Ajka or Majka.

Tools currently used in InterCorp, the parallel section of the Czech National Corpus, are underlined.

Alexandr Rosen & corpora@uib.no subscribers