Notice: This list is no longer kept up to date (last update: February 2015).
| Language | Ahmet Akker (tool) | Apertium (tool) | BTTtagger (tool) | COMPOST (tool) | Freeling (tool) | MYSTEM (tool) | NLTK (tool) | RDRPOSTagger (tool) | RFTagger (tool) | Stanford (tool) | Treetagger (tool) | Other |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Arabic | x | x | Madamira (web, tool) | |||||||||
| Asturian | x | x | ||||||||||
| Belarusian | x | |||||||||||
| Bengali | x | |||||||||||
| Bulgarian | x | x | x | DCL (tool) | ||||||||
| Catalan | x | x | x | |||||||||
| Chinese | x | |||||||||||
| Croatian | x | x | Nikola Ljubešić (tool) | |||||||||
| Czech | x | x | x | x | MorphoDiTa (tool) | |||||||
| Danish | x | x | CST (web, tool) | |||||||||
| Dutch | x | x | x | x | x | x | Brill-NL (web, tool), Frog (tool) | |||||
| English | x | x | x | x | x | x | MorphoDiTa (tool) | |||||
| Estonian | x | x | ||||||||||
| Finnish | x | OMorFi (tool) | ||||||||||
| French | x | x | x | x | x | x | x | |||||
| Galician | x | x | x | |||||||||
| German | x | x | x | x | x | |||||||
| Greek | ILSP (web) | |||||||||||
| Hebrew | x | MILA (tool) | ||||||||||
| Hindi | x | x | x | Siva Reddy (tool), Hindi Shallow Parser (web) | ||||||||
| Hungarian | x | x | hunpos (tool) | |||||||||
| Icelandic | x | x | IceStagger (tool) | |||||||||
| Indonesian | x | |||||||||||
| Italian | x | x | x | x | x | x | ||||||
| Japanese | x | mecab (tool) | ||||||||||
| Lao | x | |||||||||||
| Macedonian | x | x | ||||||||||
| Maltese | x | Maltese Language Resource Server (web) | ||||||||||
| Malay | ||||||||||||
| Marathi | x | |||||||||||
| Mongolian | x | |||||||||||
| Norwegian | x | obt (tool) | ||||||||||
| Persian | hazm (tool) | |||||||||||
| Polish | x | x | x | TaKIPI, Pantera, Concraft, WCRFT (tools) | ||||||||
| Portuguese | x | x | x | x | x | |||||||
| Romanian | x | x | RACAI (web) | |||||||||
| Russian | x | x | x | x | x | |||||||
| Serbian | x | x | Nikola Ljubešić (tool) | |||||||||
| Slovak | x | x | Morče (tool) | |||||||||
| Slovene | x | x | ToTaLe (tool) | |||||||||
| Spanish | x | x | x | x | x | x | ||||||
| Swahili | x | |||||||||||
| Swedish | x | x | x | Stagger (tool) | ||||||||
| Telugu | x | |||||||||||
| Thai | x | |||||||||||
| Turkish | x | ITU Turkish Natural Language Processing Pipeline (web), Trmorph (tool, MA) | ||||||||||
| Ukrainian | x | x | ugtag (tool) | |||||||||
| Vietnamese | x | vnTagger, Vietnamese Language and Speech Processing (VLSP) / VietTagger (tools) | ||||||||||
| Welsh | x | x |
Note: The list does not include tools without a disambiguation component, such as the morphological analyzers Ajka or Majka.
For additional resources, see the wiki of the Association for Computational Linguistics (List of resources by language) and the list of tools of varied coverage for more languages.
Tools currently used in InterCorp, the parallel section of the Czech National Corpus, are underlined.
— Alexandr Rosen & corpora@uib.no subscribers