This is an older version of the document!


frWaC:

A 1.6-billion-word corpus constructed from the Web by limiting the crawl to the .fr domain and using medium-frequency words from the Le Monde Diplomatique corpus and basic French vocabulary lists as seeds. The corpus was POS-tagged and lemmatized with TreeTagger. More information is available at the link below.

http://wacky.sslmit.unibo.it/doku.php?id=corpora