
Differences

This shows you the differences between two versions of the page.

en:cnk:lestrepublicain [2015/10/24 11:32] – created vaclavhorky
en:cnk:lestrepublicain [2024/08/04 09:20] (current) – [Corpus lEstRepublicain] michalkren
Line 2:
 ====== Corpus lEstRepublicain ======
  
-Corpus consists of 3 volumes (1999, 2002, 2003; not all of them complete) of French regional newspaper L'Est Républicain. It contains almost 120 million words and it was built from [[http://www.cnrtl.fr/corpus/estrepublicain/|CNRTL data]]. The corpus is lemmatised and POS-tagged by [[http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/|TreeTagger]].
+Corpus consists of 3 volumes (1999, 2002, 2003; not all of them complete) of French regional newspaper L'Est Républicain. After the deduplication it contains almost 73 million words in version 2 (v1 had almost 120 million words) and it was built from [[http://www.cnrtl.fr/corpus/estrepublicain/|CNRTL data]]. The corpus is lemmatised and POS-tagged by [[http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/|TreeTagger]].
  
-For technical reasons, corpus lEstRepublicain is not included in the standard corpus list for Bonito 1; it is only available via the web interface.