The corpus consists of three volumes (1999, 2002 and 2003; not all of them complete) of the French regional newspaper L'Est Républicain. It was built from CNRTL data. After deduplication, version 2 contains almost 73 million words (version 1 had almost 120 million). The corpus is lemmatised and POS-tagged with TreeTagger.