Differences

This shows you the differences between two versions of the page.

en:cnk:intercorp:verze16ud [2024/09/24 09:11] – [Texts in the corpus] alexandrrosen
en:cnk:intercorp:verze16ud [2024/09/24 09:14] (current) – [Number of texts in the Core] alexandrrosen
Line 77: Line 77:
   * Political commentaries published by [[http://www.project-syndicate.org/|Project Syndicate]] (below referred to as **Syndicate**) and [[http://www.voxeurop.eu|VoxEurop]] (formerly **PressEurop**)
   * A package of legal texts of the European Union from the [[https://ec.europa.eu/jrc/en/language-technologies/jrc-acquis|Acquis Communautaire]] corpus (**Acquis**)
-  * Proceedings of the European Parliament dated 2007–2011 from the [[http://www.statmt.org/europarl/|**Europarl**]] corpus
+  * Proceedings of the European Parliament dated 2007–2011 from the [[http://www.statmt.org/europarl/|Europarl]] corpus (**Europarl**)
-  * Film subtitles from the [[http://www.opensubtitles.org/|Open **Subtitles**]] database
+  * Film subtitles from the [[http://www.opensubtitles.org/|Open Subtitles]] database (**Subtitles**)
   * Translations of the **Bible**
  
Line 134: Line 134:
  
  
-In the tables below, the Core part of the corpus is split according to the text type into fiction, non-fiction, and "misc" (for "miscellaneous", such as drama, poetry or children's literature).
+In the tables below, the Core part of the corpus is split according to the text type into fiction (**Core-fiction**), non-fiction (**Core-nonfiction**), and miscellaneous (**Core-misc**, including drama, poetry or children's literature).
  
 ==== Corpus size by collection ====