| en:cnk:intercorp:verze16ud [2025/01/10 20:41] – [InterCorp Release 16ud – Universal Dependencies] alexandrrosen | en:cnk:intercorp:verze16ud [2025/05/11 14:35] (current) – [Texts in the corpus] alexandrrosen |
|---|
| ^ ::: ^ publication date | 2024 ^^^^ | ^ ::: ^ publication date | 2024 ^^^^ |
| ^ ::: ^ foreign languages | 61 ^^^^ | ^ ::: ^ foreign languages | 61 ^^^^ |
| ^ ::: ^ tagged languages | 49 ^^^^ | ^ ::: ^ tagged languages | 48 ^^^^ |
| ^ ::: ^ lemmatized languages | 49 ^^^^ | ^ ::: ^ lemmatized languages | 48 ^^^^ |
| ^ ::: ^ syntactically annotated languages| 49 ^^^^ | ^ ::: ^ syntactically annotated languages| 48 ^^^^ |
| |
| ===== Access to the texts ===== | ===== Access to the texts ===== |
| |
| * Political commentaries published by [[http://www.project-syndicate.org/|Project Syndicate]] (below referred to as **Syndicate**) and [[http://www.voxeurop.eu|VoxEurop]] (formerly **PressEurop**) | * Political commentaries published by [[http://www.project-syndicate.org/|Project Syndicate]] (below referred to as **Syndicate**) and [[http://www.voxeurop.eu|VoxEurop]] (formerly **PressEurop**) |
| * A package of legal texts of the European Union from the [[https://ec.europa.eu/jrc/en/language-technologies/jrc-acquis|Acquis Communautaire]] corpus (**Acquis**) | * A collection of legal texts of the European Union from the [[https://ec.europa.eu/jrc/en/language-technologies/jrc-acquis|Acquis Communautaire]] corpus (**Acquis**) |
| * Proceedings of the European Parliament dated 2007–2011 from the [[http://www.statmt.org/europarl/|Europarl]] corpus (**Europarl**) | * Proceedings of the European Parliament dated 2007–2011 from the [[http://www.statmt.org/europarl/|Europarl]] corpus (**Europarl**) |
| * Film subtitles from the [[http://www.opensubtitles.org/|Open Subtitles]] database (**Subtitles**) | * Film subtitles from the [[http://www.opensubtitles.org/|Open Subtitles]] database (**Subtitles**) |
| ^[[https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?iso_639_1=pt|pt]]| 107| 147 063| 46 510.1| 280 566.2| 355 121.8| | ^[[https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?iso_639_1=pt|pt]]| 107| 147 063| 46 510.1| 280 566.2| 355 121.8| |
| ^[[https://en.wikipedia.org/wiki/Romani_language|rn]]| 2| 2| 1.7| 13.6| 17.7| | ^[[https://en.wikipedia.org/wiki/Romani_language|rn]]| 2| 2| 1.7| 13.6| 17.7| |
| ^[[https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?iso_639_1=ru|ru]]| 55| 102 904| 39 561.2| 235 702.3| 295 301.3| | |
| ^[[https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?iso_639_1=ro|ro]]| 184| 32 839| 22 985.2| 122 130.4| 163 120.7| | ^[[https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?iso_639_1=ro|ro]]| 184| 32 839| 22 985.2| 122 130.4| 163 120.7| |
| | ^[[https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?iso_639_1=ru|ru]]| 55| 102 904| 39 561.2| 235 702.3| 295 301.3| |
| ^[[https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?iso_639_1=si|si]]| 1| 499| 522.5| 2 313.4| 3 021.8| | ^[[https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?iso_639_1=si|si]]| 1| 499| 522.5| 2 313.4| 3 021.8| |
| ^[[https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?iso_639_1=sk|sk]]| 170| 94 585| 10 080.0| 74 862.7| 95 881.0| | ^[[https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?iso_639_1=sk|sk]]| 170| 94 585| 10 080.0| 74 862.7| 95 881.0| |