Differences
This shows you the differences between two versions of the page.
en:manualy:treq [2017/05/19 11:18] – [Alignment principle] michalskrabal | en:manualy:treq [2022/12/30 17:41] (current) – capka | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Treq ====== | ====== Treq ====== | ||
- | {{ : | + | {{ : |
- | The [[http:// | + | The [[http:// |
- | Treq is an online application (the only thing we need to use it is a web browser) and it is accessible without [[en: | + | Treq is an online application (the only thing we need to use it is a web browser) and it is accessible without [[en: |
To use Treq, start by specifying the desired language pair by selecting source language (the language of the query) and target language (the language of the potential equivalents). The query can be entered either as a specific word form, as a lemma (// | To use Treq, start by specifying the desired language pair by selecting source language (the language of the query) and target language (the language of the potential equivalents). The query can be entered either as a specific word form, as a lemma (// | ||
Line 19: | Line 19: | ||
{{: | {{: | ||
- | i.e. the first word in the source language (0) corresponds to the first word in the target language (0), the second word (1) corresponds to the third one (2) etc. Starting with release 2.0, apart from this simple alignment method the // | + | that is, the first word in the source language (0) corresponds to the first word in the target language (0), the second word (1) corresponds to the third one (2) etc. Starting with release 2.0, apart from this simple alignment method the // |
{{: | {{: | ||
Line 27: | Line 27: | ||
(Note the difference: the first word in the target language (0) now corresponds not only to the first (0), but also the second (1) word in the source language.) | (Note the difference: the first word in the target language (0) now corresponds not only to the first (0), but also the second (1) word in the source language.) | ||
- | From such an alignment we choose, using a simple script, the largest possible number of combinations of words that this alignment allows. In both cases, the aligned pairs of (multiple) words are then sorted and summarized. The result of this automatic excerption is not revised in any way. However, the relative frequency of the corresponding pairs may serve as an indicator of the relevance of the equivalents. The more often the equivalent of the word or multi-word unit occurs in comparison with other equivalents, | + | From such an alignment we choose |
- | The table below indicates in what proportion the frequencies found in the KonText are with those displayed by Treq. It also specifies the different data types at each stage of their processing for Treq, considering the IC v9 English component (multi-word variant). | + | The table below indicates in what proportion the frequencies found in KonText are with those displayed by Treq. It also specifies the different data types at each stage of their processing for Treq, considering the IC v9 English component (multi-word variant). |
{{: | {{: | ||
- | Step by step, you can see the gradual loss of data that is used in the resulting dictionary. In the first step, we only use 1:1 sentence alignment - thus 20.7% of sentences are lost. Subsequently, | + | Step by step, you can see the gradual loss of data that is used in the resulting dictionary. In the first step, we only use a 1:1 sentence alignment |
//a – and// | //a – and// | ||
Line 63: | Line 63: | ||
//. – .// | //. – .// | ||
- | In the third step, lines that are the same on both sides of the alignment are added together throughout the text. This will give us the list and the frequency of the equivalents. Finally, in the final step, we exclude all the counterparts containing the punctuation in order to get the final version of the dictionary. For all language pairs where the lemmatization is available on both sides of the alignment, we apply the same procedure to the lemmatized form of data (//na počátek být stvořit vesmír . – in the beginning the universe be create .//). | + | In the third step, lines that are the same on both sides of the alignment are added together throughout the text. This will give us the list and the frequency of the equivalents. Finally, in the last step, we exclude all the counterparts containing the punctuation in order to get the final version of the dictionary. For all language pairs where the lemmatization is available on both sides of the alignment, we apply the same procedure to the lemmatized form of data (//na počátek být stvořit vesmír . – in the beginning the universe be create .//). |
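The counting steps described above (turning word-alignment links into word pairs, adding identical pairs together, and excluding pairs that contain punctuation) can be illustrated with a minimal Python sketch. It is not the script actually used for Treq. The function names, the `i-j` link notation, and the rule that a token counts as punctuation when it consists only of ASCII punctuation characters are all assumptions made for this example, and the sample sentences and alignments are invented.

```python
from collections import Counter
import string


def extract_pairs(source_tokens, target_tokens, alignment):
    """Yield one (source word, target word) pair per 'i-j' alignment link.

    The 'i-j' notation (source index, target index) is an assumption here.
    """
    for link in alignment.split():
        i, j = map(int, link.split("-"))
        yield source_tokens[i], target_tokens[j]


def is_punctuation(token):
    """Assumed rule: a token made up only of ASCII punctuation characters."""
    return all(ch in string.punctuation for ch in token)


def build_dictionary(aligned_sentences):
    """Sum identical pairs over the whole text, then drop punctuation pairs."""
    counts = Counter()
    for source, target, alignment in aligned_sentences:
        counts.update(extract_pairs(source.split(), target.split(), alignment))
    return Counter({
        pair: n for pair, n in counts.items()
        if not any(is_punctuation(token) for token in pair)
    })


def relative_frequencies(dictionary, source_word):
    """Share of each equivalent among all equivalents of source_word."""
    matches = {t: n for (s, t), n in dictionary.items() if s == source_word}
    total = sum(matches.values())
    return {t: n / total for t, n in matches.items()}


# Invented toy data: (source sentence, target sentence, alignment links).
sentences = [
    ("a .", "and .", "0-0 1-1"),
    ("a", "and", "0-0"),
    ("a", "but", "0-0"),
]
dictionary = build_dictionary(sentences)
shares = relative_frequencies(dictionary, "a")
```

With this toy input, the pair //a – and// is counted twice, //a – but// once, and the pair //. – .// is dropped, so the relative frequencies in `shares` reflect only the remaining word pairs.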
===== Application pictures ===== | ===== Application pictures ===== | ||
Line 70: | Line 70: | ||
[{{: | [{{: | ||
[{{: | [{{: | ||
+ | |||
+ | ===== How to cite Treq ===== | ||
+ | |||
+ | <WRAP round tip 80%> | ||
+ | Vavřín, M. – Rosen, A.: Treq. FF UK. Praha 2015. Available on WWW: < | ||
+ | |||
+ | Škrabal, M. – Vavřín, M. (2017): Databáze překladových ekvivalentů Treq. //Časopis pro moderní filologii// 99 (2), s. 245–260. | ||
+ | </ | ||
==== Related links ==== | ==== Related links ==== |