en:kurz:hledani_v_paralelnim_korpusu [2019/12/19 23:43] – [Main differences in comparison with the Park interface] alexandrrosen | en:kurz:hledani_v_paralelnim_korpusu [2022/11/23 15:09] (current) – [Entering a query] alexandrrosen |
---|
| |
[[https://kontext.korpus.cz/first_form|KonText]] is an integrated interface for searching both monolingual and parallel corpora. After you enter your user ID and password, a page with a default corpus opens (e.g. **syn2015**). Clicking **All corpora** and then **InterCorp** displays a list of available languages. | [[https://kontext.korpus.cz/first_form|KonText]] is an integrated interface for searching both monolingual and parallel corpora. After you enter your user ID and password, a page with a default corpus opens (e.g. **syn2015**). Clicking **All corpora** and then **InterCorp** displays a list of available languages. |
| |
| For a detailed description of KonText and its features see the [[https://wiki.korpus.cz/doku.php/en:manualy:kontext:index|KonText interface manual]]. |
| This page gives only a few basic hints and specifics of using KonText to query InterCorp. |
| |
====Selecting languages==== | ====Selecting languages==== |
| |
Click on one of the languages, such as **InterCorp v9 Czech**, to choose the primary language for your search. The primary language requires a non-empty query, so its query box must be filled in. The order of the languages also matters when you wish to create a subcorpus (see below): the range of texts for a subcorpus can be specified only for the primary language. In other respects, the order of languages is irrelevant. | Click on one of the languages, such as **InterCorp v15 Czech**, to choose the primary language for your search. The primary language requires a non-empty query, so its query box must be filled in. The order of the languages also matters when you wish to create a subcorpus (see below): the range of texts for a subcorpus can be specified only for the primary language. In other respects, the order of languages is irrelevant. |
| |
After choosing the primary language a brief description of the selected part of the corpus appears in the page heading together with its size, measured in the number of tokens (so-called positions, i.e. word forms and punctuation symbols). To add an additional language choose the relevant corpus part within the frame **Aligned corpora** and then click on **Add**. For the additional language a query need not be entered. Tick **include empty lines** if you wish the result to include concordances that do not have an equivalent in the given language. More languages can be added in a similar way. Searching one part of the parallel corpus only, i.e. within a single language, is also possible. If so, do not add other languages and proceed to selecting the type of query and specifying the query itself. | After choosing the primary language a brief description of the selected part of the corpus appears in the page heading together with its size, measured in the number of tokens (so-called positions, i.e. word forms and punctuation symbols). To add an additional language choose the relevant corpus part within the frame **Aligned corpora** and then click on **Add**. For the additional language a query need not be entered. Tick **include empty lines** if you wish the result to include concordances that do not have an equivalent in the given language. More languages can be added in a similar way. Searching one part of the parallel corpus only, i.e. within a single language, is also possible. If so, do not add other languages and proceed to selecting the type of query and specifying the query itself. |
====Entering a query==== | ====Entering a query==== |
| |
You can choose from six **Query Types** (see below). All types of queries except **Basic** are case-sensitive and can handle regular expressions. For the query type **Word Form** the default is case-insensitive, but **Match case** can be turned on. For the second and any further languages you can also specify whether the concordances should or should not include the terms specified in the query box. | You can switch between the simple and **Advanced Query** options. In the Advanced Query option you can use the [[en:pojmy:cql|Corpus Query Language]]. Using the **CQL** language you can search for one or more word forms according to the given expression. When entering morphological tags for Czech, you may find the helper option **insert tag** useful: it lets you enter codes at the appropriate position of the tag using a menu of attributes and their corresponding values. All languages include the **insert "within"** option, which helps to filter the query results according to metadata, i.e. bibliographic and other data relating to the texts. For a list of attributes and their values, see [[http://ucnk.ff.cuni.cz/intercorp/?req=page:metadata&lang=en|here]]. The 'attribute="value"' pairs can be combined using the operator & (logical conjunction). The whole "within" condition must be placed at the end of a query, following expressions specifying one or more positions (in brackets). A single query can include multiple "within" conditions. The following two example queries produce identical results, namely sentences including nouns in the vocative case in original Czech dramas: |
| |
* **Basic** - searches for the given word form, case-insensitively; if the given form is also a basic dictionary form ([[en:pojmy:lemma|lemma]]), it also searches for all of its inflected forms | |
* **Lemma** - searches for all forms of the given lemma | |
* **Phrase** - searches for the given sequence of word forms | |
* **Word form** - searches for the given word form | |
* **Character** - searches for word forms containing the given sequence of characters | |
* **CQL** - searches for one or more word forms according to the given expression in the [[https://www.sketchengi.co.uk/corpus-querying/|**CQL**]] query language. When entering morphological tags for Czech, you may find the helper option **insert tag** useful: it lets you enter codes at the appropriate position of the tag using a menu of attributes and their corresponding values. All languages include the **insert "within"** option, which helps to filter the query results according to metadata, i.e. bibliographic and other data relating to the texts. For a list of attributes and their values, see [[http://ucnk.ff.cuni.cz/intercorp/?req=page:metadata&lang=en|here]]. The 'attribute="value"' pairs can be combined using the operator & (logical conjunction). The whole "within" condition must be placed at the end of a query, following expressions specifying one or more positions (in brackets). A single query can include multiple "within" conditions. The following two example queries produce identical results, namely sentences including nouns in the vocative case in original Czech dramas: | |
| |
<code> | <code> |