====== Park – the InterCorp User Interface: HOWTO in brief ======

InterCorp is accessible with the same login as the other CNK corpora. You can get a login for free by registering on the [[http://korpus.cz/english/prohlaseni-aj.php|Registration]] page or by using the top right link on the login page.

After you enter your user ID and password, a page opens with a link to enter a new query and a switch between the current and the previous version of the corpus. If you return to this page later by clicking **Home** in the navigation menu, a list of still active queries is shown. Clicking on a query recalls it.

===== Restricting the search scope: =====

Clicking on **New query** opens a page with a list of currently available languages and texts. First, left-click the check boxes next to at least two languages.

If you wish to search all texts in the core while ignoring both the collections of automatically processed texts and the option to select specific texts to be searched, you can proceed straight to the specification of your query.

If you wish to search the collections in addition to the core, you can specify **Include** for each individual collection, as long as the collection is available for your choice of languages. The collections are added to the whole core and you can proceed to entering the query.

If you wish to restrict the set of searched texts by selection criteria applied to known parameters, you can use a filter in the following way:

  * First make sure that you have selected the languages you wish to search.
  * Then choose the language to which the filter should be applied. (This is done by a left-click directly on the name of that language. The language should be in dark grey. If you choose, e.g., English as the language of the filter, the list of texts will show information about the English versions of the texts. If you then filter by, e.g., publication year until 2001, the filter is based on the year of publication of the English version of the text.)
  * Then select (check) or unselect text types, depending on known bibliographical data.
  * As a next step you can apply the filter to the collections by selecting the option **Apply filter**. (If you select the option **Include**, the filter is applied only to texts from the core, and the collection is included as a whole, disregarding the filter. If you choose **Exclude**, the collection is not searched, no matter how the filter is set up.)
  * Finally you can fine-tune the selection specified by the filter. Clicking on **Manual text selection** brings up a list of specific texts.
  * After you select **Filter texts**, only the texts which satisfy the filter set above are checked. You can still adjust this selection to suit your needs.

The steps must be followed in this sequence because a change in any preceding step usually resets the settings of all subsequent steps.
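
The way the filter interacts with the three collection options (**Include**, **Apply filter**, **Exclude**) can be summarised in a short Python sketch. It only illustrates the rules described above and is not the code behind the interface: the names ''Text'' and ''searched_texts'', the sample texts and the collection labels are all hypothetical, and treating "until 2001" as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Text:
    title: str
    source: str  # "core" or a collection label (hypothetical labels)
    year: int    # publication year of the version in the filter language


def searched_texts(
    texts: List[Text],
    collection_modes: Dict[str, str],
    text_filter: Callable[[Text], bool],
) -> List[str]:
    """Return the titles of the texts that would be searched.

    collection_modes maps a collection label to "include",
    "apply_filter" or "exclude". Assumption: a collection with
    no mode given is treated as excluded.
    """
    selected = []
    for text in texts:
        if text.source == "core":
            keep = text_filter(text)       # the core is always filtered
        else:
            mode = collection_modes.get(text.source, "exclude")
            if mode == "include":
                keep = True                # whole collection, filter ignored
            elif mode == "apply_filter":
                keep = text_filter(text)   # collection is filtered like the core
            else:
                keep = False               # collection is not searched
        if keep:
            selected.append(text.title)
    return selected


# Hypothetical sample data.
texts = [
    Text("core_old", "core", 1995),
    Text("core_new", "core", 2008),
    Text("a_old", "collection_a", 1999),
    Text("a_new", "collection_a", 2010),
]


def until_2001(text: Text) -> bool:
    return text.year <= 2001


included = searched_texts(texts, {"collection_a": "include"}, until_2001)
filtered = searched_texts(texts, {"collection_a": "apply_filter"}, until_2001)
excluded = searched_texts(texts, {"collection_a": "exclude"}, until_2001)
print(included, filtered, excluded)
```

With **Include**, both texts of the hypothetical collection are searched although one of them was published after 2001; with **Apply filter**, only the older one remains; with **Exclude**, only the filtered core is searched.
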
===== Making a query: =====

  * Searching in one or more languages in parallel
  * Searching by wordform
  * Searching by string of wordforms (**a phrase**)
  * Searching by CQL expression
  * Searching by lemma (base form) - for some languages
  * Searching by a [[en:cnk:intercorp#morphosyntactic_annotation|morphosyntactic tag]] (the CQL format must be used) - for some languages
  * [[http://ucnk.ff.cuni.cz/bonito/regular.php|Regular expressions]] as an option
  * Virtual keyboard to type in foreign characters

===== Displaying parallel concordances: =====

  * Structural tags (**Concordance/Show options/Structures**)
  * Bibliographical data and concordance ID (**Concordance/Show options/References**)
  * Lemma and/or morphosyntactic tag for keyword or all displayed words - for some languages (**Concordance/Show options/Attributes**)
  * Filtering of results based on the presence or absence of specified expressions in the context (**Concordance/Show filter**)
  * Displaying whole sentences (**Segment**) or lines with the keyword in the middle (**Kwic**)
  * An option to display more languages side by side or on top of each other (**View: vertical/horizontal**)
  * An option to display wider context (**Show context**)
  * Export of concordances as a table (**Export: xls1, xls2**)

===== An option to go back to previous queries and results =====

  * Click on **Home** on the top navigation bar

===== Some issues related to tokenization and morphosyntactic tags: =====

Straightforward queries for **contracted forms** in tagged or [[en:pojmy:lemma|lemmatized]] texts may fail. This concerns forms such as //can't// or //I'm//, which are split by the tagger into two parts (//ca//+//n't// and //I//+//'m//) with corresponding lemmas and tags. The same applies to Polish forms such as //byłam// or //gdybyś// (//była//+//m// and //gdyby//+//ś//). Tokenization may even introduce errors: //gdzie ś za Wisłą//.
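
Why a single-word query misses a contracted form can be illustrated with a small Python sketch. It assumes, as described above, that the tagger has already split //can't// into //ca// and //n't//. The token list and the helper ''find_sequence'' are hypothetical and only mimic a search over tokens, not the actual query engine.

```python
from typing import List


def find_sequence(tokens: List[str], parts: List[str]) -> List[int]:
    """Return the start positions where `parts` occurs as consecutive tokens."""
    n = len(parts)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == parts]


# Hypothetical tokenized sentence "I can't go", split as described above.
tokens = ["I", "ca", "n't", "go"]

whole_form = find_sequence(tokens, ["can't"])      # no single token equals "can't"
split_form = find_sequence(tokens, ["ca", "n't"])  # two consecutive tokens match
print(whole_form, split_form)
```

The search for the unsplit form finds nothing, while the search for the two parts as neighbouring tokens finds the contracted form.
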
A query intended to find the whole contracted form should be typed in as a **Phrase**, with the split parts separated by a space. Only the individual parts of the contracted form are assigned a tag and a lemma.

In morphological tags, characters with a special meaning in regular expressions, e.g. "%%$%%" in the English tag "wp%%$%%", must be preceded in queries by a backslash: tag="wp\%%$%%". See the [[en:cnk:intercorp#morphosyntactic_annotation|description]] of the corpus for more details on morphosyntactic tags.

Last update: //10 April 2013//