Differences
This shows you the differences between two versions of the page.
en:obc:query_types [2020/02/14 17:35] – Jan Kocek | en:obc:query_types [2021/02/16 11:24] (current) – Michal Křen | ||
---|---|---|---|
Line 3: | Line 3: | ||
If this is your first time working with the KonText corpus manager, you may benefit from reading the general [[https:// | If this is your first time working with the KonText corpus manager, you may benefit from reading the general [[https:// | ||
- | After successfully completing the online registration and logging into KonText, you can begin with your first query. Firstly, the corpus you intend to work with needs to be selected; the default corpus in the KonText interface is the '' | + | After successfully completing the online registration and logging into KonText, you can begin with your first query. Firstly, the corpus you intend to work with needs to be selected; the default corpus in the KonText interface is the //syn2015// corpus. By clicking on the name in the blue box, you can access a list of all the corpora available to you. If you have worked with KonText before, you may find a list of your favourite corpora on the left, while the featured corpora list is located in the right column. To find the OBC, click on //All corpora//, type in the name, '' |
- | {{Obrázek_1.png|Obrázek_1.png Obrázek_1.png}} | + | {{: |
You may then save it as one of your favourite corpora by clicking on the star next to the blue box. Now you can type the first query into the query box. | You may then save it as one of your favourite corpora by clicking on the star next to the blue box. Now you can type the first query into the query box. | ||
- | < | + | **Basic information** |
To see the basic information about OBC, simply click on the name of the corpus under the KonText logo. | To see the basic information about OBC, simply click on the name of the corpus under the KonText logo. | ||
- | {{Obrázek_5.png|Obrázek_5.png Obrázek_5.png}} | + | {{: |
From this you can learn the total number of positions and the number of different attributes and structures in the corpus; however, this does not provide the full list of metadata available for the corpus. | From this you can learn the total number of positions and the number of different attributes and structures in the corpus; however, this does not provide the full list of metadata available for the corpus. | ||
- | < | + | **First Query** |
- | With the query type set to “Basic”, you can type any word or multiple words into the query line. Then just click on the search button or press the Enter key. | + | With the query type set to //Basic//, you can type any word or multiple words into the query line. Then just click on the search button or press the Enter key. |
- | Try searching:# A simple word, such as // | + | <WRAP round help 40%> |
+ | Try searching: | ||
+ | - A simple word, such as // | ||
- Names of the British rulers of the period – try searching for //George III// and //George the third//. | - Names of the British rulers of the period – try searching for //George III// and //George the third//. | ||
- Punctuation marks, such as //!//, //?//, or //;//. | - Punctuation marks, such as //!//, //?//, or //;//. | ||
- Some words which were coined in the 18< | - Some words which were coined in the 18< | ||
+ | </ | ||
After clicking on the search button (or pressing the enter key), you will be presented with the [[https:// | After clicking on the search button (or pressing the enter key), you will be presented with the [[https:// | ||
- | {{Obrázek_3.png|Obrázek_3.png Obrázek_3.png}} | + | {{: |
The searched word or phrase appears in pink and is called [[https:// | The searched word or phrase appears in pink and is called [[https:// | ||
Line 36: | Line 39: | ||
|**Query** | |**Query** | ||
- | |// | + | |// |
- | |//George III// | + | |//George III// |
- | |// | + | |// |
- | |// | + | |// |
- | | | | | | + | |
- | It should be noted here that the absolute [[https:// | + | It should be noted here that the absolute [[https:// |
For basic statistical tasks (e.g. comparing frequencies in corpora of different sizes), you might find useful the corpus calculator [[https:// | For basic statistical tasks (e.g. comparing frequencies in corpora of different sizes), you might find useful the corpus calculator [[https:// | ||
- | < | + | **New Query** |
- | To start a new search in KonText, click on the item Query → New Query located in the top menu, or simply click on the KonText logo in the top left corner. | + | To start a new search in KonText, click on the item //Query → New Query// located in the top menu, or simply click on the KonText logo in the top left corner. |
- | < | + | **Query Types** |
There are five query types offered for the OBC: basic, phrase, word form, word part, and CQL. Since the OBC is not lemmatized, there is no option to search for lemmas. Each query type is best suited to a different kind of research. | There are five query types offered for the OBC: basic, phrase, word form, word part, and CQL. Since the OBC is not lemmatized, there is no option to search for lemmas. Each query type is best suited to a different kind of research. | ||
- | < | + | **Query Type: Basic** |
The basic query is often used to familiarize oneself with the given corpus. It is ideal for elementary searches which do not require great accuracy. As the OBC is not lemmatized, the basic query searches only for the word forms which match the query exactly. The search is case-insensitive and does not support regular expressions. | The basic query is often used to familiarize oneself with the given corpus. It is ideal for elementary searches which do not require great accuracy. As the OBC is not lemmatized, the basic query searches only for the word forms which match the query exactly. The search is case-insensitive and does not support regular expressions. | ||
- | < | + | **Query Type: Word Form** |
- | Word form, or node form, query is used for the analysis of one specific form of a word, as it finds only the forms which match the query exactly. By default, it is case-insensitive; | + | Word form, or node form, query is used for the analysis of one specific form of a word, as it finds only the forms which match the query exactly. By default, it is case-insensitive; |
- | < | + | **Query Type: Phrase**
The phrase query is especially useful, as it allows searching for multiword expressions. It finds the exact wording of the phrase entered. It supports regular expressions. | The phrase query is especially useful, as it allows searching for multiword expressions. It finds the exact wording of the phrase entered. It supports regular expressions. | ||
- | Try searching for the phrase //corporal punishment// | + | <WRAP round help 40%> |
+ | Try searching for the phrase //corporal punishment// | ||
+ | * When the search is case-insensitive (i.e. the //Match case// box is left unticked), you will find the following forms: //corporal punishment//, | ||
* When you tick the box and make the search case-sensitive, | * When you tick the box and make the search case-sensitive, | ||
+ | </ | ||
- | < | + | **Query Type: Word Part** |
- | If you wish to find all words containing a particular string of consecutive characters, preceded and followed by any number of other characters (or none), the most suitable query type for this task is the Word Part (Character) query. It supports regular expressions and is case-sensitive, | + | If you wish to find all words containing a particular string of consecutive characters, preceded and followed by any number of other characters (or none), the most suitable query type for this task is the Word Part (Character) query. It supports regular expressions and is case-sensitive, |
- | Try searching for all the words that contain the following word parts:* //hood// | + | <WRAP round help 40%> |
+ | Try searching for all the words that contain the following word parts: | ||
+ | * //hood// | ||
* //blood// | * //blood// | ||
* //counter// | * //counter// | ||
+ | </ | ||
- | To see all the words containing the particular word part, you may go through the concordance manually; however, the easiest way to get an overview of all the words is to click on the Frequency button in the upper part of the page and then select Node forms from the dropdown menu. A list of all the words containing the particular string of characters, arranged by absolute frequency, will appear. | + | To see all the words containing the particular word part, you may go through the concordance manually; however, the easiest way to get an overview of all the words is to click on the //Frequency// button in the upper part of the page and then select |
- | {{Obrázek_4.png|Obrázek_4.png Obrázek_4.png}} | + | {{: |
- | < | + | **Query Type: CQL** |
All the query types mentioned above are internally converted into [[https:// | All the query types mentioned above are internally converted into [[https:// | ||
+ | |||
+ | ---- | ||
+ | |||
+ | **If you are ready, you can continue to [[en: | ||
+ | |||
+ | ---- | ||
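
The matching behaviour described for the word form, phrase, and word part queries, together with the node-form frequency list, can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not how KonText works internally: the token list is invented toy data, and the function names (`word_form`, `phrase`, `word_part`, `node_form_frequencies`) and the `match_case` parameter are hypothetical names that mirror the //Match case// box.

```python
import re
from collections import Counter

# Invented toy tokens standing in for a corpus; this is not OBC data.
TOKENS = ["Corporal", "punishment", "left", "blood", "on", "his", "hood", ";",
          "a", "bloody", "Hood", "and", "more", "blood", "after",
          "corporal", "punishment", "."]


def word_form(tokens, query, match_case=False):
    """Return positions of tokens that equal the query as a whole token."""
    if match_case:
        return [i for i, t in enumerate(tokens) if t == query]
    return [i for i, t in enumerate(tokens) if t.lower() == query.lower()]


def phrase(tokens, query, match_case=False):
    """Return start positions where consecutive tokens match the query words.

    Each whitespace-separated query word is treated as a regular expression
    that must match one whole token.
    """
    flags = 0 if match_case else re.IGNORECASE
    patterns = [re.compile(word, flags) for word in query.split()]
    hits = []
    for i in range(len(tokens) - len(patterns) + 1):
        if all(p.fullmatch(tokens[i + j]) for j, p in enumerate(patterns)):
            hits.append(i)
    return hits


def word_part(tokens, part):
    """Return tokens containing `part` anywhere (regex, case-sensitive)."""
    pattern = re.compile(part)
    return [t for t in tokens if pattern.search(t)]


def node_form_frequencies(matched_tokens):
    """Return the matched forms ordered by absolute frequency."""
    return Counter(matched_tokens).most_common()


if __name__ == "__main__":
    print(word_form(TOKENS, "hood"))
    print(phrase(TOKENS, "corporal punishment", match_case=True))
    print(node_form_frequencies(word_part(TOKENS, "blood")))
```

In this sketch, a case-insensitive phrase search finds both //Corporal punishment// and //corporal punishment//, while setting `match_case=True` keeps only the form typed exactly as in the query. The frequency helper corresponds to the list of node forms sorted by absolute frequency.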