Differences

This shows you the differences between two versions of the page.

en:obc:query_types [2020/02/18 12:53] Michal Škrabal
en:obc:query_types [2021/02/16 11:24] (current) Michal Křen
Line 3: Line 3:
 If this is your first time working with the KonText corpus manager, you may benefit from reading the general [[https://wiki.korpus.cz/doku.php/en:manualy:kontext:index|KonText manual]] or [[https://wiki.korpus.cz/doku.php/en:kurz:uvod|the course aimed at working with the EEBO corpus]].
  
-After successfully completing the online registration and logging into KonText, you can begin with your first query. Firstly, the corpus you intend to work with needs to be selected; the default corpus in the KonText interface is the //syn2015// corpus. By clicking on the name in the blue box, you can access a list of all the corpora available to you. If you have worked with KonText before, you may find a list of your favourite corpora on the left, while the featured corpora list is located in the right column. To find the OBC, click on All corpora, type in the name, //obc2//, into the search box and select the corpus from the dropdown menu. Alternatively, you can use the direct link [[https://kontext.korpus.cz/first_form?corpname=obc2|here]].
+After successfully completing the online registration and logging into KonText, you can begin with your first query. Firstly, the corpus you intend to work with needs to be selected; the default corpus in the KonText interface is the //syn2015// corpus. By clicking on the name in the blue box, you can access a list of all the corpora available to you. If you have worked with KonText before, you may find a list of your favourite corpora on the left, while the featured corpora list is located in the right column. To find the OBC, click on //All corpora//, type in the name, ''obc'', into the search box and select the corpus from the dropdown menu. Alternatively, you can use the direct link [[https://www.korpus.cz/kontext/query?corpname=obc|here]].
  
 {{:en:obc:l1_1.png?direct&600|}}
Line 19: Line 19:
 **First Query**
  
-With the query type set on Basic, you can type any word or multiple words into the query line. Then just click on the search button or press the enter key.
+With the query type set on //Basic//, you can type any word or multiple words into the query line. Then just click on the search button or press the enter key.
  
 <WRAP round help 40%> <WRAP round help 40%>
Line 39: Line 39:
  
 |**Query**     |**Number of hits**|**Relative frequency (i.p.m.)**|
-|//innocent//  |2572              |72,58                          |
-|//George III//|9                 |0,25                           |
-|//!//         |6650              |187,66                         |
-|//growl//     |2                 |0,06                           |
+|//innocent//  |2572              |72.58                          |
+|//George III//|9                 |0.25                           |
+|//!//         |6650              |187.66                         |
+|//growl//     |2                 |0.06                           |
  
-It should be noted here that the absolute [[https://wiki.korpus.cz/doku.php/en:pojmy:frekvence|frequency]] of the given word (i.e. number of hits) requires further clarification: the size of the corpus or the absolute frequency of another word for comparison. The value of the relative frequency (i.e. absolute frequency in proportion to the total size of the corpus) indicates the number of times the given word occurs in a million words (i.p.m. = instances per million), and thus makes it possible to compare data from corpora of differing sizes. The OBC has approximately 35,5 million positions, hence the relative frequency of the word //innocent//, which occurs 2572 times in the corpus, is 72,58.
+It should be noted here that the absolute [[https://wiki.korpus.cz/doku.php/en:pojmy:frekvence|frequency]] of the given word (i.e. number of hits) requires further clarification: the size of the corpus or the absolute frequency of another word for comparison. The value of the relative frequency (i.e. absolute frequency in proportion to the total size of the corpus) indicates the number of times the given word occurs in a million words (i.p.m. = instances per million), and thus makes it possible to compare data from corpora of differing sizes. The OBC has approximately 35.5 million positions, hence the relative frequency of the word //innocent//, which occurs 2,572 times in the corpus, is 72.58.
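The relative-frequency calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the corpus size is the rounded figure of 35.5 million positions quoted in the text, so the results come out slightly lower than the table values (about 72.45 rather than 72.58 for //innocent//), which were presumably computed from the exact corpus size.

```python
def relative_frequency_ipm(hits: int, corpus_size: int) -> float:
    """Return the relative frequency in instances per million (i.p.m.)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return hits / corpus_size * 1_000_000


# Assumption: the rounded corpus size quoted in the text, not the exact one.
OBC_SIZE_APPROX = 35_500_000

# Absolute frequencies taken from the table above.
for query, hits in [("innocent", 2572), ("George III", 9), ("!", 6650), ("growl", 2)]:
    ipm = relative_frequency_ipm(hits, OBC_SIZE_APPROX)
    print(f"{query}: {ipm:.2f} i.p.m.")
```

Because the result is normalised to one million positions, the same function can be used to compare a word's frequency in two corpora of different sizes.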
  
 For basic statistical tasks (e.g. comparing frequencies in corpora of different sizes), you might find the corpus calculator [[https://korpus.cz/calc/|Calc]] useful.
Line 50: Line 50:
 **New Query**
  
-To start a new search in KonText, click on the item Query → New Query located in the top menu, or simply click on the KonText logo in the top left corner.
+To start a new search in KonText, click on the item //Query → New Query// located in the top menu, or simply click on the KonText logo in the top left corner.
  
 **Query Types**
Line 62: Line 62:
 **Query Type: Word Form**
  
-Word form, or node form, query is used for the analysis of one specific form of a word, as it finds only the forms which match the query exactly. By default, it is case-insensitive; try searching the word //prisoner// – in the concordance, you will find //prisoner//, //Prisoner//, but also //PRISONER//. If you wish to make the search case-sensitive, tick the box Match case. If you now enter //Grace// with an upper-case G and tick the Match case” box, the resulting concordance will include only the exact matches, hence excluding //grace// with lower-case G. This query type supports regular expressions, hence the difference between the Basic and Word form query types is the possibility to conduct case-sensitive searches and use regular expressions.
+Word form, or node form, query is used for the analysis of one specific form of a word, as it finds only the forms which match the query exactly. By default, it is case-insensitive; try searching the word //prisoner// – in the concordance, you will find //prisoner//, //Prisoner//, but also //PRISONER//. If you wish to make the search case-sensitive, tick the box //Match case//. If you now enter ''Grace'' with an upper-case G and tick the //Match case// box, the resulting concordance will include only the exact matches, hence excluding //grace// with lower-case G. This query type supports regular expressions, hence the difference between the Basic and Word form query types is the possibility to conduct case-sensitive searches and use regular expressions.
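The effect of the //Match case// box can be imitated outside KonText with ordinary regular expressions. The sketch below uses Python's ''re'' module, which is an assumption made for illustration only: KonText's own regular-expression engine may differ in details, and the token list and function name are hypothetical.

```python
import re


def matches_word_form(token: str, query: str, match_case: bool = False) -> bool:
    """Return True if the whole token matches the query pattern."""
    # Case-insensitive by default, mirroring an unticked "Match case" box.
    flags = 0 if match_case else re.IGNORECASE
    return re.fullmatch(query, token, flags) is not None


# Hypothetical tokens standing in for corpus positions.
tokens = ["prisoner", "Prisoner", "PRISONER", "prisoners", "Grace", "grace"]

# Unticked box: all three spellings match, but "prisoners" does not.
print([t for t in tokens if matches_word_form(t, "prisoner")])

# Ticked box: only the capitalised form matches.
print([t for t in tokens if matches_word_form(t, "Grace", match_case=True)])
```

Using a full match rather than a search reflects the description above: the query has to match the entire word form, not just a part of it.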
  
 **Query type: Phrase**
Line 71: Line 71:
 Try searching for the phrase //corporal punishment//.
  
-  * When the search is case-insensitive (i.e. the Match case” box is left unticked), you will find the following forms: //corporal punishment//, //corporal Punishment//, and //Corporal Punishment//.
+  * When the search is case-insensitive (i.e. the //Match case// box is left unticked), you will find the following forms: //corporal punishment//, //corporal Punishment//, and //Corporal Punishment//.
   * When you tick the box and make the search case-sensitive, you will find only the exact phrase //corporal punishment//, all in lower-case letters.
 </WRAP> </WRAP>
Line 87: Line 87:
 </WRAP> </WRAP>
  
-To see all the words containing the particular word part, you may go through the concordance manually, however the easiest way to get an overview of all the words is to click on the Frequency button located at the upper part of the page and then select Node forms from the dropdown menu. A list of all the words containing the particular string of characters arranged according to their absolute frequency will appear.
+To see all the words containing the particular word part, you may go through the concordance manually, however the easiest way to get an overview of all the words is to click on the //Frequency// button located at the upper part of the page and then select //Node forms// from the dropdown menu. A list of all the words containing the particular string of characters arranged according to their absolute frequency will appear.
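What the //Node forms// frequency list shows can be approximated with a short Python sketch. It is not how KonText computes the list; the function name, the word part ''law'' and the token list are all hypothetical and serve only to illustrate counting the word forms that contain a given string and ordering them by absolute frequency.

```python
from collections import Counter


def node_form_frequencies(tokens, part):
    """Count word forms containing `part`, most frequent first."""
    needle = part.lower()
    # Compare case-insensitively, but count each spelling as its own form.
    counts = Counter(t for t in tokens if needle in t.lower())
    return counts.most_common()


# Hypothetical tokens standing in for the hits of a word-part query.
tokens = ["unlawful", "lawful", "law", "unlawful", "house", "Lawful"]

for form, freq in node_form_frequencies(tokens, "law"):
    print(form, freq)
```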
  
 {{:en:obc:l1_4.png?direct&600|}}
Line 94: Line 94:
  
 All the query types mentioned above are internally converted into [[https://wiki.korpus.cz/doku.php/en:pojmy:dotazovaci_jazyk|CQL (Corpus Query Language)]], which is therefore the most universal query type in the KonText interface. Its cornerstone is a query for a single position (word) in the corpus: ''[attribute="value"]'', where the attribute is positional (word, lemma, tag, etc.) and the value is either the search term itself or a pattern specified with the help of regular expressions. However, we will focus on CQL in more advanced stages of the course.
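To make the single-position pattern more concrete, here is a minimal Python sketch that assembles CQL strings of the form ''[attribute="value"]''. The helper names are hypothetical, and the sketch does not reproduce KonText's actual internal conversion of the simpler query types; it only shows how one position, and a phrase as a sequence of positions, can be written.

```python
def cql_position(attribute: str, value: str) -> str:
    """Build a CQL expression for a single corpus position."""
    # Assumption: values containing a double quote are simply rejected here,
    # so that no escaping rules need to be guessed.
    if '"' in value:
        raise ValueError("value must not contain a double quote")
    return f'[{attribute}="{value}"]'


def cql_phrase(words, attribute: str = "word") -> str:
    """Build a CQL expression for consecutive positions, one per word."""
    return " ".join(cql_position(attribute, w) for w in words)


print(cql_position("word", "prisoner"))
print(cql_phrase("corporal punishment".split()))
```

The value may itself be a regular expression, so a pattern such as ''pris.*'' can be passed in the same way as a plain word.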
+
+----
+
+**If you are ready, you can continue to [[en:obc:spelling|Lesson 2]].**
+
+----