
Differences

This shows you the differences between two versions of the page.

en:manualy:kontext:novy_dotaz [2016/11/08 16:41] – [Type of Query] veronikapojarovaen:manualy:kontext:novy_dotaz [2016/11/08 16:57] veronikapojarova
Line 26: Line 26:
 ^ Query type ^ What it’s for ^ How it works ^ What it does ^ Examples ^ ^ Query type ^ What it’s for ^ How it works ^ What it does ^ Examples ^
 ^ Basic Query | for familiarization with the corpus| Searches for the expression as a node form regardless of case; in the case of a dictionary form (lemma), all of its possible forms are also searched for. | without [[en:pojmy:regularni_vyrazy|regular expressions]] (RE), [[wp>Case_sensitivity|case-insensitive]] | ''old house'' > //old house, older houses, oldest house…//\\ ''this time'' > //this time// | ^ Basic Query | for familiarization with the corpus| Searches for the expression as a node form regardless of case; in the case of a dictionary form (lemma), all of its possible forms are also searched for. | without [[en:pojmy:regularni_vyrazy|regular expressions]] (RE), [[wp>Case_sensitivity|case-insensitive]] | ''old house'' > //old house, older houses, oldest house…//\\ ''this time'' > //this time// |
-^ Lemma | for the analysis of an entire paradigm/lexeme | Finds all forms associated with the given [[en:pojmy:lemma|lemma]]. | RE (it is possible to use regular expressions), case-sensitive, possibility of specifying word class | ''see'' > //see, saw, seen, seeing…//\\ ''new'' > //new, newer, newest…// |+^ Lemma | for the analysis of an entire paradigm/lexeme | Finds all forms associated with the given [[wp>Lemma_(psycholinguistics)|lemma]]. | RE (it is possible to use regular expressions), case-sensitive, possibility of specifying word class | ''see'' > //see, saw, seen, seeing…//\\ ''new'' > //new, newer, newest…// |
 ^ Phrase | for a multiword combination in the given form | Finds the exact wording of a phrase. | RE, case-sensitive | ''black dog'' > //black dog//\\ ''new car'' > //new car//\\ ''newer car'' > //newer car// | ^ Phrase | for a multiword combination in the given form | Finds the exact wording of a phrase. | RE, case-sensitive | ''black dog'' > //black dog//\\ ''new car'' > //new car//\\ ''newer car'' > //newer car// |
 ^ Node form | for the analysis of one specific form | Finds the exact form. | RE, case-in/sensitive (possible to select Match case) | ''cat'' > //cat//\\ ''cats'' > //cats//\\ ''cat.*'' > //cat, cats, Cats, CATS…// | ^ Node form | for the analysis of one specific form | Finds the exact form. | RE, case-in/sensitive (possible to select Match case) | ''cat'' > //cat//\\ ''cats'' > //cats//\\ ''cat.*'' > //cat, cats, Cats, CATS…// |
Line 35: Line 35:
  
 **Corpus selection** and **query type** can influence what the form looks like:  **Corpus selection** and **query type** can influence what the form looks like: 
-  - Corpora which are not lemmatized, i.e. do not offer //lemma// as a [[en:kurz:prvni_dotaz#typy_dotazu|query type]].+  - Corpora which are not lemmatized, i.e. do not offer //lemma// as a [[en:eebo:first_query#query_types|query type]].
   - Some query types (only those where it makes sense) allow for the user to specify whether the query should be assessed with respect to capitalization ([[wp>Case_sensitivity|case-sensitive]]), or without considering upper/lower case ([[wp>Case_sensitivity|case-insensitive]]).   - Some query types (only those where it makes sense) allow for the user to specify whether the query should be assessed with respect to capitalization ([[wp>Case_sensitivity|case-sensitive]]), or without considering upper/lower case ([[wp>Case_sensitivity|case-insensitive]]).
-  - In the case of the [[en:kurz:prvni_dotaz#typy_dotazu|query types]] [[en:pojmy:lemma]] and [[en:pojmy:word]] it is also possible to specify word class (position attribute [[en:pojmy:pos|pos]]).+  - In the case of the [[en:eebo:first_query#query_types|query types]] [[wp>Lemma_(psycholinguistics):lemma]] and [[en:pojmy:word]] it is also possible to specify word class (position attribute [[en:pojmy:pos|pos]]).
   - The [[en:pojmy:dotazovaci_jazyk|CQL]] query type also allows for the insertion of interactively generated [[en:pojmy:tag|morphological tags]] (with corpora which are tagged in this way) or conditions specifying texts in which the search is to be carried out (the condition [[en:pojmy:within]]).   - The [[en:pojmy:dotazovaci_jazyk|CQL]] query type also allows for the insertion of interactively generated [[en:pojmy:tag|morphological tags]] (with corpora which are tagged in this way) or conditions specifying texts in which the search is to be carried out (the condition [[en:pojmy:within]]).
   - A very specific way of inputting queries for [[en:kurz:hledani_v_paralelnim_korpusu|searches in parallel corpora]].   - A very specific way of inputting queries for [[en:kurz:hledani_v_paralelnim_korpusu|searches in parallel corpora]].
Line 76: Line 76:
 ====== Word list ====== ====== Word list ======
  
-The basic output of any query is a [[en:pojmy:konkordance|concordance]], i.e. a list of all the occurrences ([[en:pojmy:token|tokens]]) matching the query, along with their text surroundings. The **Word list** function evaluates the query in such a way that the result is a list of various words ([[en:pojmy:typ|types]]), matching the query, together with their absolute [[en:pojmy:frekvence|frequency]], [[en:pojmy:arf|ARF]] or number of documents in which the wanted phenomenon occurs. In this respect, the Word list function is analogous to [[en:manualy:kontext:frekvencni_distribuce|frequency distribuction]], however its advantage is its speed and low computational complexity, because the extra step involving the concordance is not needed with the Word list.+The basic output of any query is a [[en:pojmy:konkordance|concordance]], i.e. a list of all the occurrences ([[en:pojmy:token|tokens]]) matching the query, along with their text surroundings. The **Word list** function evaluates the query in such a way that the result is a list of various words ([[en:pojmy:typ|types]]), matching the query, together with their absolute [[en:pojmy:frekvence|frequency]], [[en:pojmy:arf|ARF]] or number of documents in which the wanted phenomenon occurs. In this respect, the Word list function is analogous to [[en:manualy:kontext:frekvencni_distribuce|frequency distribution]], however its advantage is its speed and low computational complexity, because the extra step involving the concordance is not needed with the Word list.
  
 [{{ en:manualy:kontext:seznam_slov_slovesa.png?direct&300|Form for creating word lists}}] [{{ en:manualy:kontext:seznam_slov_slovesa.png?direct&300|Form for creating word lists}}]
Line 82: Line 82:
 Various search parameters can be set in the form: Various search parameters can be set in the form:
   * corpus (or its subcorpus), in which the word list will be created   * corpus (or its subcorpus), in which the word list will be created
-  * attribute ([[en:pojmy:atributy_pozicni|positional]] nebo [[en:pojmy:atributy_strukturni|structural]]), which is to be included in the list +  * attribute ([[en:pojmy:atributy_pozicni|positional]] or [[en:pojmy:atributy_strukturni|structural]]), which is to be included in the list 
-  * RE pattern (regular expression), to which the resulting words must correspond (if it is not submitted, the list will contain all items in the corpus if they fulfil the other specifications in the form)+  * RE pattern (regular expression), to which the resulting words must correspond (if it is not submitted, the list will contain all items in the corpus if they fulfill the other specifications in the form)
   * minimum frequency   * minimum frequency
   * whitelist – a list of pre-selected words (in a separate file) which we want to see in the resulting list   * whitelist – a list of pre-selected words (in a separate file) which we want to see in the resulting list
Line 89: Line 89:
   * option "Include non-words", which widens the search to words which are not composed only of alphabetic characters   * option "Include non-words", which widens the search to words which are not composed only of alphabetic characters
  
-Among the output option settings we can find a selection of either the absolute [[en:pojmy:frekvence|frequency]], [[en:pojmy:arf|ARF]] or a document count. Furthermore there is also the possibility to choose a specific output attribute (or attributes). These attributes **need not be** identical to the positional attribute selected in the top section of the form, on which all the above mentioned filters are applied. This enables us to create e.g. a frequency list of all verbs by selecting the attribute [[en:pojmy:tag|tag]] in the top section, applying the condition for a verb as in [[en:seznamy:tagy#pozice_1_-_slovni_druh|V.*]] and finally by "switching" the output type to [[en:pojmy:lemma|lemma]] – an example of such a query is shown in the picture.+Among the output option settings we can find a selection of either the absolute [[en:pojmy:frekvence|frequency]], [[en:pojmy:arf|ARF]] or a document count. Furthermore there is also the possibility to choose a specific output attribute (or attributes). These attributes **need not be** identical to the positional attribute selected in the top section of the form, on which all the above mentioned filters are applied. This enables us to create e.g. a frequency list of all verbs by selecting the attribute [[en:pojmy:tag|tag]] in the top section, applying the condition for a verb as in [[en:seznamy:tagy#pozice_1_-_slovni_druh|V.*]] and finally by "switching" the output type to [[wp>Lemma_(psycholinguistics)|lemma]] – an example of such a query is shown in the picture.
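
The verb example above (filter on one attribute, count another) can be sketched in a few lines of Python. This is only an illustration of the idea, not how KonText computes word lists: the toy token list, the simplified tags (''VB'', ''NN'', ''AA'') and the function ''word_list'' with its parameters are all assumptions made up for this sketch.

```python
import re
from collections import Counter

# Hypothetical toy corpus: every token carries several positional attributes.
# The tags are simplified placeholders, not a real tagset.
TOKENS = [
    {"word": "saw", "lemma": "see", "tag": "VB"},
    {"word": "cats", "lemma": "cat", "tag": "NN"},
    {"word": "seen", "lemma": "see", "tag": "VB"},
    {"word": "new", "lemma": "new", "tag": "AA"},
    {"word": "run", "lemma": "run", "tag": "VB"},
    {"word": "sees", "lemma": "see", "tag": "VB"},
]


def word_list(tokens, filter_attr, pattern, output_attr, min_freq=1):
    """Count values of output_attr for tokens whose filter_attr matches pattern.

    The pattern must match the whole attribute value, and items below
    min_freq are dropped from the result.
    """
    regex = re.compile(pattern)
    counts = Counter(
        tok[output_attr] for tok in tokens if regex.fullmatch(tok[filter_attr])
    )
    # most_common() sorts by descending absolute frequency.
    return [(item, n) for item, n in counts.most_common() if n >= min_freq]


# Filter on the tag attribute, but output lemmas.
verb_lemmas = word_list(TOKENS, "tag", "V.*", "lemma")
print(verb_lemmas)
```

Because the filter attribute (''tag'') and the output attribute (''lemma'') are separate parameters, the same function also yields a plain list of word forms if both are set to ''word''. Raising ''min_freq'' corresponds to the minimum frequency field in the form.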
  
 ---- ----