
Differences

This shows you the differences between two versions of the page.

en:pojmy:dotazovaci_jazyk [2016/11/08 17:52] – created veronikapojarova
en:pojmy:dotazovaci_jazyk [2020/12/21 19:05] (current) – [Query language used in ČNK] michalkren
Line 7:
 ====== Query language used in ČNK ======
  
-The query language used in the ČNK corpora operating on the corpus manager [[en:pojmy:korpusovy_manazer#manatee|Manatee]] is called **[[https://www.sketchengine.co.uk/corpus-querying/|CQL (corpus query language)]]** and is in fact a modified version of the original CQL created for the corpus manager [[en:pojmy:korpusovy_manazer#cwb|CWB]]. Its cornerstone is a query for a single position (word) in the corpus:
+The query language used in the ČNK corpora operating on the corpus manager [[en:pojmy:korpusovy_manazer#manatee|Manatee]] is called **[[https://www.sketchengine.eu/documentation/corpus-querying/|CQL (corpus query language)]]** and is in fact a modified version of the original CQL created for the corpus manager [[en:pojmy:korpusovy_manazer#cwb|CWB]]. Its cornerstone is a query for a single position (word) in the corpus:
  
 ''[attribute=<nowiki>"</nowiki>value<nowiki>"</nowiki>]''
Line 13:
 where the attribute is [[en:pojmy:atributy_pozicni|positional]] (word, lemma, tag etc.) and the value is either the search term itself or a pattern specified with the help of [[en:pojmy:regularni_vyrazy|regular expressions]]. The query can also include restrictions on [[en:pojmy:atributy_strukturni|structural attributes]] (sentence, doc, opus), for which further values can be specified (e.g. for opuses the publication year, genre, author etc.). Unlike restrictions on positional attributes, restrictions on structural attributes are written [[en:kurz:zobrazeni_dotazu#jak_vypada_tzv_vertikala|in angle brackets]] (e.g. ''<s id=%%"10"%%/>''); for a more detailed and complete description, see the [[https://www.sketchengine.co.uk/corpus-querying/|CQL documentation]]. CQL is a formal language with a precise (and finite) definition. It supports some elements of traditional [[en:pojmy:regularni_vyrazy|regular]] languages ((E.g. quantifiers, round brackets and logical operators.)), but it also supports extended, corpus-specific commands such as ''[[en:pojmy:within|within]]'', ''[[en:pojmy:meet|meet]]'', ''[[en:pojmy:union|union]]'' or ''[[en:pojmy:containing|containing]]'', which work with the structure of the corpus.
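The single-position query can be illustrated outside the corpus manager. The following is a minimal Python sketch, not Manatee's implementation: it assumes that a token is a plain dictionary of positional attributes and that the pattern has to match the whole attribute value. The function name ''matches_position'' and the sample token (including its tag value) are hypothetical.

```python
import re


def matches_position(token, constraints):
    """Return True if the token satisfies every attribute constraint.

    `token` maps positional attribute names to values; `constraints` maps
    attribute names to regular expressions, mirroring [attribute="value"].
    An empty `constraints` dict corresponds to the unrestricted position [].
    """
    return all(
        attr in token and re.fullmatch(pattern, token[attr]) is not None
        for attr, pattern in constraints.items()
    )


# Hypothetical token; the tag value is illustrative, not a real ČNK tag.
token = {"word": "hearts", "lemma": "heart", "tag": "NNP"}

print(matches_position(token, {"lemma": "heart"}))  # True
print(matches_position(token, {"tag": "N.*"}))      # True
print(matches_position(token, {"lemma": "hear"}))   # False: the whole value must match
print(matches_position(token, {}))                  # True: [] accepts any token
```

The sketch uses ''re.fullmatch'' rather than ''re.search'' so that a pattern such as ''hear'' does not accidentally accept the lemma ''heart''.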
  
-A simultaneous query for more than one position (i.e. word sequence or wider context) is formed simply by the concatenation of the individual queries for each successive position. E.g. the query ''[lemma=<nowiki>"</nowiki>mít<nowiki>"</nowiki>][][lemma=<nowiki>"</nowiki>srdce<nowiki>"</nowiki>]'' searches for all occurrences of the lemmas //mít// //srdce//, in between which there is one position (i.e. word or interpunction).
+A simultaneous query for more than one position (i.e. word sequence or wider context) is formed simply by the concatenation of the individual queries for each successive position. E.g. the query ''[lemma=<nowiki>"</nowiki>have<nowiki>"</nowiki>][][lemma=<nowiki>"</nowiki>heart<nowiki>"</nowiki>]'' searches for all occurrences of the lemmas //have// and //heart//, in between which there is one position (i.e. word or punctuation).
  
-The following example of the Manatee corpus manager's query language finds all instances of constructions such as „bez chuti a bez zápachu“ ("without taste and without smell"), „bez práce, bez peněz“ ("without work, without money") etc. occurring in the corpus within a single sentence (structure ''<s/>'', see [[pojmy:atributy_strukturni|structural attributes]]):
+The following example of the Manatee corpus manager's query language will find all instances of constructions such as „neither woman nor man“, „neither man nor beast“ etc. occurring in the corpus within one sentence (structure ''<s/>'', see [[en:pojmy:atributy_strukturni|structural attributes]]):
  
-''[lemma=<nowiki>"</nowiki>bez<nowiki>"</nowiki>] [tag=<nowiki>"</nowiki>N.*<nowiki>"</nowiki>] []{0,1} [lemma=<nowiki>"</nowiki>bez<nowiki>"</nowiki>] [tag=<nowiki>"</nowiki>N.*<nowiki>"</nowiki>] within <s/>''
+''[lemma=<nowiki>"</nowiki>neither<nowiki>"</nowiki>] [tag=<nowiki>"</nowiki>N.*<nowiki>"</nowiki>] []{0,1} [lemma=<nowiki>"</nowiki>nor<nowiki>"</nowiki>] [tag=<nowiki>"</nowiki>N.*<nowiki>"</nowiki>] within <s/>''
  
-Each position in the sequence is represented here by one pair of square brackets, optionally followed by a quantifier in curly brackets. The first position matches all words lemmatized as "bez", the second position matches all nouns (i.e. word forms carrying a morphological tag that begins with the letter "N", followed by an arbitrary sequence of arbitrary characters), the third position matches any one word (or none), the fourth position is again restricted to the lemma "bez", and the fifth again only by the morphological tag for nouns. The directive "within" restricts the whole query to the scope of one structural attribute of type "<s/>" (i.e. one sentence). The directive ''containing'' can also be used for this purpose.
+Each position in the sequence is represented by one pair of square brackets, possibly accompanied by a quantifier in curly brackets. The first position matches all words lemmatized as "neither", the second position matches all nouns (word forms whose morphological tag begins with the letter "N", followed by an arbitrary sequence of arbitrary characters), the third position matches any one word (or none), the fourth position is limited to the lemma "nor", and the fifth position is once again restricted to the morphological tag for nouns. The directive "within" limits the entire query to the scope of one structural attribute "<s/>" (i.e. one sentence). It is also possible to use the directive ''containing'' for this particular purpose.
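How the five positions, the ''{0,1}'' quantifier and ''within <s/>'' interact can be sketched in Python. This is only an illustration under stated assumptions, not how Manatee evaluates queries: the corpus is a hypothetical list of sentences, each token is a (word, lemma, tag) triple with made-up one-letter tags, and the helper names ''token_ok'', ''match_at'' and ''find_within_sentences'' are invented for this example. Searching each sentence separately plays the role of ''within <s/>''.

```python
import re


def token_ok(token, constraints):
    """True if every constrained attribute of the token fully matches its pattern."""
    return all(
        re.fullmatch(pattern, token[attr]) is not None
        for attr, pattern in constraints.items()
    )


def match_at(tokens, start, query):
    """Return the end index of a match of `query` starting at `start`, or None."""
    if not query:
        return start
    constraints, lo, hi = query[0]
    for n in range(lo, hi + 1):  # try each allowed number of repetitions
        span = tokens[start:start + n]
        if len(span) < n or not all(token_ok(t, constraints) for t in span):
            break  # a longer repetition would contain the same failure
        end = match_at(tokens, start + n, query[1:])
        if end is not None:
            return end
    return None


def find_within_sentences(sentences, query):
    """Search every sentence on its own, so no hit crosses a sentence boundary."""
    hits = []
    for sentence in sentences:
        for start in range(len(sentence)):
            end = match_at(sentence, start, query)
            if end is not None:
                hits.append(" ".join(t["word"] for t in sentence[start:end]))
    return hits


def tok(word, lemma, tag):
    return {"word": word, "lemma": lemma, "tag": tag}


# [lemma="neither"] [tag="N.*"] []{0,1} [lemma="nor"] [tag="N.*"]
QUERY = [
    ({"lemma": "neither"}, 1, 1),
    ({"tag": "N.*"}, 1, 1),
    ({}, 0, 1),  # the unrestricted, optional position []{0,1}
    ({"lemma": "nor"}, 1, 1),
    ({"tag": "N.*"}, 1, 1),
]

# Hypothetical toy corpus; tags are illustrative only (N noun, V verb, C conjunction, ...).
SENTENCES = [
    [tok("Neither", "neither", "C"), tok("man", "man", "N"), tok("nor", "nor", "C"),
     tok("beast", "beast", "N"), tok("survived", "survive", "V"), tok(".", ".", "Z")],
    [tok("Neither", "neither", "C"), tok("rain", "rain", "N"), tok(",", ",", "Z"),
     tok("nor", "nor", "C"), tok("snow", "snow", "N"), tok("fell", "fall", "V")],
    [tok("It", "it", "P"), tok("was", "be", "V"), tok("neither", "neither", "C"),
     tok("fish", "fish", "N"), tok(".", ".", "Z")],
    [tok("Nor", "nor", "C"), tok("fowl", "fowl", "N"), tok("appeared", "appear", "V")],
]

print(find_within_sentences(SENTENCES, QUERY))
# ['Neither man nor beast', 'Neither rain , nor snow']
```

In the toy data, the sequence "neither fish . Nor fowl" would satisfy the five positions if sentence boundaries were ignored; it is not reported because the two halves lie in different sentences.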
  
-When working with a corpus manager, it is advisable to know the query language it uses and what that language can do. Although some user interfaces allow queries to be entered without knowledge of the specific query language, the options for working with the corpus are then usually limited. This follows from the effort to make the interface comfortable and comprehensible, which always comes at the expense of fully exploiting the rich combinations of search options.
+When working with a corpus manager, it is advisable to know the query language used and the possibilities it offers. Although some user interfaces make it possible to input queries without knowledge of the specific query language, the possibilities of working with such an interface tend to be somewhat limited. This is a result of the effort to make the interface user-friendly and as comprehensible as possible, which is always achieved at the expense of the possibilities and combinations available to the user.
  
 ==== Relevant links ====