Differences

This shows you the differences between two versions of the page.

en:obc:spelling [2020/02/18 13:42] michalskrabalen:obc:spelling [2020/02/27 12:19] (current) jankocek
Line 9: Line 9:
 |**Description**                                 |**Examples**                                                        | |**Description**                                 |**Examples**                                                        |
 |//-ic// spelled as //-ick//                     |//public(k), catholic(k), music(k), magic(k)//                      | |//-ic// spelled as //-ick//                     |//public(k), catholic(k), music(k), magic(k)//                      |
-|'//'-ed­ //in past tense and participles as //‘d//|<html><u></html>//call’d, cry’d, confess’d, ask’d//<html></u></html>|+|//-ed­ //in past tense and participles as //‘d// |//call’d, cry’d, confess’d, ask’d// |
 |//or// //our// variation in BrE                |//favo(u)r, hono(u)r, colo(u)r, labo(u)r//                          | |//or// //our// variation in BrE                |//favo(u)r, hono(u)r, colo(u)r, labo(u)r//                          |
 |//s// //z//                                     |//surpris/ze//, //recognis/ze//, //apologis/ze//, //cruis/ze//      | |//s// //z//                                     |//surpris/ze//, //recognis/ze//, //apologis/ze//, //cruis/ze//      |
Line 18: Line 18:
 To search for multiple forms, set the query type to [[https://wiki.korpus.cz/doku.php/en:pojmy:dotazovaci_jazyk|CQL]]. Now let’s try to search for all the variants of the word //public//. According to the OED, multiple forms were found during the 18<sup>th</sup> and 19<sup>th</sup> centuries: //public//, //publik//, //publick//. To search for multiple forms, set the query type to [[https://wiki.korpus.cz/doku.php/en:pojmy:dotazovaci_jazyk|CQL]]. Now let’s try to search for all the variants of the word //public//. According to the OED, multiple forms were found during the 18<sup>th</sup> and 19<sup>th</sup> centuries: //public//, //publik//, //publick//.
  
-When using CQL, each element of the query has to be enclosed in square brackets []. The type of search we intend to conduct is specified by the attribute; to search for lemmas, type //lemma=//, for tags, type //tag=//, etc. In this case, you are looking for specific word forms, so //word=// should be used. The specific search items (words, lemmas, tags etc.) must be inserted into quotation marks "". For example, to search for the word //public//, type:+When using CQL, each element of the query has to be enclosed in square brackets []. The type of search we intend to conduct is specified by the attribute; to search for lemmas, type ''lemma='', for tags, type ''tag='', etc. In this case, you are looking for specific word forms, so ''word='' should be used. The specific search items (words, lemmas, tags etc.) must be inserted into quotation marks ''""''. For example, to search for the word //public//, type:
  
 ''[word="public"]'' ''[word="public"]''
  
-Searching for all of the three forms mentioned above simultaneously requires the use of the pipe symbol | which functions as an OR operator:+Searching for all of the three forms mentioned above simultaneously requires the use of the pipe symbol ''|'' which functions as an OR operator:
  
 ''[word="public" | word="publik" | word="publick"]'' ''[word="public" | word="publik" | word="publick"]''
  
-* Searches for //public// OR //publik //OR //publick//+(searches for //public// OR //publik //OR //publick//)
  
 Keep in mind that CQL is case-sensitive; to find all occurrences of these words regardless of capitalization, you need to add the forms with capital letters. To do so, insert another set of square brackets into the value in quotation marks; the items within the square brackets form a set from which one item is selected: Keep in mind that CQL is case-sensitive; to find all occurrences of these words regardless of capitalization, you need to add the forms with capital letters. To do so, insert another set of square brackets into the value in quotation marks; the items within the square brackets form a set from which one item is selected:
Line 32: Line 32:
 ''[word="[Pp]ublic" | word="[Pp]ublik" | word="[Pp]ublick"]'' ''[word="[Pp]ublic" | word="[Pp]ublik" | word="[Pp]ublick"]''
  
-Alternatively, you can also use the specific sequence of characters (?i), which, when used right after the quotation marks, makes the whole query case-insensitive:+Alternatively, you can also use the specific sequence of characters ''(?i)'', which, when used right after the quotation marks, makes the whole query case-insensitive:
  
 ''[word="(?i)public" | word="(?i)publik" | word="(?i)publick"]'' ''[word="(?i)public" | word="(?i)publik" | word="(?i)publick"]''
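
The difference between the two approaches can be tried out outside the corpus interface. The following is a minimal Python sketch, not part of the OBC tools. It assumes that CQL compares the pattern with the whole token, which ''re.fullmatch'' imitates, and the token list is made up for illustration.

```python
import re

# Hypothetical tokens, invented for this illustration only.
tokens = ["public", "Public", "PUBLIC"]

# Character set: only the first letter may vary in case.
with_set = re.compile(r"[Pp]ublic")
# Inline flag: every letter may vary in case.
with_flag = re.compile(r"(?i)public")

# fullmatch stands in for CQL's whole-token comparison (an assumption).
set_hits = [t for t in tokens if with_set.fullmatch(t)]
flag_hits = [t for t in tokens if with_flag.fullmatch(t)]

print(set_hits)   # ['public', 'Public']
print(flag_hits)  # ['public', 'Public', 'PUBLIC']
```

In this sketch the set ''[Pp]'' covers only the initial letter, so a fully capitalized token is found by the ''(?i)'' version alone.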
Line 44: Line 44:
 If you wish to condense the query, simply combine what you have learned in the previous steps in the following manner: If you wish to condense the query, simply combine what you have learned in the previous steps in the following manner:
  
-''[word="(?i)publi[ck]k?e?"]+''[word="(?i)publi[ck]k?e?"]''
  
-The whole search is case-insensitive, and contains all the forms which were previously inputted separately; the initial sequence //publi// is present in all of them, it is followed by either //c// or //k//, the subsequent character //k// is optional (it would most likely occur after //c//), and the final //e// is also marked as optional.+The whole search is case-insensitive, and contains all the forms which were previously inputted separately; the initial sequence //publi// is present in all of them, it is followed by either //c// or //k//, the subsequent character //k// is optional (it would most likely occur after //c//), and the final //e// is also marked as optional.
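
To see which strings the condensed pattern accepts, you can test it on a short word list. This is a rough Python sketch under the assumption that CQL matches the pattern against the whole token (imitated here with ''re.fullmatch''); the candidate words are hypothetical and do not come from the corpus.

```python
import re

# The regular expression from the condensed query.
pattern = re.compile(r"(?i)publi[ck]k?e?")

# Hypothetical candidates, chosen only to illustrate the pattern.
candidates = ["public", "Publick", "publik", "publicke",
              "publikk", "publish", "republic"]

# Keep only the candidates that the pattern covers completely.
matches = [w for w in candidates if pattern.fullmatch(w)]

print(matches)
# ['public', 'Publick', 'publik', 'publicke', 'publikk']
```

Note that a condensed pattern may also accept strings you did not list explicitly, such as //publikk// here, so it is worth checking the results for unexpected forms.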
  
-<html><u></html>Task:<html></u></html>* What should be the query to find all possible spellings of the noun breeches?+<WRAP round help 40%> 
 +**Task:**  
 + 
 +What should be the query to find all possible spellings of the noun //breeches//? 
 +</WRAP>
  
 After consulting the dictionary, you may expect the following forms: //breeches//, //breaches//, //brieches//, //briches//, //breetches//, //britches//. After consulting the dictionary, you may expect the following forms: //breeches//, //breaches//, //brieches//, //briches//, //breetches//, //britches//.
Line 56: Line 60:
 ''[word="(?i)"]'' ''[word="(?i)"]''
  
-The first two characters //br// should be present in all forms, however the following vowels do display some degree of variation. The first vowel, according to the OED, alternates between //e// and //i//, so it is necessary to enclose these two characters in square brackets [ei].+The first two characters //br// should be present in all forms, however the following vowels do display some degree of variation. The first vowel, according to the OED, alternates between //e// and //i//, so it is necessary to enclose these two characters in square brackets ''[ei]''.
  
 ''[word="(?i)br[ei]"]'' ''[word="(?i)br[ei]"]''
  
-The next vowel appears to be either //e// or //a//, however it is optional (see //briches//) – [ea] followed by the question mark ? to signal optionality.+The next vowel appears to be either //e// or //a//, however it is optional (see //briches//) – ''[ea]'' followed by the question mark ? to signal optionality.
  
 ''[word="(?i)br[ei][ea]?"]'' ''[word="(?i)br[ei][ea]?"]''
Line 68: Line 72:
 ''[word="(?i)br[ei][ea]?t?ch"]'' ''[word="(?i)br[ei][ea]?t?ch"]''
  
-All the forms end with final //s// and according to the OED, it is always preceded by //e//. However, to make sure we search for all the possible variants occurring in the OBC, we may want to use some regular expressions (more on this in [[en:obc:spell2|Lesson 3]]) to mark the possibility of other characters appearing. The plural ending might have been spelt in various ways, so it is recommended to employ the sequence [[https://wiki.korpus.cz/doku.php/en:pojmy:regularni_vyrazy|.*]] (see [[en:obc:spell3|Lesson 4]]) which represents any sequence of characters (or none). The final query should then look like this:+All the forms end with final //s// and according to the OED, it is always preceded by //e//. However, to make sure we search for all the possible variants occurring in the OBC, we may want to use some regular expressions (more on this in [[en:obc:spell2|Lesson 3]]) to mark the possibility of other characters appearing. The plural ending might have been spelt in various ways, so it is recommended to employ the sequence ''.*'' (see [[en:obc:spell3|Lesson 4]]) which represents any sequence of characters (or none). The final query should then look like this:
  
 ''[word="(?i)br[ei][ea]?t?ch.*s"]'' ''[word="(?i)br[ei][ea]?t?ch.*s"]''
  
-To view the list of all the variants which occur in the corpus, click on **Frequency → Node forms [A=a]**.+To view the list of all the variants which occur in the corpus, click on //Frequency → Node forms [A=a]//.
  
-Note the forms which were not included in the list available in the OED: //breechees//, //breachings// and //breches//. You will learn how to work further with the frequency list in the following [[en:obc:spell2|Lesson 3]].+Note the forms which were not included in the list available in the OED: //breechees//, //breachings// and //breches//.
  
 {{:en:obc:l2_1a.png?direct&|}} {{:en:obc:l2_1a.png?direct&|}}
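
The final pattern can also be checked against the forms mentioned in this lesson. The sketch below is only an illustration in Python, assuming whole-token matching as in the corpus query (imitated with ''re.fullmatch''); the two non-matching words are hypothetical examples added for contrast.

```python
import re

# The regular expression from the final query.
pattern = re.compile(r"(?i)br[ei][ea]?t?ch.*s")

# Forms listed from the dictionary in this lesson.
oed_forms = ["breeches", "breaches", "brieches",
             "briches", "breetches", "britches"]
# Additional forms noted in the corpus frequency list.
corpus_forms = ["breechees", "breachings", "breches"]
# Hypothetical words that should not be found.
other_words = ["breech", "bridges"]

for word in oed_forms + corpus_forms + other_words:
    found = pattern.fullmatch(word) is not None
    print(word, "matches" if found else "does not match")

all_found = all(pattern.fullmatch(w) for w in oed_forms + corpus_forms)
none_found = not any(pattern.fullmatch(w) for w in other_words)
```

The sequence ''.*'' is what allows the additional corpus forms to be found, since it accepts any characters between //ch// and the final //s//.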
  
 +----
 +
 +**If you are ready, you can continue to [[en:obc:spell2|Lesson 3]].**
 +
 +----