en:obc:specific_query [2020/02/27 12:23] (current) – jankocek

In this lesson, we will look at how you can use the KonText interface to specify or limit a query based on the metadata before the search itself is run.

Due to the complexity of the trial proceedings, a number of factors may affect your search and results. One of them is that a single trial may involve more than one defendant and hence more than one offence, verdict, and punishment. Moreover, it is not always apparent from the text of the proceedings which defendant was speaking at a particular moment (the utterances are often marked simply by the role of the participant, as in D for defendant), and different defendants may have been charged with different offences or ended with different verdicts and punishments. Therefore, it was not possible to mark individual utterances with attributes pertaining to these characteristics. Instead, individual proceedings (texts), rather than the utterances within them, were marked with these attributes, and separate categories containing combinations of these elements were created. You may therefore encounter a text attribute //offenceCategory// with values such as //breakingPeace | theft | violentTheft//, //kill | theft | violentTheft// and so on, meaning that the particular proceeding concerns all of these types of offences. It is important to keep this in mind when entering your query.

**Searching the corpus**

To search the corpus using a specified query, open the KonText interface and make sure you have the OBC selected. You can choose any query type; for this lesson, let’s use the basic type. Let’s say we are interested in the language of women in the 19<sup>th</sup> century who were convicted of theft and either transported or sentenced to death, and we would like to know the frequency of interrogative sentences used by them. To find interrogative sentences, simply type ''?'' into the search box. To specify the characteristics of the utterances we are looking for, click on //Restrict search//.
{{:en:obc:l6_1.png?direct&500|}}

Now you can limit your search by ticking the appropriate boxes. The numbers in the right column indicate how many positions (tokens) fall within the given category (e.g. the //Advertisements// texts amount to 125,453 tokens).

There are five types of texts in the OBC:
  * **Trial account**: includes information about the defendants, witnesses, victims, descriptions of the crimes and transcriptions of the testimonies
Firstly, to get the actual utterances of the defendants, it is necessary to select the //trialAccount// category only. As we are interested in the language of the 19<sup>th</sup> century, you need to delimit the given time span in the //text.year// box. In the //text.offenceCategory// box you will find many different combinations of [[https://www.oldbaileyonline.org/static/Crimes.jsp|offences]] and, as mentioned above, you need to be careful when making your selection. Multiple offences divided by the vertical bar indicate that there were multiple defendants present at the trial. Distinguishing which person committed which crime and what their punishment was can be quite demanding, as it would be necessary to go through each trial account individually and read the transcription.

So, to make sure you include only the people convicted of committing the crime of theft, select the options which include only theft. Here, you have a number of choices: //theft//, //violentTheft// or //theft | violentTheft//. Selecting all three still ensures that only people convicted of theft are included in your search. However, because the other categories which include //theft// (e.g. //deception | sexual | theft//) are left out, the search will not cover //all// the trials which deal with the offence of theft.

{{:en:obc:l6_2.png?direct&800|}}
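
The trade-off between the two ways of selecting offence categories can be illustrated with a small Python sketch. It is not part of KonText; the helper names (''parse_category'', ''only_these'', ''mentions_any'') and the sample list of category values are assumptions made for illustration, built from the labels mentioned in this lesson.

```python
def parse_category(value):
    """Split a combined value such as 'theft | violentTheft' into a set of labels."""
    return {part.strip() for part in value.split("|") if part.strip()}


def only_these(value, allowed):
    """True if the category is non-empty and contains nothing but allowed labels."""
    labels = parse_category(value)
    return bool(labels) and labels <= set(allowed)


def mentions_any(value, wanted):
    """True if the category contains at least one of the wanted labels."""
    return bool(parse_category(value) & set(wanted))


# Labels we treat as "theft" for this lesson.
THEFT = {"theft", "violentTheft"}

# Illustrative sample of text.offenceCategory values (not the full list in the corpus).
categories = [
    "theft",
    "violentTheft",
    "theft | violentTheft",
    "deception | sexual | theft",
    "breakingPeace | theft | violentTheft",
    "kill",
]

# Strict selection: every defendant in the text was tried for some kind of theft.
strict = [c for c in categories if only_these(c, THEFT)]

# Broad selection: theft occurs in the text, possibly alongside other offences.
broad = [c for c in categories if mentions_any(c, THEFT)]

print(strict)
print(broad)
```

The strict selection corresponds to the three boxes ticked in this lesson. The broad selection would cover every trial involving theft, at the cost of including defendants tried for other offences.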
Next, the [[https://www.oldbaileyonline.org/static/Punishment.jsp|punishment]] needs to be selected. Find the //text.punishmentCategory// box and select //death//, //death | transport// and //transport// (for more information on offences, verdicts and punishments, see [[https://www.oldbaileyonline.org/static/History.jsp|here]]). You also need to select the role of the utterance speaker, so as not to include utterances spoken by, for example, the judge. Go to the //utterance.speaker_role// box and select //Defendant//. Lastly, find the //utterance.speaker_sex// box and select //f// (female). You can narrow your search further by modifying any of the other categories available. When you are satisfied with your selection, click the search button. You can view the Text types frequency list (//Frequency → Text Types//) to see all variables, including those which you did not specify in your query.
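
As a rough mental model of how the ticked boxes combine, the following Python sketch assumes that values ticked within one attribute act as alternatives, while different attributes must all hold at once. This is an assumption about the interface's behaviour, not a description of KonText internals. The attribute name ''text.type'', the role value ''Judge'', the year range chosen for the 19<sup>th</sup> century and the three sample records are all hypothetical.

```python
def matches(record, restrictions):
    """True if the record has an allowed value for every restricted attribute."""
    return all(record.get(attr) in allowed for attr, allowed in restrictions.items())


# Selection made in this lesson; "text.type" is a hypothetical attribute name.
restrictions = {
    "text.type": {"trialAccount"},
    "text.year": range(1800, 1900),  # one possible reading of "19th century"
    "text.offenceCategory": {"theft", "violentTheft", "theft | violentTheft"},
    "text.punishmentCategory": {"death", "death | transport", "transport"},
    "utterance.speaker_role": {"Defendant"},
    "utterance.speaker_sex": {"f"},
}

# Made-up records standing in for utterances with their metadata.
base = {
    "text.type": "trialAccount",
    "text.year": 1823,
    "text.offenceCategory": "theft",
    "text.punishmentCategory": "transport",
    "utterance.speaker_role": "Defendant",
    "utterance.speaker_sex": "f",
}
utterances = [
    {"id": 1, **base},
    {"id": 2, **base, "text.year": 1755},                 # wrong century
    {"id": 3, **base, "utterance.speaker_role": "Judge"},  # wrong speaker role
]

selected_ids = [u["id"] for u in utterances if matches(u, restrictions)]
print(selected_ids)
```

Only the first record passes, because each of the others fails one restriction. Leaving an attribute out of ''restrictions'' models leaving all of its boxes unticked.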
<WRAP round help 40%>
**Task**:
  * Find all occurrences of the word //God// and combine the following parameters:
    * spoken during the trial
    * only in the 18th century
    * the defendant was found guilty
    * spoken by the victim or witness
    * spoken by a male
  * View the Text Types frequency list and see whether the speakers come from a high- or low-class environment, and which publisher most frequently published these specific proceedings.
  * Did the victims or the witnesses use the word more often?
</WRAP>
You can find the solution [[en:obc:solution#lesson_6|here]].

----

**If you are ready, you can continue to [[en:obc:frequency_distribution|Lesson 7]].**

----