Differences
This shows you the differences between two versions of the page.
en:cnk:obc [2020/02/14 11:10] – michalkren | en:cnk:obc [2021/02/10 15:39] (current) – [How to cite] michalkren | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== OBC: The Old Bailey Corpus 2.0 ====== | ====== OBC: The Old Bailey Corpus 2.0 ====== | ||
- | The [[http:// | + | The [[http:// |
The corpus is licensed under [[https:// | The corpus is licensed under [[https:// | ||
- | I WOULD PUT THE CONTENT OF L0.DOCX HERE; IT IS A NICE TEXT ABOUT THE CORPUS AS SUCH. | + | {{:en:obc01.png? |
+ | |||
+ | //Front matter of the Proceedings of the Old Bailey, 18th February 1830, page 2// | ||
+ | ===== The digitization process ===== | ||
+ | |||
+ | The original pages of the Proceedings were scanned and the scans are now available at [[https:// | ||
+ | |||
+ | The texts were marked up in XML (Extensible Markup Language) according to the [https:// | ||
+ | |||
+ | Each //doc// structure represents one proceeding and consists of multiple //text// structures. The first is usually the front matter (or another kind of text, as indicated by the //type// attribute), and the following ones contain the trial account itself. | ||
+ | |||
+ | Each text of the OBC is annotated for its metainformation, | ||
+ | |||
+ | In the trial account, the direct speeches are tagged for individual // | ||
+ | |||
+ | Individual words are assigned part-of-speech (POS) tags according to the [[http:// | ||
+ | |||
+ | Please note that we have changed some of the tagging of the original corpus by Huber, Nissel and Puga. In the original data, some attributes, such as offences and verdicts, marked those parts of the proceedings that spelled them out. For example, when the text noted that a particular defendant was charged with murder, the word "murder" or the sentence containing it would be tagged as an offence with the attribute murder. We copied these attributes to the //text// structure, making it much easier to form queries such as “find all adjectives spoken by female defendants in trials concerned with murder and ending in acquittal”, | ||
+ | |||
+ | {{: | ||
+ | |||
+ | //Trials 652-5 in Proceedings of the Old Bailey, 18th February 1830, page 73// | ||
+ | |||
+ | |||
+ | |||
===== Wiki course ===== | ===== Wiki course ===== | ||
Line 11: | Line 36: | ||
For a basic overview of how to use the OBC corpus and how to input the data into the search interface, check our wiki course in eight lessons: | For a basic overview of how to use the OBC corpus and how to input the data into the search interface, check our wiki course in eight lessons: | ||
- | * [[en:eebo:first_query|Lesson 1 (First query)]] | + | * [[en:obc:query_types|Lesson 1 (Query types)]] |
- | * [[en:eebo:orthography_spelling|Lesson 2 (Orthography and Spelling)]] | + | * [[en:obc:spelling|Lesson 2 (Spelling)]] |
- | * [[en:eebo:competing_forms|Lesson 3 (Competing forms)]] | + | * [[en:obc:spell2|Lesson 3 (Spelling variation continued)]] |
- | * [[en:eebo:specify_query|Lesson 4 (Specify query)]] | + | * [[en:obc:spell3|Lesson 4 (Spelling III: Searching with tags)]] |
- | * [[en:eebo:collocations|Lesson 5 (Collocations)]] | + | * [[en:obc:intro_to_metadata|Lesson 5 (Introduction to metadata)]] |
- | * [[en:eebo:morphology1|Lesson 6 (Morphology I)]] | + | * [[en:obc:specific_query|Lesson 6 (Specify query: Metadata continued)]] |
- | * [[en:eebo:morphology2|Lesson 7 (Morphology II)]] | + | * [[en:obc:frequency_distribution|Lesson 7 (Two-attribute interrelationship frequency distribution)]] |
- | * [[en:eebo:multiword|Lesson 8 (Multiword expressions)]] | + | * [[en:obc:collocations|Lesson 8 (Collocations)]] |
===== How to cite ===== | ===== How to cite ===== | ||
<WRAP round tip 70%> | <WRAP round tip 70%> | ||
- | //OBC: The Old Bailey Corpus 2.0//. Ústav Českého národního korpusu FF UK, Prague | + | //OBC: The Old Bailey Corpus 2.0//. Ústav Českého národního korpusu FF UK, Prague |
**The original Old Bailey Corpus**: Huber, M. - Nissel, M. - Puga, K. (2016): //Old Bailey Corpus 2.0//. [[http:// | **The original Old Bailey Corpus**: Huber, M. - Nissel, M. - Puga, K. (2016): //Old Bailey Corpus 2.0//. [[http:// |