
Differences

This shows you the differences between two versions of the page.

en:cnk:onomos [2023/12/01 15:04] jankocek
en:cnk:onomos [2023/12/05 13:54] (current) jankocek
Line 1: Line 1:
 ====== OnomOs Corpus ======
 
-The OnomOs corpus is a linguistically processed database of texts from the periodicals Rudé právo (published 1920–1995) and Právo (1995–present). It always contains one issue from each decade in which (Rudé) Právo was published. The corpus includes texts in which the language component dominates; therefore, not included are, for example, advertisements and classifieds, cinema, theatre and radio programmes, some types of texts from the sports section (e.g. scoreboards and player rosters), comics or crossword puzzles. The structure of the corpus is presented in more detail in Figure 1. In total, the corpus contains 255 149 tokens.
+The OnomOs corpus is a linguistically processed database of texts from the periodicals Rudé právo (published 1920–1995) and Právo (1995–present). It always contains one issue from each decade in which (Rudé) Právo was published. The corpus includes texts in which the language component dominates; therefore, not included are, for example, advertisements, cinema, theatre and radio programmes, some types of texts from the sports section (e.g. scoreboards and player rosters), comics or crossword puzzles. The structure of the corpus is presented in more detail in Figure 1. In total, the corpus contains 255 149 tokens.
 
 [{{:cnk:onomos_graf.png?direct&700|}}]
Line 10: Line 10:
 ^**Higher-order category\\ (NameTag 2)**^**Lower-order category\\ (NameTag 2)** ^**Lower-order category\\ (OnomOs)**      ^**Higher-order category\\ (OnomOs)**^
 |p - Personal names                    |pf - first names                       |AF: first names                          |Antroponyma (A)                    |
-|         :::                          |pm - second names                      |                                         |     :::                               |
+|         :::                          |pm - second names                      |       :::                               |     :::                               |
 |         :::                          |pc - inhabitant names                  |AI: inhabitants                          |     :::                              |
 |         :::                          |pp - relig./myth persons               |AM: religious and mythological names     |     :::                              |
Line 16: Line 16:
 |         :::                          |p_ - underspecified                    |AX: underspecified anthroponyms          |     :::                              |
 |g - Geographical names                |gl - nature areas / objects            |TN: nature names                         |Toponyma (T)                       |
-|         :::                          |gh - hydronyms                         |                                         |     :::                             |
+|         :::                          |gh - hydronyms                         |        :::                              |     :::                             |
 |         :::                          |gq - urban parts                       |TS: settlements                          |     :::                              |
-|         :::                          |gu - cities/towns                      |                                         |     :::                             |
+|         :::                          |gu - cities/towns                      |        :::                              |     :::                             |
 |         :::                          |gr - territorial names                 |TT: territories                          |     :::                             |
-|         :::                          |gt - continents                        |                                         |     :::                              |
-|         :::                          |gc - states                            |                                         |     :::                              |
+|         :::                          |gt - continents                        |        :::                              |     :::                              |
+|         :::                          |gc - states                            |        :::                              |     :::                              |
 |         :::                          |gs - streets, squares                  |TU: urbanonyms                           |     :::                              |
 |         :::                          |g_ - underspecified                    |TX: underspecified toponyms              |     :::                              |
Line 42: Line 42:
 The OnomOs corpus was created by researchers of the "Ostrava Onomastic School", which focuses on the implementation of quantitative linguistic methods in the science of proper names within the research of the Department of Czech Language of the Faculty of Arts of the University of Ostrava. The project was supported by the grant project SGS02/FF/2023 //OnomOs - Ostrava Corpus of Proper Names//, which was implemented at the Faculty of Arts, University of Ostrava.
 
-===== How to search for propria in the OnomOs corpus =====
+======How to search for propria in the OnomOs corpus======
 
 Proper nouns can be searched in the OnomOs corpus using, for example, the following command in CQL (the lower-order category is indicated in quotation marks):
Line 64: Line 64:
 **Figure 3** - distribution of toponym types in the OnomOs corpus.
 
-====Citing OnomOs====
+======Citing OnomOs======
 
 <WRAP round tip 70%>
-David, J. – Davidová Glogarová, J. – Klemensová, T. – Místecký, M. – Jeziorský, T. – Křen, M. – Březinová, K. – Halatová, H. – Mádrová, J. – Pavlištíková, J. – Polášková, K. – Reclik, A. – Strnadlová, M. //Korpus OnomOs//. Ústav Českého národního korpusu FF UK, Praha 2023. Dostupný z WWW: http://www.korpus.cz.
+David, J. – Davidová Glogarová, J. – Klemensová, T. – Místecký, M. – Jeziorský, T. – Křen, M. – Březinová, K. – Halatová, H. – Mádrová, J. – Pavlištíková, J. – Polášková, K. – Reclik, A. – Strnadlová, M. //Korpus OnomOs//. Ústav Českého národního korpusu FF UK, Praha 2023. Available on-line: http://www.korpus.cz.
 </WRAP>
 
 
-===References===
+======References======
   * Karlík, P. – Nekula, M. – Pleskalová, J. (2017, eds.), //Nový encyklopedický slovník češtiny online//. Brno: Masarykova univerzita. Available online: https://www.czechency.org.
   * Straková, J. – Straka, M. – Hajič, J. (2019): Neural Architectures for Nested NER through Linearization. In: A. Korhonen – D. Traum – L. Màrquez (eds.), //Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics//. Florence: Association for Computational Linguistics, pp. 5326–5331.