~~NOTOC~~

====== Jiří Milička ======

  *[[https://scholar.google.com/citations?hl=en&user=71SLjBMAAAAJ|Google scholar]], [[https://www.researchgate.net/profile/Jiri-Milicka|ResearchGate]]
  *[[https://www.scopus.com/authid/detail.uri?authorId=55946511400|Scopus]], ORCID: [[https://orcid.org/0000-0001-8605-1199|0000-0001-8605-1199]]
Editorial board membership: [[https://www.tandfonline.com/toc/njql20/current|Journal of Quantitative Linguistics]], 
[[https://sciendo.com/journal/LF|Linguistic Frontiers]]

Other memberships: [[https://www.iqla.org/|The International Quantitative Linguistics Association]]  


===== Focus =====


  *corpus linguistics
  *quantitative linguistics
  *Czech and Arabic languages

===== Education =====
  *2010–2016 PhD (Charles University, Prague), thesis: The Theory of Communication as an Explanatory Principle for the Natural Multilevel Text Segmentation
  *2005–2010 MA in Arabic studies and History of Islamic Countries (Charles University, Prague)


===== Employment =====

  *2013–2022 Institute of Comparative Linguistics (Charles University, Prague)
  *2017–2024 Institute of the Czech National Corpus (Charles University, Prague)
  *2024–present Institute of Linguistics (Charles University, Prague)
===== Papers =====


==== Preprints ====
  * Milička, J., & Bednářová, H. (2026). Sydney Telling Fables on AI and Humans: A Corpus Tracing Memetic Transfer of Persona between LLMs. arXiv preprint https://arxiv.org/abs/2602.22481.
  * Milička, J., Marklová, A., & Cvrček, V. (2025). Benchmark of stylistic variation in LLM-generated texts. arXiv preprint http://arxiv.org/abs/2509.10179.
  * Milička, J., Marklová, A., & Cvrček, V. (2025). AI Brown and AI Koditex: LLM-Generated Corpora Comparable to Traditional Corpora of English and Czech Texts. arXiv preprint https://arxiv.org/abs/2509.22996.  
  * Milička, J. (2024). Theoretical and Methodological Framework for Studying Texts Produced by Large Language Models. arXiv [Cs.CL]. Retrieved from https://arxiv.org/abs/2408.16740

==== 2026 ====
  * Marklová, A., Vinš, O., Vokáčová, M., & Milička, J. (2025). The author is dead, but what if they never lived? A reception experiment on Czech AI- and human-authored poetry. Digital Scholarship in the Humanities, [[https://doi.org/10.1093/llc/fqag067|fqag067]]
  *Milička, J., & Machálek, T.: AI Corpus Linguist: More than a Year of Experience. SIGHUM (LaTeCH-CLfL) workshop at EACL 2026. [[https://aclanthology.org/2026.latechclfl-1.29/|EACL 2026]]
  *Nádvorníková, O., Milička, J. & Rosen, A.: Exploring crosslinguistic and genre variation in syntactic complexity: insights from a multilingual parallel corpus. Linguistics Vanguard. https://doi.org/10.1515/lingvan-2025-0109
                    
==== 2025 ====
  *Marklová, A., Milička, J., Ryvkin, L., Lacková Bennet, L. U., & Kormaníková, L. (2025). Iconicity in large language models. Digital Scholarship in the Humanities, [[https://academic.oup.com/dsh/advance-article/doi/10.1093/llc/fqaf095/8257789|DSH]]. 
  * Milička, J., Marklová, A., Drobil, O., & Pospíšilová, E. (2025). Learning to detect AI texts and learning the limits. PLOS ONE, 20(10), [[https://doi.org/10.1371/journal.pone.0333007|e0333007]].
  * Milička, J. (2025). Simple stochastic processes behind Menzerath’s Law. In A. Pawłowski, S. Embleton, J. Mačutek, & A. Xanthos (Eds.), //Mathematical Modelling in Linguistics and Text Analysis: Theory and applications// (pp. 43–59). Amsterdam: John Benjamins. [[https://www.benjamins.com/catalog/cilt.370.04mil]] Preprint available at [[http://arxiv.org/abs/2409.00279]].
  * Milička, J., Marklová, A., Láznička, M., Diatka, V., Bednářová, H., Matela, J., & Škrabal, M. (2025). Sources of Intelligibility of Distant Languages: An Empirical Study. //Language and Speech, 0(0)//. [[https://doi.org/10.1177/00238309251345952]]
==== 2024 ====
  * Milička, J., Marklová, A., VanSlambrouck, K., Pospíšilová, E., Šimsová, J., Harvan, S., & Drobil, O. (2024). Large language models are able to downplay their cognitive abilities to fit the persona they simulate. //PLOS ONE, 19(3), [[https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0298522|e0298522]]//.
  * Milička, J., & Šebestová, D. (2024). Query a corpus in near-natural language: A human-friendly corpus query language not only for linguists. In S. Buschfeld, P. Ronan, T. Neumaier, A. Weilinghoff, & L. Westermayer (Eds.), //Crossing Boundaries through Corpora: Innovative Approaches to Corpus Linguistics.// John Benjamins. ISBN 9789027215949.

==== 2023 ====
  * Milička, J. (2023). Menzerath’s law: Is it just regression toward the mean? //Glottometrics//, 55. doi: [[https://glottometrics.iqla.org/409-menzeraths-law-is-it-just-regression-toward-the-mean/|10.53482/2023_55_409]].
==== 2022 ====
  * Milička, J., Cvrček, V., & Lukeš, D.: Unpacking lexical intertextuality: Vocabulary shared among texts. Yamazaki, M., Sanada, H., Köhler, R., Embleton, S., Vulanović, R., & Wheeler, E. S. (Eds). //Quantitative Approaches to Universality and Individuality in Language//. Berlin/Boston: De Gruyter Mouton. 101-116. DOI: [[https://doi.org/10.1515/9783110763560-009|10.1515/9783110763560-009]]
   *Zemánek, P. & Milička, J.: Frankové očima Arabů v klasickém a moderním období. In O. Lomová, J. Malečková & K. Šíma (Eds.), //Setkávání kultur. Identity, ideologie, jazyky// (pp. 233-246). Praha:  Univerzita Karlova, Filozofická fakulta. ISBN 978-80-7671-085-6.


==== 2021 ====
   *Milička, J., Cvrček, V., & Lukešová, L.: Modelling crosslinguistic n‑gram correspondence in typologically different languages. //Languages in Contrast 21(2)//, 217-249. DOI: [[http://doi.org/10.1075/lic.19018.mil|10.1075/lic.19018.mil]]. ISSN: 1387-6759.
   *Milička, J., & Houzar, A.: Phonological properties as predictors of text success. In A. Pawłowski, S. Embleton, J. Mačutek and G. Mikros (eds.), //Language and Text: Data, models, information and applications// (pp. 177–194). John Benjamins. ISBN 9789027210104.
   *Matlach, V., Krivochen, D. G., & Milička, J.: A method for the comparison of general sequences via type-token ratio. In A. Pawłowski, S. Embleton, J. Mačutek and G. Mikros (eds.), //Language and Text: Data, models, information and applications// (pp. 37–54). John Benjamins. ISBN 9789027210104.
   *Malá, M., Šebestová, D., & Milička, J.: The expression of time in English and Czech children’s literature. In A. Čermáková, T. Egan, H. Hasselgård &  S. Rørvik (eds.), //Time in Languages, Languages in Time// (pp 283–304). John Benjamins. ISBN 978-90-272-0968-9.
   *Kubát, M., Hůla, J., Chen, X., Čech, R., & Milička, J.: The lexical context in a style analysis: A word embeddings approach. //Corpus Linguistics and Linguistic Theory, 17(2)//, 443-464.

==== 2020 ====
   *Milička, J.: Kolik procent lexikálních výpůjček můžeme očekávat ve slovenském textu? //[[https://www.juls.savba.sk/ediela/sr/2020/1/sr20-1.pdf|Slovenská reč, 85(1)]]//, 76–81.
   *Kováříková, D., Škrabal, M., Cvrček, V., Lukešová, L., & Milička, J.: Lexicographer’s Lacunas or How to Deal with Missing Representative Dictionary Forms on the Example of Czech. //International Journal of Lexicography, 33(1)//, 90-103.

==== 2019 ====
   *Mačutek, J., Čech, R., & Milička, J.: Length of non-projective sentences: A pilot study using a Czech UD treebank. In //Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019)// (pp. 110–117). ISBN 978-1-950737-65-9.
   *Čech, R., Hůla, J., Kubát, M., Chen, X., & Milička, J.: The development of context specificity of lemma. A word embeddings approach. //Journal of Quantitative Linguistics, 26(3)//, 187-204.
   *Hůla, J., Kubát, M., Čech, R., Chen, X., Číž, D., Pelegrinová, K., & Milička, J.: Context Specificity of Lemma. Diachronic Analysis. //Glottometrics 45 2019//, 7.

==== 2018 ====
   *Juola, P., Milička, J., & Zemánek, P.: Authorship and time attribution of Arabic texts using JGAAP. In K. Shaalan, A. E. Hassanien & F. Tolba (eds.),  //Intelligent Natural Language Processing: Trends and Applications// (pp. 325–349). Springer, Cham. ISBN: 978-3-319-67056-0.
   *Milička, J.: Average Word Length from the Diachronic Perspective: The Case of Arabic. //Linguistic Frontiers, 1(2)//, 81-89.
   *Milička, J., & Kalábová, H.: Vowel Disharmony in Czech Words and Stems. In  M. Fidler & V. Cvrček (eds.), //Taming the Corpus: From Inflection and Lexis to Interpretation// (pp. 37–61). Springer, Cham. ISBN: 978-3-319-98017-1.
   *Čech, R., Milička, J., Mačutek, J., Koščová, M., & Lopatková, M.: Quantitative Analysis of Syntactic Dependency in Czech. In J. Jiang & H. Liu (eds.), //Quantitative Analysis of Dependency Structures// (pp 53–70). ISBN: 978-3-11-057356-5.

==== 2017 ====
   *Diatka, V., & Milička, J.: The effect of iconicity flash blindness: An empirical study. In A. Zirker, M. Bauer, O. Fischer & C. Ljungberg (eds.), //Dimensions of Iconicity// (pp 3–14). John Benjamins. ISBN 978-90-272-4351-5.
   *Mačutek, J., Čech, R., & Milička, J.: [[https://aclanthology.org/volumes/W17-65/|Menzerath-Altmann Law in Syntactic Dependency Structure]]. In S. Montemagni & J. Nivre (eds.), //Proceedings of the Fourth International Conference on Dependency Linguistics (Depling 2017), September 18-20, 2017, Università di Pisa, Italy (No. 139, pp. 100–107).// Linköping University Electronic Press. ISBN: 978-91-7685-467-9.

==== 2016 ====
   *Milička, J.: Key Length Motifs in Czech and Arabic Texts. In E. Kelih, R. Knight, J. Mačutek & A. Wilson (eds.), //Studies in Quantitative Linguistics 23// (pp. 27–42). RAM – Verlag. ISBN: 978-3-942303-44-6.
   *Čéplö, S., Bátora, J., Benkato, A., Milička, J., Pereira, C., & Zemánek, P.: Mutual intelligibility of spoken Maltese, Libyan Arabic, and Tunisian Arabic functionally tested: A pilot study. //Folia Linguistica, 50(2)//, 583-628.
   *Zemánek, P., & Milička, J.: Restricted collocability and its use in Arabic Corpus Linguistics. In G. C. Pastor (ed.), //Computerised and Corpus-based Approaches to Phraseology: Monolingual and Multilingual Perspectives.// (pp. 67–78). Tradulex. ISBN: 978-2-9700736-5-9.
==== 2015 ====
   *Milička, J.: Synergetic Linguistics: Do We Need Better Explanatory Mechanism? //Glottotheory, 6(2)//, 291-298.
   *Milička, J.: Is the Distribution of L-Motifs Inherited from the Word Lengths Distribution? In G. K. Mikros & J. Mačutek (eds.), //Sequences in Language and Text// (pp 133–146). De Gruyter. ISBN: 978-3-11-036273-2.
   *Milička, J.: Is Menzerath’s Law a consequence of segment inventory inhomogeneity? //Czech and Slovak Linguistic Review, 2015(2)//, 62-71.
==== 2014 ====
   *Milička, J.: Menzerath’s law: the whole is greater than the sum of its parts. //Journal of Quantitative Linguistics, 21(2)//, 85-99.
   *Mikros, G., & Milička, J.: Distribution of the Menzerath’s law on the syllable level in Greek texts. In G. Altmann, R. Čech, J. Mačutek & L. Uhlířová (eds.), //Empirical approaches to text and language analysis// (pp 180–189). RAM - Verlag. ISBN 978-3-942303-24-8.
   *Zemánek, P., & Milička, J.: [[https://aclanthology.org/volumes/W14-09/|Quotations, relevance and time depth: Medieval Arabic literature in grids and networks]]. In A. Feldman, A. Kazantseva, & S. Szpakowicz (eds.) //Proceedings of the 3rd Workshop on Computational Linguistics for Literature (CLFL)// (pp. 17–24). ISBN 978-1-937284-88-6.
   *Zemánek, P., & Milička, J.: Ranking Search Results for Arabic Diachronic Corpora. Google-like search engine for (non) linguists. In A. Lakhouaja (ed.), //Proceedings of the 5th International Conference on Arabic Language Processing (CITALA 2014)// (pp. 73–78). Oujda.  

==== 2013 ====
   *Kubát, M., & Milička, J.: Vocabulary richness measure in genres. //Journal of Quantitative Linguistics, 20(4)//, 339-349.
   *Milička, J.: Rank-frequency relation & type-token relation: Two sides of the same coin. In I. Obradović, E. Kelih & R. Köhler (eds.), //Methods and Applications of Quantitative Linguistics: Selected papers of the 8th International Conference on Quantitative Linguistics (QUALICO)// (pp. 163–171). ISBN 978-86-7466-465-0.
==== 2012 ====
   *Milička, J.: Minimal ratio: an exact metric for keywords, collocations etc. //Czech and Slovak Linguistic Review, 2012(1)//, 62-70.
   *Chromý, J., & Milička, J.: Experimentální zkoumání stylotvorných faktorů: první výstupy. //Naše řeč (Our Speech), 95(4)//, 181-187.

==== 2011 ====
   *Milička, J.: A Combinatorial Method for a Context Comparison. In: E. Kelih, V. Levickij & Y. Matskulyak (eds.), //Issues in Quantitative Linguistics 2// (pp 104–109). RAM – Verlag. ISBN 978-3-942303-07-1.
   *Milička, J.: [[https://www.birmingham.ac.uk/Documents/college-artslaw/corpus/conference-archives/2011/Paper193.pdf|Valency and Information Structure: A quantitative approach to from–to juxtaposition in Arabic]]. In //Proceedings of CLC 2011//. 

==== 2010 ====
  *Milička, J.: Budování česko-arabského paralelního korpusu. In F. Čermák & J. Kocek (eds.), //Mnohojazyčný korpus Intercorp: Možnosti studia// (pp 221–225). Nakladatelství Lidových novin. ISBN 978-80-7422-058-6.

==== 2009 ====
   *Milička, J.: Type-token & hapax-token relation: A combinatorial model. //Glottotheory, 2(1)//, 99-110.
   *Milička, J.: [[http://milicka.ff.cuni.cz/kestazeni/clanek_Novy_Orient.pdf|Knihtisk v dějinách islámské kultury]]. //Nový orient, 64(2)//, 42–46.

===== Applications =====

    *[[http://alpha.korpus.cz|Alpha]]: A translator from natural language into CQL (see [[en:manualy:alpha|info]]).
    *[[http://milicka.cz/en/engrammer|Engrammer]]: A tool for exploratory analysis of collocations.
    *[[http://www.milicka.cz/keyworder|KeyWorder]]: A program that identifies keywords in a text using the minimal ratio.
    *[[http://www.milicka.cz/typetokener|TypeTokener]]: A program that measures the type-token relation, hapax-token relation, etc. of a chosen text and then models these quantities back from the measured distribution of types.
    *[[http://www.milicka.cz/lexicographerscalculator|Lexicographers' Calculator]]: A program for planning the size of a corpus.
    *[[http://milicka.cz/tinfi|Tinfi]]: A program that marks the parts of a text that stand out from it.
    *[[http://milicka.cz/blacksquare|BlackSquare]]: A program for simple (not only) linguistic experiments.
    *[[http://zumky.com/|Zumky]]: A communication tool for everyone who values their time, peace, and privacy.


===== Books =====
   *Zemánek, P., & Milička, J. (2017): Words Lost and Found: The Diachronic Dynamics of the Arabic Lexicon. RAM-Verlag. 234 p. ISBN: 978-3-942303-45-3.
   *Zemánek, P., Milička, J., & Ondráš, F. (2017): Al-haraka baraka. Strukturně-variační pohled na středověká arabská přísloví a rčení. Univerzita Karlova, Filozofická fakulta. 167 p. ISBN 978-80-7308-749-4.

===== Theses =====
  *Milička, J. (2022): //[[http://milicka.cz/habilitace.pdf|Lexikální diverzita]]// (Habilitation thesis).
  *Milička, J. (2015): //[[http://milicka.cz/disertace.pdf|Teorie komunikace jakožto explanatorní princip přirozené víceúrovňové segmentace textů]]// (PhD thesis).

===== Reviews =====
   *Milička, J. (2014): Kontroverzní hranice jazykovědy aneb O syntagmatických očích Hany Karadžičové [Review of Kvantitativní analýza kontextu by V. Cvrček]. //Naše řeč, (4-5)//, 300-304.
   *Milička, J. (2018): Kapitoly z korpusové versologie — cesta správným směrem [Review of Kapitoly z korpusové versologie, by P. Plecháč & R. Kolár]. //Česká Literatura, 66(2)//, 286–289.

===== Presentations =====
    *6/2026 (JM) Hands-on workshop //Využití AI při empirickém lingvistickém výzkumu (Using AI in empirical linguistic research)// at Bohemistický kongres 2026 (Prague, Czech Republic, invited).
    *5/2026 (JM, Laura Janda, Dominika Kováříková) Presentation //Grammatical and Syntactic bias in Large Language Models// at ICAME 47 (Koblenz, Germany).
    *5/2026 (Ondřej Tichý, Barbora Bulantová, Magdalena Titlbachová, Anna Marklová, JM) Presentation //Benchmarking Large Language Models for Linguistic Research// at ICAME 47 (Koblenz, Germany).
    *3/2026 (JM, Tomáš Machálek) Poster //AI Corpus Linguist: More than a Year of Experience// at the SIGHUM (LaTeCH-CLfL) 2026 workshop at EACL (Rabat, Morocco).
    *6/2025 (JM, Anna Marklová, Václav Cvrček) Presentation //AI-Brown// at Corpus Linguistics 2025 conference (Birmingham, Great Britain).
    *6/2025 Presentation //Comparison of Human-Written and LLM-Generated Texts: An incomplete QL Package// at QUALICO conference (Brno, Czech Republic).
    *6/2025 (Anna Marklová, JM, Eva Pospíšilová, Ondřej Drobil) Presentation //The Future of Corpus Linguistics in the World of Large Language Models// at ICAME conference (Vilnius, Lithuania).
    *2/2025 (JM, Anna Marklová) Presentation //Register variability in machine-generated texts// at ISCA/ITG Workshop (Berlin, Germany).
    *10/2024 (Dominika Kováříková, JM, Václav Cvrček, Michal Láznička) Presentation //Unlocking Lexical Meaning through Grammatical Profiling// at EURALEX conference (Cavtat, Croatia).
    *6/2024 (JM, Anna Marklová, Václav Cvrček) Presentation //Exploring register variation in human and machine-generated texts: A comparative analysis.// at ICAME conference (Vigo, Spain).
    *6/2024 Presentation //Mechanical Corpus Linguist// at 4EU+ AI Days Conference (Prague, Czech Republic).
    *5/2024 Presentation //Let’s Delve into the Intricate Tapestry of the Chatgptese// at International Workshop on Corpus and Computational Linguistics (Ostrava, Czech Republic, invited).
    *5/2024 Presentation //Exploring Habibi Corpus: Mapping latent space to real geographic space// at AIDA conference (Valletta, Malta).
    *2/2024 Presentation //Not Your Training Data – Not Your Culture: Exploring Variations in Gender Bias in Large Language Models// at Gender, Technology, and Digital Cultures in the Middle East Conference (Doha, Qatar, invited).
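    *11/2023 Presentation //Hledání v korpusech pomocí velkých jazykových modelů: příklady z lingvistiky a dalších oborů (Searching corpora using large language models: Examples from linguistics and other fields)// at Humanitní a společenské vědy perspektivou Digital Humanities (Humanities and Social Sciences from the Perspective of Digital Humanities) (Olomouc, invited).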
  *9/2023 Presentation //Our Timelines// at [[http://milicka.ff.cuni.cz/AIAL2023|AIAL2023]] (Towards AI-Aided Human-Supervised Linguistics, Prague, organizer)
  *6/2023 Presentation //Modelling Menzerath’s Law with Gaussian Copula// at the QUALICO 2023 conference (Lausanne).
  *6/2023 Presentation //A Guided Tour through the Labyrinth of Lexical Diversity// at the International Workshop on Corpus Stylistics and Stylometrics (Ostrava, invited).
  *6/2023 (JM and Petr Zemánek) Poster //Principal Component Analysis of Written Arabic Dialects// at the Olinco 2023 conference (Olomouc, Best Poster Award).
  *11/2022 (JM and Dominika Kováříková) Presentation //Jak vytěžit textová data Českého národního korpusu pomocí KonTextu (Textual data mining from the Czech National Corpus using KonText)// at the conference Digitální data perspektivou humanitního vědce (Digital Data from a Humanities Perspective) (Brno, hybrid, invited).
  *11/2022 Presentation //Engrammer, nástroj na automatickou extrakci frazeologie (Engrammer, a tool for automatic extraction of phraseology)// at the workshop Vývoj elektronické lexikální databáze indoíránských jazyků a podpora zavádění moderních technologií do výuky jazyků (Development of an Electronic Lexical Database of Indo-Iranian Languages and Support for Introducing Modern Technologies into Language Teaching) (Prague, invited).
  *5/2022 Presentation //The Menzerath-Altmann Law: Time to move on// at the III. Summer Workshop for Statistics in Linguistics (Trojanovice, invited).
  *5/2022 Presentation //Measuring lexical diversity: The influence of lemmatization// at the colloquium SlavLingColl (Berlin, invited).
  *9/2021 (JM, Václav Cvrček, and David Lukeš) Presentation //Unpacking Lexical Intertextuality – Number of Types Shared Among Texts// at the QUALICO conference (Tokyo, online).
  *8/2021 (JM and Denisa Šebestová) Presentation //Human Friendly Corpus Query Language// at the ICAME conference (Dortmund).
  *11/2019 Presentation //Engrammer — On the borders between language and other cultural phenomena that can be quantitatively analyzed via corpus// at the Corpus Driven Quantitative Linguistics Workshop (in Ostrava; invited).
  *9/2019 (JM and Denisa Šebestová) Presentation //Engrammer: Introducing a new tool for the identification of phraseological patterning. Demo and case study on Czech, English, and Arabic// at the EUROPHRAS conference (Málaga).
  *8/2019 (Ján Mačutek, Radek Čech, and JM) Presentation //Length of non-projective sentences: A pilot study using a Czech UD treebank// at the Quasy conference held during SyntaxFest 2019, Paris.
  *7/2019 (JM, Václav Cvrček, and Lucie Lukešová) Presentation //N-gram Length Correspondence in Typologically Different Languages// at the CL2019 Cardiff conference.
  *6/2019 (Denisa Šebestová, Markéta Malá, and JM) Presentation //The expression of time in English and Czech children’s literature: A contrastive phraseological perspective// at the ICAME conference (Neuchâtel).
  *3/2019 Presentation //Analysis of Liberal Translations and Cross-Language Plagiarism// at the Linguistic Afternoon 2019 meeting (Olomouc, invited).
  *9/2018 (JM and Alžběta Růžičková) Presentation //Slovak Vowel Phonotactics: Slavic Origins vs. Hungarian Influences// at the SlaviCorp conference (Prague).
  *7/2018 (JM and Alžběta Růžičková) Presentation //Demand and Supply in the Communication Process: The Case of Lexical Richness and Phonological Features// at the QUALICO conference (Wrocław).
  *9/2017 (Ján Mačutek, Radek Čech, and JM) Presentation and poster //Menzerath-Altmann Law in Syntactic Dependency Structure// at the Depling conference (Pisa).
  *5/2017 (JM and Hana Kalábová) Presentation //Vowel Disharmony in Czech: Description and Explanation// at the Linguistics Prague conference.
  *3/2017 Presentation //From – To Construction in Arabic and Czech// at the Word Order and Information Structure: a Cross- and Intra-Linguistic Perspective conference (Olomouc; invited).
  *2/2017 Presentation //Menzerathův-Altmannův zákon: adorovaný model podivného vztahu (Menzerath's-Altmann's Law: An Idolised model of a strange relationship)// at the colloquium  Kritické pohledy na Menzerathův-Altmannův zákon (Critical Views on Menzerath's-Altmann's Law) (Ostrava; invited).
  *8/2016 (JM and Karolína Vyskočilová) Presentation //Models of noisy channels that speech gets over// at the QUALICO conference (Trier).
  *12/2015 (JM and Petr Zemánek) Presentation //Tolerant algorithm for quotation extraction// at the Digital Arabic and Persian Research Workshop (Leipzig; invited).
  *11/2015 Poster //From Linguistic Theory to an Effective Quotation Extraction Algorithm// at the symposium Methods and Linguistic Theories (MaLT 2015) (Bamberg).
  *10/2015 (Vojtěch Diatka and JM) Presentation //Můžou se neikonická slova někdy chovat jako ikonická? (Can non-iconic words sometimes behave like iconic ones?)// at the Lingvistika Praha (Linguistics Prague) conference.
  *7/2015 (JM and Petr Zemánek) Poster //Hypertextualizer. Quotation Extraction Software// at the Corpus Linguistics 2015 conference (Lancaster).
  *7/2015 (Vojtěch Diatka and JM) Poster //The Iconicity of the "Non-Iconic Words" and its Effects on Language Processing// at the 12th International Symposium of Psycholinguistics (Valencia).
  *6/2015 (JM and Petr Zemánek) Presentation //Restricted Collocability and its Use in Arabic Corpus Linguistics// at the EUROPHRAS 2015 conference (Málaga).
  *3/2015 (Vojtěch Diatka and JM) Presentation //Are Iconic Words Statistically more Iconic than Non-Iconic Ones? A New Method of Testing// at the 10th International Symposium on Iconicity in Language and Literature (Tübingen).
  *6/2014 Presentation //Three Models for the Menzerath's Law// at the QUALICO conference (organized by IQLA).
  *5/2014 Presentation //Konfidenční intervaly v empirické lingvistice (Confidence intervals in empirical linguistics)// at the Lingvistika Praha (Linguistics Prague) conference.
  *4/2014 (JM and Petr Zemánek) Presentation //Quotations, Relevance, and Time Depth: Medieval Arabic Literature in Grids and Networks// at the EACL conference in Gothenburg (organized by the Association for Computational Linguistics).
  *7/2012 Presentation //Rank-frequency Relation & Type-token Relation: Two Sides of the Same Coin// at the QUALICO conference.
  *7/2011 Presentation //Valency and the Information Structure. A Quantitative Approach// at the Corpus Linguistics Conference in Birmingham.
  *4/2011 (Petr Zemánek and JM) Presentation //Arabic Plurals in Context. A Corpus Study// at the Workshop on Arabic Corpus Linguistics in Lancaster.
  *9/2009 Presentation //Budování česko-arabského paralelního korpusu (Building the Czech-Arabic Parallel Corpus)// at the Intercorp conference in Prague.


===== Translations into Czech =====


  * Muntasir al-Qaffáš: On. In //Antologie moderních arabských povídek.// Praha 2011, pp 93-97.
  * (Translated with Anna Humlová) Alí ad-Du'áží: //Po hospodách kolem Středozemního moře.// Praha, Malvern 2013, 76 p.


===== Teaching =====

  *Previously taught
    * Arabic and Corpus
    * Introduction to Quantitative Linguistics
    * Writing an article on a corpus-linguistic topic
  *Currently taught
    * General Linguistic Laws in texts
    * Use of Large Language Models
  *Courses I am currently involved in
    * Working with corpora: Case studies
    * Introduction to linguistic corpora


===== Internships =====

   * 4/2013-6/2013 Internship at the University of Trier.
   * 10/2013-11/2013 Internship at the National and Kapodistrian University of Athens.