
Differences

This shows you the differences between two versions of the page.

en:pojmy:word [2016/12/08 13:57]
Veronika Pojarová created
en:pojmy:word [2016/12/08 14:04] (current)
Veronika Pojarová [Word form (word)]
Line 3:
 A word form (known as a //word// in corpus terminology) is a unit which remains **morphologically** (and possibly also **orthographically**) specific. With its generality it stands between a [[en:pojmy:token|token]] and a [[en:pojmy:lemma|lemma]].
  
-While a [[en:pojmy:token|token]] is one specific realization of a given unit, a word form is a standardized unit; a [[en:pojmy:typ|type]]. E.g. the word form //chceme// can have a great number of different realizations (tokens); in the SYN2010 corpus it is 5627.
+While a [[en:pojmy:token|token]] is one specific realization of a given unit, a word form is a standardized unit; a [[en:pojmy:typ|type]]. E.g. the word form //eat// can have a great number of different realizations (tokens).
  
-A [[en:pojmy:lemma|lemma]] is a unit on yet a higher level of abstraction, protože odhlíží od morfologických a pravopisných charakteristik. Slovní tvary //chtít//, //chceme//, //chtěl//, //chtíti// mají stejné lemma //chtít//. Ve většině přístupů se navíc na úrovni slovních tvarů rozlišuje i velikost písmen (formy //chce//, //Chce// a //CHCE// jsou považovány za různé slovní tvary). Na rozdíl od //lemmatu//, které je možné chápat jako množinu tvarů, je tedy //word// jen jediný tvar daného slova.
+On the other hand, a [[en:pojmy:lemma|lemma]] is a unit on a yet higher level of abstraction, because it disregards morphological and orthographic characteristics. The word forms //eat//, //eats//, //ate//, //eaten// have the same lemma //eat//. Additionally, on the level of word forms, most approaches differentiate between lower- and upper-case letters (the forms //eats//, //Eats// and //EATS// are considered to be different word forms). Unlike a //lemma//, which can be understood as a set of different forms, a //word// is a single form of a given word.
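
The distinction between tokens, word forms and lemmas can be illustrated with a minimal Python sketch. The token list and the ''LEMMAS'' lookup table below are made up for illustration and are not taken from any corpus; a real corpus would rely on a proper lemmatizer rather than a hand-written dictionary.

```python
from collections import Counter

# Hypothetical toy lemma lookup (assumption for illustration only).
LEMMAS = {"eat": "eat", "eats": "eat", "ate": "eat", "eaten": "eat"}

# Each element is one token: a specific occurrence in the text.
tokens = ["Eats", "eats", "EATS", "ate", "eaten", "eat", "eats"]

# Word forms are case-sensitive types: "Eats", "eats" and "EATS" differ.
word_forms = Counter(tokens)

# Lemmas abstract away from morphology and letter case.
lemmas = Counter(LEMMAS[t.lower()] for t in tokens)

print(len(tokens))         # number of tokens
print(len(word_forms))     # number of distinct word forms
print(word_forms["eats"])  # number of tokens realizing the word form "eats"
print(len(lemmas))         # number of distinct lemmas
```

In this toy sample, seven tokens collapse into six word forms and a single lemma, which mirrors the ordering token, word form, lemma by increasing generality.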
  
 ==== Related links ====