Differences

This shows you the differences between two versions of the page.

en:pojmy:word [2016/12/08 13:57]
Veronika Pojarová created
en:pojmy:word [2016/12/08 14:04] (current)
Veronika Pojarová [Word form (word)]
Line 3: Line 3:
 A word form (known as a //word// in corpus terminology) is a unit which remains **morphologically** (and possibly also **orthographically**) specific. With its generality it stands between a [[en:pojmy:token|token]] and a [[en:pojmy:lemma|lemma]].
  
-While a [[en:pojmy:token|token]] is one specific realization of a given unit, a word form is a standardized unit; a [[en:pojmy:typ|type]]. E.g. the word form //chceme// can have a great number of different realizations(tokens); in the SYN2010 corpus it is 5627.
+While a [[en:pojmy:token|token]] is one specific realization of a given unit, a word form is a standardized unit; a [[en:pojmy:typ|type]]. E.g. the word form //eat// can have a great number of different realizations (tokens).
  
-A[[en:pojmy:lemma|lemma]] is a unit on yet a higher level of abstraction, protože odhlíží od morfologických a pravopisných charakteristik. Slovní tvary //chtít, chceme, chtěl, chtíti// mají stejné lemma //chtít//. Ve většině přístupů se navíc na úrovni slovních tvarů rozlišuje i velikost písmen (formy //chce//, //Chce// a //CHCE// jsou považovány za různé slovní tvary). Na rozdíl od //lemmatu//, které je možné chápat jako množinu tvarů, je tedy //word// jen jediný tvar daného slova.
+On the other hand, a [[en:pojmy:lemma|lemma]] is a unit on a yet higher level of abstraction, because it disregards morphological and orthographic characteristics. The word forms //eat, eats, ate, eaten// have the same lemma //eat//. Additionally, on the level of word forms, most approaches differentiate between lower- and upper-case letters (the forms //eats//, //Eats// and //EATS// are considered to be different word forms). Unlike a //lemma//, which can be understood as a set of different forms, a //word// is a single form of a given word.
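
The three levels described above (token, word form, lemma) can be illustrated with a minimal Python sketch. The lemma table `LEMMAS` and the helper `summarize` are hypothetical names made up for this illustration, and whitespace splitting stands in for real tokenization; an actual corpus would rely on a proper tokenizer and lemmatizer.

```python
from collections import Counter

# Hypothetical toy lemma table; a real corpus would use a lemmatizer.
LEMMAS = {"eat": "eat", "eats": "eat", "ate": "eat", "eaten": "eat"}

def summarize(text):
    # Tokens: every individual occurrence in the text (naive whitespace split).
    tokens = text.split()
    # Word forms: case-sensitive types, so "eats", "Eats" and "EATS" differ.
    word_forms = Counter(tokens)
    # Lemmas: case and inflection are disregarded via the toy table.
    lemmas = Counter(LEMMAS.get(t.lower(), t.lower()) for t in tokens)
    return tokens, word_forms, lemmas

tokens, word_forms, lemmas = summarize("Eats eats EATS ate eaten eat eat")
print(len(tokens), len(word_forms), len(lemmas))
```

In this toy input there are seven tokens, six distinct word forms (//eat// occurs twice) and a single lemma, which mirrors the ordering token, word form, lemma from most specific to most abstract.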
  
 ==== Related links ==== ==== Related links ====