Differences
This shows you the differences between two versions of the page.
| |
en:pojmy:word [2016/12/08 13:57] – created veronikapojarova | en:pojmy:word [2016/12/08 14:04] (current) – [Word form (word)] veronikapojarova |
---|
A word form (known as a //word// in corpus terminology) is a unit which remains **morphologically** (and possibly also **orthographically**) specific. In terms of generality, it stands between a [[en:pojmy:token|token]] and a [[en:pojmy:lemma|lemma]]. | A word form (known as a //word// in corpus terminology) is a unit which remains **morphologically** (and possibly also **orthographically**) specific. In terms of generality, it stands between a [[en:pojmy:token|token]] and a [[en:pojmy:lemma|lemma]]. |
| |
While a [[en:pojmy:token|token]] is one specific realization of a given unit, a word form is a standardized unit, i.e. a [[en:pojmy:typ|type]]. E.g. the word form //chceme// can have a great number of different realizations (tokens); in the SYN2010 corpus there are 5627 of them. | While a [[en:pojmy:token|token]] is one specific realization of a given unit, a word form is a standardized unit, i.e. a [[en:pojmy:typ|type]]. E.g. the word form //eat// can have a great number of different realizations (tokens). |
| |
A [[en:pojmy:lemma|lemma]] is a unit on a yet higher level of abstraction, because it disregards morphological and orthographic characteristics. The word forms //chtít, chceme, chtěl, chtíti// have the same lemma //chtít//. In addition, most approaches distinguish letter case at the level of word forms (the forms //chce//, //Chce// and //CHCE// are considered different word forms). Unlike a //lemma//, which can be understood as a set of forms, a //word// is therefore just a single form of a given word. | A [[en:pojmy:lemma|lemma]], on the other hand, is a unit on a yet higher level of abstraction, because it disregards morphological and orthographic characteristics. The word forms //eat, eats, ate, eaten// have the same lemma //eat//. Additionally, on the level of word forms, most approaches differentiate between lower- and upper-case letters (the forms //eats//, //Eats// and //EATS// are considered to be different word forms). Unlike a //lemma//, which can be understood as a set of different forms, a //word// is a single form of a given word. |
| |
==== Related links ==== | ==== Related links ==== |