
Differences

This shows you the differences between two versions of the page.

en:start [2016/12/14 21:49] vaclavcvrcek
en:start [2016/12/14 21:49] – [What is a corpus?] vaclavcvrcek
Line 91:
 ===== What is a corpus? =====
  
-A language [[en:pojmy:korpus|corpus]] is an extensive collection of **authentic textual data** (written or spoken) converted to **electronic form** in a uniform format, meaning that it can easily be **searched** for various linguistic phenomena -- especially words and phrases ([[en:pojmy:kolokace|collocations]]). Corpora differ from a plain text archive or database primarily because they have been carefully compiled with the research purpose in mind (they should, for example, represent contemporary spoken language or written language or one of its parts, e.g. journalistic texts). A corpus displays linguistic phenomena in their **natural context**, which allows us to do language research based on actual data on a scale so large that it would have previously been unthinkable.
+A language corpus is an extensive collection of **authentic textual data** (written or spoken) converted to **electronic form** in a uniform format, meaning that it can easily be **searched** for various linguistic phenomena -- especially words and phrases (collocations). Corpora differ from a plain text archive or database primarily because they have been carefully compiled with the research purpose in mind (they should, for example, represent contemporary spoken language or written language or one of its parts, e.g. journalistic texts). A corpus displays linguistic phenomena in their **natural context**, which allows us to do language research based on actual data on a scale so large that it would have previously been unthinkable.