====== KonText interface: Version history ======

{{ :seznamy:kontext-logo-vetsi.png?nolink&200|}}

The corpus interface KonText is designed for general interaction with [[en:cnk:uvod|CNC corpora]]. A comprehensive list of KonText’s available functions can be found in the [[en:manualy:kontext:index|manual]].

KonText is an extended and visually modified version of the original NoSketch Engine application. It is developed by the [[http://ucnk.ff.cuni.cz/en/|Institute of the Czech National Corpus]] (Faculty of Arts, Charles University) and the [[http://ufal.mff.cuni.cz|Institute of Formal and Applied Linguistics]] (Faculty of Mathematics and Physics, Charles University), with Tomáš Machálek and Martin Zimandl as the main developers, and is available under the **GNU GPL 2** license. Like NoSketch Engine, KonText uses [[https://nlp.fi.muni.cz/trac/noske|Manatee]] as its backend.

The version history overview below contains only the most significant changes from the end-user perspective. **A complete list of all changes and bug fixes can be found on [[https://github.com/czcorpus/kontext/releases|KonText's GitHub page]]**, which also hosts the complete source code repository.

===== Release 0.18.0 =====

//Publication date: 7. 2. 2024//

User changes:

    * new **keyword analysis** module compatible with the [[https://kwords.korpus.cz/|KWords web application]]
    * displaying **translation equivalents directly in a concordance** in parallel corpora by clicking on the selected word (new ''tokens_linking'' plug-in)
    * possibility to download a **list of documents matching selected text types**
    * JSONL as a **new optional format for storing the results** (concordance, word list, collocations, frequency list, document list), in which each line of the document contains a separate JSON string -- the format is particularly suitable for further automated processing
    * improved **linking from external applications** to KonText
      * multi-step operations (e.g. query + filter) with the possibility of subsequent editing in respective query forms
      * support for non-token filter ranges when linking to KonText (e.g. "from ''-1s'' to ''1s''")
    * "Federated Content Search" module supports searching in multiple corpora at the same time
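
The JSONL format mentioned above can be processed line by line with standard tools. The following is a minimal sketch in Python, not KonText's own code: the helper ''read_jsonl'' and the sample keys ''word'' and ''freq'' are hypothetical and do not describe the actual structure of a KonText export.

```python
import io
import json


def read_jsonl(stream):
    """Yield one parsed JSON value per non-empty line of a JSONL stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical sample data; the keys are invented for illustration only
# and are not the field names used by a real KonText export.
sample = io.StringIO(
    '{"word": "corpus", "freq": 120}\n'
    '{"word": "query", "freq": 45}\n'
)

records = list(read_jsonl(sample))
total = sum(record["freq"] for record in records)
print(len(records), total)
```

Because every line is an independent JSON value, a large export can be streamed in this way without loading the whole file into memory.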

Technical changes:

    * dropped support for Celery as the calculation backend (Rq remains)
    * new internal HTTP client for querying external data sources (authentication, translation equivalents, etc.)
    * improved installation script
    * KonText can optionally use a custom modification of Manatee-open with more statistical measures for keyword analysis

===== Release 0.17.0 =====

//Publication date: 17. 2. 2023//

User changes:

  * **enhanced and refined subcorpora**
    * by default, every subcorpus is available to all users, addressing issues with URLs shared between users
    * if a user does not provide a description, the subcorpus remains undiscoverable
    * a subcorpus can be archived, in which case all its URLs remain functional but the subcorpus is not listed among the author's subcorpora (unless explicitly requested in the listing filter)
    * on the concordance query page, users can create a **subcorpus draft** from selected text types for future use
    * easily copy a subcorpus or create a new variant
  * a new function graphically displays the **dispersion** of a search term across the corpus data
  * highlighted **translation equivalents** (as retrieved from the Treq application) directly in the parallel concordance
  * sharing **individual frequency tables through exported URLs**
    * when a frequency result page contains multiple tables, users can now easily obtain URLs for each table to share or publish the table
  * in the line selection function, users can navigate to the page with the **first selected line**
    * for manually categorized lines in extensive concordances where the first selected line starts far beyond the initial page, this feature enables automatic location of the first selection
  * customizable "nice" backlinks allow other applications to reference KonText results, making integration with them easier
  * **detection of overly time-consuming queries** for large corpora (typically the ones producing large result sets) and suggestion of an alternative corpus

Technical changes:

  * core web application framework changed from Gunicorn+Werkzeug to [[https://sanic.dev/en/|Sanic]]
  * upgrade to React 18
  * server backend rewritten with //async/await// 
  * checking of background tasks from the client side is now done via WebSockets by default
  * support for Manatee 2.2xx
  * improved caching of frequency distribution results for faster navigation between result pages
  * moved from HTTP sessions stored on server to [[https://jwt.io/|JWT]]
  * possibility to apply individual "cutoff" for large concordances


===== Release 0.16.0 =====

//Publication date: 23. 2. 2022//

User changes:
    * new query type: **paradigmatic query**
    * enhanced "word list" query type
        * improved user interface
        * optimization of saved results for faster subsequent access
    * query history now supports all query types (concordance, word list, paradigmatic query)
    * enhanced frequency distribution
        * **graphical mode**
            * special support for time-based distributions
        * displaying of confidence intervals
        * default display option can now be set by the user (tables vs. figures)
    * enhanced audio playback
        * possibility to shift the playback in time
        * waveform display
    * option to create a subcorpus directly on the concordance query page
    * search suggestion with sublemma support (syn2020, syn_v9) and faster response

Technical changes:
    * integration of a number of modules (e.g. "liveattrs", query history) with an internal database system
    * reorganization of server code
    * transition from CSS files to Styled Components
    * Docker support
    * support for automatic testing of the user interface
    * removing unnecessary attributes from the configuration


===== Release 0.15.0 =====

//Publication date: 18. 12. 2020//

User changes:

    * number of query types reduced to two:
      * advanced (equivalent to the original "CQL")
      * simple
        * multi-word search
        * optional support for regular expressions
        * optional (per corpus) default search attributes
    * new calendar-based widget for specifying date intervals in the "Restrict search" section of the main query form
    * **syntax_viewer** plug-in enhancement -- added support for new features of SYN2020
    * new **query_suggest** plug-in providing interactive help with writing a query
    * **token_connect** plug-in can now also be used as a source for an alternative KWIC detail view
      * added a new module "formatted text"
    * **taghelper** plug-in now supports "key-value" tagsets and it is also possible to define multiple tagsets for a corpus
    * new option for displaying additional positional attributes (below the main text tokens)
    * possibility to set any positional attribute as the main one in the concordance view
    * more user-friendly "Corpus-specific settings" module
    * redesigned "Specify context" section of the main query form
    * possibility to perform more complex queries (billion-word corpora, aligned corpora when querying only the primary language) without the web-server's time limit constraint
    * an archived URL of a frequency distribution or a collocation can now be restored even for complex queries, regardless of the web server time-out

Technical changes:
    * server-side rewritten to Python 3
    * added support for a new asynchronous task processing backend [[https://python-rq.org/|Rq]]; the new backend is now the default one
    * client-side rewritten using the same framework as in [[manualy:wag|WaG]]
    * synchronization between the web server and the back-end worker queue rewritten for concordance calculation
    * changes in HTTP API

===== Release 0.13.0 =====

//Publication date: 9. 12. 2019//

  * rewrite of HTML templates to Jinja2
  * transition to React.JS framework, which resulted mainly in extensive changes of the code and, to a lesser extent, also in user interface elements (e.g. corpus-specific view settings are now in three tabs)
  * preparation for future functionality

===== Release 0.12.0 =====

//Publication date: 30. 10. 2018//

  * translation equivalents based on Treq directly displayed in KonText for parallel corpora (set up for InterCorp v10 and v11)
  * CQL editor with syntax highlighting and basic value validation
  * mixed mode for displaying attributes (directly in text for KWIC, on mouse-over for other tokens)
  * sharing a named subcorpus and its description with other users
  * new filter functions 
    * nested matches filter
    * first hits in docs filter
  * asynchronous exports with notification
  * improved keyboard navigation on the query result page
  * possibility to minimize individual text type boxes in the subcorpus form

===== Release 0.11.0 =====

//Publication date: 15. 12. 2017//

  * 2-dimensional frequency distribution with confidence intervals, including export of the data into Excel
  * added support for undo in the interactive text selection
  * added support for undo in the tag builder
  * improved query history
    * query history items can be archived with a custom name for later reuse
    * the full query form is now saved, including the selected texts
  * on-demand i.p.m. calculation now works only in well-defined situations (i.e. subcorpus selected using the respective form, rather than a CQL query)
  * chart depicting line group proportions can be exported into Excel
  * word list
    * more convenient upload and in-browser editing of uploaded black/white-lists
    * it is now possible to go directly to the last page
  * added support for hiding individual columns of parallel corpora in concordance view

===== Release 0.10.0 =====

//Publication date: 11. 4. 2017//

  * for spoken corpora, concordance detail views are rendered as dialogues with clear indication of speaker turns and overlaps where audio can be played back by clicking the "speaker" icon
  * documents for subcorpora can now also be selected according to user-defined text type ratios
  * individual query processing steps within the breadcrumb navigation can now be edited, allowing the user to change the parameters of previous operations
  * working corpus can now be changed without losing other information from the current query form
  * manually grouped concordance lines are now distinguished by colours
  * web page titles contain query information (for use in bookmarks and better browser history navigation)

===== Release 0.9.0 =====

//Publication date: 26. 9. 2016//

  * displaying of syntactic structures
  * asynchronous creation of subcorpora, including support for creating alignment-based subcorpora from parallel corpora
  * for attributes with long lists of values, a text input auto-complete function has been added for easier subcorpus creation
  * positional attributes can be displayed also on a mouse-over
  * navigation between concordance pages without reloading the whole page
  * frequency distribution and collocation results are now cached on server for faster pagination
  * user-defined numeric concordance line labels can now be renamed/removed
  * added support for displaying line numbers in a concordance

===== Release 0.8.0 =====

//Publication date: 8. 3. 2016//

  * concordance lines can have user-defined numeric labels attached for manual grouping/categorization
  * i.p.m. calculation for ad-hoc subcorpora (on demand; previous versions calculated i.p.m. from the whole corpus, which could be confusing)
  * support for creating subcorpora based on conditions that contain different structures (e.g. <speaker sex="male" /> and <session id="foo.+" />)
  * added a breadcrumb navigation depicting consecutive steps that led to the current query result
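
The difference between the two i.p.m. bases mentioned above can be illustrated with a small worked example. This is only a sketch assuming the usual definition of i.p.m. as instances per million tokens; the function name ''ipm'' and all the numbers are hypothetical and are not taken from KonText.

```python
def ipm(hits, size_in_tokens):
    """Return instances per million tokens for a given number of hits."""
    if size_in_tokens <= 0:
        raise ValueError("size_in_tokens must be positive")
    return hits * 1_000_000 / size_in_tokens


# Hypothetical figures: 50 hits found inside an ad-hoc subcorpus of
# 2 million tokens that is part of a 100-million-token corpus.
hits = 50
ipm_whole_corpus = ipm(hits, 100_000_000)  # relative to the whole corpus
ipm_subcorpus = ipm(hits, 2_000_000)       # relative to the subcorpus only

print(ipm_whole_corpus, ipm_subcorpus)
```

With these figures the same 50 hits correspond to 0.5 i.p.m. relative to the whole corpus but 25 i.p.m. relative to the subcorpus, which shows why the choice of base matters when a query is restricted to a subcorpus.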

===== Release 0.7.0 =====

//Publication date: 5. 10. 2015//

  * new widget for corpus selection including favourite corpora, featured corpora etc.
  * rewritten "View" menu functions
  * enhancements of user interface usability (e.g. adding an aligned corpus)