This file lists incompatible changes between Sherlock versions which
need to be taken care of in existing front-ends and customizations.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3.8 to 3.9:
o  The configuration language has been redesigned; the new version
   is much more powerful (e.g., it allows unlimited overriding of anything
   in cf/local). See doc/config for the description of the new language.
   You will probably have to carefully inspect your config files.
o  The default configuration file cf/sherlock has been split into several
   smaller files corresponding to the basic subsystems; cf/sherlock
   is just a sequence of includes now.
o  Most notable changes of configuration:
     - Search.MaxCatalogueBrackets has been renamed to MaxCatalogBrackets
       to keep consistency.
     - Chewer.CardAttrs has been moved to the Indexer section
       and the syntax of CardAttrs, LinkAttrs and LabelAttrs has been
       changed. It's now a list, so you can override the default settings
       in cf/local without repeating everything; also, it's now possible
       to specify sub-objects instead of just top-level attributes.
o  CONFIG_GATHERER is now set automatically whenever needed, so you
   no longer have to enable it manually in your configuration.
o  Added CONFIG_ALLOW_ANY switch, which enables processing of queries
   with ANY and also of negative queries. These are slow (implemented
   by walking through all cards in the index), but reasonable on small
   databases.
o  The custom library modules (CUSTOM_LIB_MODULES) are now linked as
   a separate library (libcustom) and they can now depend on other
   libraries -- the internal ones can be added to the LIBCUSTOM make
   variable, the external ones to LIBCUSTOM_LIBS.
o  Introduced a general analyser mechanism, which allows tagging of documents
   according to their attributes or contents. See cf/analyser. The analyser
   also takes care of automatic language recognition instead of the indexer.
o  The shepherd daemon now supports immediate reloading of configuration:
   use `scontrol reload' or send the SIGHUP signal manually.
o  The default compile-time configuration has been moved from build/sherlock.cfg
   to sherlock/default.cfg.
o  The interface for custom queries has been slightly improved: it now gives
   a chance to fail safely when a query key (CUSTOM_MATCH_CACHE_KEY) is too long.
o  centrum/Makefile is included automatically in every customization.

3.7.1 to 3.8:
o  Index format has changed.
o  Site compression has been rewritten and the sites now have names.
   The relevant filter variables, object attributes and query language
   operators have changed, see the corresponding doc files for details.
o  The multiplexer is now able to mix answers from multiple sources.
   Its configuration has changed significantly (see cf/sherlock).
o  The format of context blocks returned by the search server
   has been improved: each meta block now contains several flags
   describing how well it matches the query.
o  Configuration directives for declaring databases the search server
   should use have been simplified.

3.7 to 3.7.1:
o  Added the CUSTOM_PROPAGATE_IMAGE_ATTRS hook.

3.6.1 to 3.7:
o  Index format has changed.
o  Due to various sherlockd optimizations, the word, phrase and near
   match counters are no longer reliable (e.g., for query "A" AND "B",
   a document can be skipped, because it does not contain "A" before
   we test if it contains "B"). However, we are keeping them for
   orientation.
o  Interface to custom statistics and custom matchers has changed
   to allow parallel searching in multiple threads. Generally, struct
   query has been split into several different structures. Please see
   free/lib/custom.h for an explanation and don't worry, the changes
   are small.
o  Three bits in card flags are now available for use by customizations.
   They are called CARD_FLAG_CUSTOM[123].
o  Introduced the CUSTOM_MERGE hook, which can influence what happens
   to custom attributes when two cards with the same contents are merged.
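
To illustrate why the match counters mentioned above are only approximate,
here is a minimal sketch of short-circuit evaluation of "A" AND "B". It is
not sherlockd code: struct term_stat, term_matches() and count_and_matches()
are hypothetical names, and documents are modelled as plain strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-term statistics; not Sherlock's real structures. */
struct term_stat {
    const char *word;   /* term to look for */
    unsigned tested;    /* documents tested against this term */
    unsigned matched;   /* tested documents that contained it */
};

/* Test one term against one document and update its counters. */
static bool term_matches(struct term_stat *t, const char *doc)
{
    t->tested++;
    if (strstr(doc, t->word) != NULL) {
        t->matched++;
        return true;
    }
    return false;
}

/*
 * Evaluate "A AND B" over all documents. The && operator short-circuits:
 * when A fails, B is never tested, so B's counters undercount.
 * Returns the number of documents containing both terms.
 */
unsigned count_and_matches(const char *const *docs, size_t n,
                           struct term_stat *a, struct term_stat *b)
{
    unsigned hits = 0;
    assert(docs != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        if (term_matches(a, docs[i]) && term_matches(b, docs[i]))
            hits++;
    }
    return hits;
}
```

In this sketch a document containing only "B" is skipped as soon as "A"
fails, so the counter for "B" ends up lower than the true number of
documents containing it, while the number of full matches stays exact.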

3.6 to 3.6.1:
o  Many gatherer parameters regarding retrying and timeouts have changed.

3.5.1 to 3.6:
o  Changed catalog interface.

3.5 to 3.5.1:
o  Custom filter functions no longer have a hard-wired location. Just make
   CONFIG_CUSTOM_FILTER point to the right include file.

3.4 to 3.5:
o  Changed numbering of word types. Type 0 is no longer reserved and the word,
   meta and string types should be sorted according to frequency (0 being most
   frequent) to make better use of index compression. Increased card version
   to "v2" and provided a hook for conversion of word types from "v1" cards.
   Removed compatibility glue for "v0" (i.e., no "v" attribute) cards.
o  Changed the format of search server replies. All replies except for trivial
   syntax error reports are terminated with a special marker to make it possible
   to detect truncated replies. Also, answers to control commands are now in
   the same format as answers to queries, i.e., they contain header, body, footer
   and the end marker. The changes are backward-compatible except for control
   commands. See doc/search for details.
o  Makefiles have been rewritten to support separate object trees. Please
   replace all references to the `obj' directory in your custom makefiles
   with $(o) and prefix all references to the source tree with $(s).
o  New configure script. Please read doc/install again.
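
The word type renumbering described above (type 0 being the most frequent)
can be sketched as follows. This is only an illustration under assumptions,
not the conversion hook provided by Sherlock: renumber_by_frequency() and
MAX_TYPES are hypothetical, and the frequencies are supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TYPES 16   /* hypothetical limit, not taken from Sherlock */

/*
 * Fill map[old_type] = new_type so that new type 0 is the most frequent
 * one. Types with equal frequency keep their original relative order.
 */
void renumber_by_frequency(const unsigned long *freq, size_t n, unsigned *map)
{
    size_t order[MAX_TYPES];

    assert(n <= MAX_TYPES);
    for (size_t i = 0; i < n; i++)
        order[i] = i;

    /* Stable insertion sort of type numbers by descending frequency. */
    for (size_t i = 1; i < n; i++) {
        size_t cur = order[i];
        size_t j = i;
        while (j > 0 && freq[order[j - 1]] < freq[cur]) {
            order[j] = order[j - 1];
            j--;
        }
        order[j] = cur;
    }

    /* The position in the sorted order becomes the new type number. */
    for (size_t rank = 0; rank < n; rank++)
        map[order[rank]] = (unsigned) rank;
}
```

A conversion of old cards would then replace each old type number t by
map[t]; how the real hook is declared is described in the Sherlock sources,
not here.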

3.3 to 3.4:
o  Customizations must define CUSTOM_INDEX_TYPE and CUSTOM_INDEX_VERSION
   and should change them with each incompatible change in the index format.
   See lib/index.h for comments.
o  Customizations can define their own functions available to the filter
   language. Define CONFIG_CUSTOM_FILTER and see filter/builtin.c for an example.
o  [Centrum] Morphological dictionaries have been moved to the CVS and they
   are automatically installed by `make install', depending on settings
   of CONFIG_UFAL_DICT_{CS,SK} switches.
o  The Sherlock library and the UCW library are now separate. For includes
   specific to Sherlock you need to use #include "sherlock/index.h" et al. Also,
   all modules (except the libraries) should now begin with
   #include "sherlock/sherlock.h" instead of "lib/lib.h".
o  Customizations can also define specific matching rules, see CUSTOM_MATCH_xxx
   in free/lib/custom.h.
o  New interval test operator =# and its case insensitive/unsigned variant =##.
   Switch commands with many such cases are optimized using red-black trees.

3.2.2 to 3.3:
o  Incompatible changes to Shepherd and ReapD protocols.
o  Added `text:<file>' indexer source, allowing indexing of databases without
   the need for creating bucket files.
o  Gatherer database (currently only if Shepherd is used) and parts of the index
   are now compressed.
o  Introduced CONFIG_AREAS mode in which the gatherer (Shepherd only) and the
   indexer split the database into multiple areas according to area IDs assigned by the
   filters. Individual areas act independently, e.g., links from one area to
   another are not followed, documents are not merged across areas and search
   can be restricted to a given area by the AREA keyword.
o  The url-equiv file is now treated as a normal config file and it is a part
   of the source tree (custom/cf/url-equiv or cf/url-equiv if the former doesn't exist).
o  The HTML parser now recognizes <!-- robots:... --> comments.

3.2 to 3.2.2:
o  Added custom statistics and late-matching attributes (see free/lib/custom.h
   for an explanation) in a backward-compatible way.

3.1 to 3.2:
o  Added SORTBY xxx ONLY to make sherlockd force Q to be zero and sort only
   on the specified attribute.

3.0 to 3.1:
o  Merging of databases in the search server has been added. This means that
   you can feed the search server with several indices with a non-empty
   intersection and as long as they contain the card-prints file (generated optionally
   by the indexer), the duplicates are automatically resolved.
o  Site compression on multiple databases now works and it assumes that all
   databases share a common site ID space.
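
The idea of resolving duplicates by fingerprints, as mentioned above for
merged databases, can be sketched like this. It assumes each index yields
its cards sorted by a 64-bit fingerprint; struct card_ref and merge_unique()
are hypothetical and do not reflect the actual card-prints file format or
the search server's merging code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical card reference: source index and content fingerprint. */
struct card_ref {
    unsigned index_id;
    uint64_t print;
};

/*
 * Merge two lists sorted by fingerprint into out[], keeping only the
 * first card seen for each fingerprint (list a wins on conflicts).
 * out must have room for na + nb entries. Returns the number written.
 */
size_t merge_unique(const struct card_ref *a, size_t na,
                    const struct card_ref *b, size_t nb,
                    struct card_ref *out)
{
    size_t i = 0, j = 0, n = 0;

    assert(out != NULL);
    while (i < na || j < nb) {
        struct card_ref next;
        if (j >= nb || (i < na && a[i].print <= b[j].print))
            next = a[i++];
        else
            next = b[j++];
        /* Skip a card whose fingerprint was already emitted. */
        if (n == 0 || out[n - 1].print != next.print)
            out[n++] = next;
    }
    return n;
}
```

Which copy of a duplicate the real search server prefers is not stated
here; letting the first list win is just a choice made for the sketch.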

2.6 to 3.0:

o  CONFIG_BARE introduced and an example in "bare/*" added.
o  Each customization now brings its own local configuration in custom/cf/local,
   automatically included at the end of the global config file, and also its
   own filter rules.
o  Configuration switches are set to very conservative values in the default
   config and they should be overridden in the local config file if needed.
o  The indexer now omits almost all of stage 1 if the CONFIG_BARE mode is switched on.

2.5 to 2.6:

o  The "t" attribute in search server results (per-filetype stats) now shows
   only those filetypes which really matched. Turn on CONFIG_COUNT_ALL_FILETYPES
   for the previous behaviour.
o  The "W" attribute in search server results (word stats) now shows the number
   of matched documents as -1 if the search server didn't count the matches yet
   (for example if the query ended up with a shortcut answer).
o  Introduced error code 119 (simple search expression contains only non-indexed words).
o  Introduced redirect brackets (y ...) inside URL brackets (U ...).
o  Replaced (WORD|META)_TYPES_NO_AUTO_ACCENT by (WORD|META)_TYPES_AUTO_ACCENT_ALWAYS_(STRICT|STRIP)
   in definitions of custom word/meta types in custom.h.
