+<sect2>Field Structure and Character Sets
+<label id="field structure and character sets">
+
+<p>
+To provide a flexible approach to national character set
+handling, Zebra allows the administrator to configure the
+system to handle any 8-bit character set, including sets that
+require multi-octet diacritics or other multi-octet characters. The
+definition of a character set includes a specification of the
+permissible values, their sort order (this affects the display in the
+SCAN function), and relationships between upper- and lowercase
+characters. Finally, the definition includes the specification of
+space characters for the set.
+
+The operator can define different character sets for different fields,
+typical examples being standard text fields, numerical fields, and
+special-purpose fields such as WWW-style linkages (URx).
+
+The field types, and hence character sets, are associated with data
+elements by the .abs files (see above). The file <tt/default.idx/
+provides the association between field type codes (as used in the .abs
+files) and the character map files (with the .chr suffix). The format
+of the .idx file is as follows:
+
+<descrip>
+<tag>index <it/field type code/</tag>This directive introduces a new
+index code. The argument is a one-character code to be used in the
+.abs files to select this particular index type. An index, roughly,
+corresponds to a particular structure attribute during search. Refer
+to section <ref id="search" name="Search">.
+
+<tag>completeness <it/boolean/</tag>This directive enables or disables
+complete field indexing. The value of the <it/boolean/ should be 0
+(disable) or 1 (enable). If completeness is enabled, the index entry will
+contain the complete contents of the field (up to a limit), with words
+(non-space characters) separated by single space characters
+(normalized to &dquot; &dquot; on display). When completeness is
+disabled, each word is indexed as a separate entry. Complete subfield
+indexing is most useful for fields which are typically browsed (e.g.
+titles, authors, or subjects), or instances where a match on a
+complete subfield is essential (e.g. exact title searching). For fields
+where completeness is disabled, the search engine will interpret a
+search containing space characters as a word proximity search.
+
+<tag>charmap <it/filename/</tag>This is the filename of the character
+map to be used for this index or field type.
+</descrip>
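+
+As a hedged illustration of the format, an .idx file defining one
+word index and one complete-field index might look like the sketch
+below. The type codes <tt/w/ and <tt/p/ and the file name
+<tt/string.chr/ are assumptions made for this example, as is the use
+of <tt/#/ to start a comment line.
+
+<tscreen><verb>
+# Illustrative only: codes and file names are assumed, not prescribed.
+index w
+completeness 0
+charmap string.chr
+
+index p
+completeness 1
+charmap string.chr
+</verb></tscreen>
+
+With such a file, a search for two space-separated words on a
+<tt/w/ field would be treated as a word proximity search, whereas the
+same search on a <tt/p/ field would be matched against the complete
+field contents.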
+
+The contents of the character map files are structured as follows:
+
+<descrip>
+<tag>lowercase <it/value-set/</tag>This directive introduces the basic
+value set of the field type. The format is an ordered list (without
+spaces) of the characters which may occur in &dquot;words&dquot; of
+the given type. The order of the entries in the list determines the
+sort order of the index. In addition to single characters, the
+following combinations are legal:
+
+<itemize>
+<item>Backslashes may be used to introduce three-digit octal or
+two-digit hex representations of single characters (the latter
+preceded by <tt/x/). In addition, the combinations
+\\, \\r, \\n, \\t, \\s (space; remember that real space characters
+may not occur in the value definition), and \\ are recognised,
+with their usual interpretation.
+
+<item>Curly braces {} may be used to enclose ranges of single
+characters (possibly using the escape convention described in the
+preceding point), e.g. {a-z} to introduce the standard range of ASCII
+characters. Note that the interpretation of such a range depends on
+the concrete representation in your local, physical character set.
+
+<item>Parentheses () may be used to enclose multi-byte characters,
+e.g. diacritics or special national combinations (e.g. Spanish
+&dquot;ll&dquot;). When found in the input stream (or a search term),
+these characters are viewed and sorted as a single character, with a
+sorting value depending on the position of the group in the value
+statement.
+</itemize>
+
+<tag>uppercase <it/value-set/</tag>This directive introduces the
+uppercase equivalents of the value set (if any). The number and
+order of the entries in the list should be the same as in the
+<tt/lowercase/ directive.
+
+<tag>space <it/value-set/</tag>This directive introduces the characters
+which separate words in the input stream. Depending on the
+completeness mode of the field in question, these characters either
+terminate an index entry, or delimit individual &dquot;words&dquot; in
+the input stream. The order of the elements is not significant;
+otherwise the representation is the same as for the <tt/uppercase/ and
+<tt/lowercase/ directives.
+
+<tag>map <it/value-set/ <it/target/</tag>This directive introduces a
+mapping from each member of the value set on the left to
+the character on the right. The character on the right must occur in
+the value set (the <tt/lowercase/ directive) of the character set, but
+it may be a parenthesis-enclosed multi-octet character. This directive
+may be used to map diacritics to their base characters, or to map
+HTML-style character-representations to their natural form, etc.
+</descrip>
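+
+To make the directives concrete, a small character map for a text
+field might read as in the sketch below. It is built only from the
+rules described above and is not a copy of any distributed file. The
+choice of characters, the assumption that the data is encoded in ISO
+8859-1 (so that octal 350 and 351 are the accented forms of
+&dquot;e&dquot;), and the use of <tt/#/ for a comment line are all
+assumptions of the example.
+
+<tscreen><verb>
+# Illustrative only: assumes ISO 8859-1 input.
+lowercase {0-9}{a-l}(ll){m-z}
+uppercase {0-9}{A-L}(LL){M-Z}
+space \s\t\r\n
+map \350\351 e
+</verb></tscreen>
+
+Here the group (ll) is placed between the ranges {a-l} and {m-z}, so
+it would sort as a single character after &dquot;l&dquot; and before
+&dquot;m&dquot;. The <tt/uppercase/ list mirrors the <tt/lowercase/
+list entry for entry, and the <tt/map/ line folds two accented
+characters onto the base character &dquot;e&dquot;, which occurs in
+the <tt/lowercase/ value set.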
+