-<!-- $Id: tools.xml,v 1.13 2002-09-03 10:46:06 adam Exp $ -->
+<!-- $Id: tools.xml,v 1.23 2003-05-22 16:57:28 mike Exp $ -->
<chapter id="tools"><title>Supporting Tools</title>
<para>
that may be of use to you.
</para>
- <sect2><title id="PQF">Prefix Query Format</title>
+ <sect2 id="PQF"><title>Prefix Query Format</title>
<para>
Since RPN or reverse polish notation is really just a fancy way of
</para>
<para>
- Z39.50 version 3 defines various encoding of terms.
- Use the @term operator to indicate the encoding type:
- <literal>general</literal>, <literal>numeric</literal>,
- <literal>string</literal> (for InternationalString), ..
+ Version 3 of the Z39.50 specification defines various encodings of terms.
+ Use <literal>@term </literal> <replaceable>type</replaceable>
+ <replaceable>string</replaceable>,
+ where type is one of: <literal>general</literal>,
+ <literal>numeric</literal> or <literal>string</literal>
+ (for InternationalString).
If no term type has been given, the <literal>general</literal> form
- is used which is the only encoding allowed in both version 2 - and 3
+ is used. This is the only encoding allowed in both versions 2 and 3
of the Z39.50 standard.
</para>
- <para>
- The following are all examples of valid queries in the PQF.
- </para>
-
- <screen>
- dylan
-
- "bob dylan"
-
- @or "dylan" "zimmerman"
-
- @set Result-1
-
- @or @and bob dylan @set Result-1
-
- @attr 1=4 computer
-
- @attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming"
-
- @attr 4=1 @attr 1=4 "self portrait"
-
- @prox 0 3 1 2 k 2 dylan zimmerman
-
- @and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109
-
- @term string "a UTF-8 string, maybe?"
+ <sect3 id="PQF-prox">
+ <title>Using Proximity Operators with PQF</title>
+ <note>
+ <para>
+ This is an advanced topic, describing how to construct
+ queries that make very specific requirements on the
+ relative location of their operands.
+ You may wish to skip this section and go straight to
+ <link linkend="pqf-examples">the example PQF queries</link>.
+ </para>
+ <para>
+ <warning>
+ <para>
+ Most Z39.50 servers do not support proximity searching, or
+ support only a small subset of the full functionality that
+ can be expressed using the PQF proximity operator. Be
+ aware that the ability to <emphasis>express</emphasis> a
+ query in PQF is no guarantee that any given server will
+ be able to <emphasis>execute</emphasis> it.
+ </para>
+ </warning>
+ </para>
+ </note>
+ <para>
+ The proximity operator <literal>@prox</literal> is a special
+ and more restrictive version of the conjunction operator
+ <literal>@and</literal>. Its semantics are described in
+ section 3.7.2 (Proximity) of the Z39.50 standard itself, which
+ can be read on-line at
+ <ulink url="http://lcweb.loc.gov/z3950/agency/markup/09.html"/>
+ </para>
+ <para>
+ In PQF, the proximity operation is represented by a sequence
+ of the form
+ <screen>
+@prox <replaceable>exclusion</replaceable> <replaceable>distance</replaceable> <replaceable>ordered</replaceable> <replaceable>relation</replaceable> <replaceable>which-code</replaceable> <replaceable>unit-code</replaceable>
+ </screen>
+ in which the meanings of the parameters are as described in
+ the standard, and they can take the following values:
+ <itemizedlist>
+ <listitem><formalpara><title>exclusion</title><para>
+ 0 = false (i.e. the proximity condition specified by the
+ remaining parameters must be satisfied) or
+ 1 = true (the proximity condition specified by the
+ remaining parameters must <emphasis>not</emphasis> be
+ satisfied).
+ </para></formalpara></listitem>
+ <listitem><formalpara><title>distance</title><para>
+ An integer specifying the difference between the locations
+ of the operands: e.g. two adjacent words would have
+ distance=1 since their locations differ by one unit.
+ </para></formalpara></listitem>
+ <listitem><formalpara><title>ordered</title><para>
+ 1 = ordered (the operands must occur in the order the
+ query specifies them) or
+ 0 = unordered (they may appear in either order).
+ </para></formalpara></listitem>
+ <listitem><formalpara><title>relation</title><para>
+ Recognised values are
+ 1 (lessThan),
+ 2 (lessThanOrEqual),
+ 3 (equal),
+ 4 (greaterThanOrEqual),
+ 5 (greaterThan) and
+ 6 (notEqual).
+ </para></formalpara></listitem>
+ <listitem><formalpara><title>which-code</title><para>
+ <literal>known</literal>
+ or
+ <literal>k</literal>
+ (the unit-code parameter is taken from the well-known list
+ of alternatives described below) or
+ <literal>private</literal>
+ or
+ <literal>p</literal>
+ (the unit-code parameter has semantics specific to an
+ out-of-band agreement such as a profile).
+ </para></formalpara></listitem>
+ <listitem><formalpara><title>unit-code</title><para>
+ If the which-code parameter is <literal>known</literal>
+ then the recognised values are
+ 1 (character),
+ 2 (word),
+ 3 (sentence),
+ 4 (paragraph),
+ 5 (section),
+ 6 (chapter),
+ 7 (document),
+ 8 (element),
+ 9 (subelement),
+ 10 (elementType) and
+ 11 (byte).
+ If which-code is <literal>private</literal> then the
+ acceptable values are determined by the profile.
+ </para></formalpara></listitem>
+ </itemizedlist>
+ (The numeric values of the relation and well-known unit-code
+ parameters are taken straight from
+ <ulink url="http://lcweb.loc.gov/z3950/agency/asn1.html#ProximityOperator"
+ >the ASN.1</ulink> of the proximity structure in the standard.)
+ </para>
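+ <para>
+ As an illustration of how these parameters combine, the following
+ sequences are sketches built only from the values listed above.
+ The search terms are arbitrary, and nothing here implies that a
+ particular server will accept them.
+ <screen>
+ @prox 0 1 1 3 k 2 bob dylan
+ @prox 1 5 0 2 k 2 dylan zimmerman
+ @prox 0 2 0 2 known 4 dylan zimmerman
+ </screen>
+ Read against the parameter list, the first asks for
+ <literal>bob</literal> followed by <literal>dylan</literal> at a
+ distance of exactly one word; the second sets exclusion, so it asks
+ for the two terms <emphasis>not</emphasis> to occur within five
+ words of each other in either order; the third asks for the terms
+ in either order, at most two paragraphs apart.
+ </para>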
+ </sect3>
- @attr 1=/book/title computer
- </screen>
+ <sect3 id="pqf-examples"><title>PQF queries</title>
+ <para>Queries using simple terms.
+ <screen>
+ dylan
+ "bob dylan"
+ </screen>
+ </para>
+ <para>Boolean operators.
+ <screen>
+ @or "dylan" "zimmerman"
+ @and @or dylan zimmerman when
+ @and when @or dylan zimmerman
+ </screen>
+ </para>
+ <para>
+ Reference to result sets.
+ <screen>
+ @set Result-1
+ @and @set seta setb
+ </screen>
+ </para>
+ <para>
+ Attributes for terms.
+ <screen>
+ @attr 1=4 computer
+ @attr 1=4 @attr 4=1 "self portrait"
+ @attr exp1 @attr 1=1 CategoryList
+ @attr gils 1=2008 Copenhagen
+ @attr 1=/book/title computer
+ </screen>
+ </para>
+ <para>
+ Proximity.
+ <screen>
+ @prox 0 3 1 2 k 2 dylan zimmerman
+ </screen>
+ <note><para>
+ Here the parameters 0, 3, 1, 2, k and 2 represent exclusion,
+ distance, ordered, relation, which-code and unit-code, in that
+ order. So:
+ <itemizedlist>
+ <listitem><para>
+ exclusion = 0: the proximity condition must hold
+ </para></listitem>
+ <listitem><para>
+ distance = 3: the terms must be three units apart
+ </para></listitem>
+ <listitem><para>
+ ordered = 1: they must occur in the order they are specified
+ </para></listitem>
+ <listitem><para>
+ relation = 2: lessThanOrEqual (to the distance of 3 units)
+ </para></listitem>
+ <listitem><para>
+ which-code is ``known'', so the standard unit-codes are used
+ </para></listitem>
+ <listitem><para>
+ unit-code = 2: word.
+ </para></listitem>
+ </itemizedlist>
+ So the whole proximity query means that the words
+ <literal>dylan</literal> and <literal>zimmerman</literal> must
+ both occur in the record, in that order, differing in position
+ by three or fewer words (i.e. with two or fewer words between
+ them). The query would find ``Bob Dylan, aka. Robert
+ Zimmerman'', but not ``Bob Dylan, born as Robert Zimmerman''
+ since the distance in this case is four.
+ </para></note>
+ </para>
+ <para>
+ Specifying term type.
+ <screen>
+ @term string "a UTF-8 string, maybe?"
+ </screen>
+ </para>
+ <para>Mixed queries.
+ <screen>
+ @or @and bob dylan @set Result-1
+
+ @attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming"
+
+ @and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109
+ </screen>
+ <note>
+ <para>
+ The last of these examples is a spatial search: in
+ <ulink url="http://www.gils.net/prof_v2.html#sec_7_4"
+ >the GILS attribute set</ulink>,
+ access point
+ 2038 indicates West Bounding Coordinate and
+ 2039 indicates East Bounding Coordinate,
+ so the query is for areas extending from -114 degrees
+ to no more than -109 degrees.
+ </para>
+ </note>
+ </para>
+ </sect3>
</sect2>
- <sect2><title id="CCL">Common Command Language</title>
+ <sect2 id="CCL"><title>Common Command Language</title>
<para>
Not all users enjoy typing in prefix query structures and numerical
-- Proximity operator
</screen>
-
- <para>
- The following queries are all valid:
- </para>
-
- <screen>
- dylan
-
- "bob dylan"
-
- dylan or zimmerman
-
- set=1
-
- (dylan and bob) or set=1
-
- </screen>
- <para>
- Assuming that the qualifiers <literal>ti</literal>, <literal>au</literal>
- and <literal>date</literal> are defined we may use:
- </para>
-
- <screen>
- ti=self portrait
-
- au=(bob dylan and slow train coming)
-
- date>1980 and (ti=((self portrait)))
-
- </screen>
-
+
+ <example><title>CCL queries</title>
+ <para>
+ The following queries are all valid:
+ </para>
+
+ <screen>
+ dylan
+
+ "bob dylan"
+
+ dylan or zimmerman
+
+ set=1
+
+ (dylan and bob) or set=1
+
+ </screen>
+ <para>
+ Assuming that the qualifiers <literal>ti</literal>,
+ <literal>au</literal>
+ and <literal>date</literal> are defined we may use:
+ </para>
+
+ <screen>
+ ti=self portrait
+
+ au=(bob dylan and slow train coming)
+
+ date>1980 and (ti=((self portrait)))
+
+ </screen>
+ </example>
+
</sect3>
<sect3><title>CCL Qualifiers</title>
-
+
<para>
Qualifiers are used to direct the search to a particular searchable
index, such as title (ti) and author indexes (au). The CCL standard
</para>
<para>
- Consider a scenario where the target support ranked searches in the
- title-index. In this case, the user could specify
- </para>
-
- <screen>
- ti,ranked=knuth computer
- </screen>
- <para>
- and the <literal>ranked</literal> would map to relation=relevance
- (2=102) and the <literal>ti</literal> would map to title (1=4).
- </para>
-
- <para>
- A "profile" with a set predefined CCL qualifiers can be read from a
- file. The YAZ client reads its CCL qualifiers from a file named
+ A CCL profile is a set of predefined CCL qualifiers that may be
+ read from a file.
+ The YAZ client reads its CCL qualifiers from a file named
<filename>default.bib</filename>. Each line in the file has the form:
</para>
<para>
<replaceable>qualifier-name</replaceable>
- <replaceable>type</replaceable>=<replaceable>val</replaceable>
- <replaceable>type</replaceable>=<replaceable>val</replaceable> ...
+ [<replaceable>attributeset</replaceable><literal>,</literal>]<replaceable>type</replaceable><literal>=</literal><replaceable>val</replaceable>
+ [<replaceable>attributeset</replaceable><literal>,</literal>]<replaceable>type</replaceable><literal>=</literal><replaceable>val</replaceable> ...
</para>
<para>
where <replaceable>qualifier-name</replaceable> is the name of the
qualifier to be used (eg. <literal>ti</literal>),
- <replaceable>type</replaceable> is a BIB-1 category type and
- <replaceable>val</replaceable> is the corresponding BIB-1 attribute
- value.
- The <replaceable>type</replaceable> can be either numeric or it may be
- either <literal>u</literal> (use), <literal>r</literal> (relation),
- <literal>p</literal> (position), <literal>s</literal> (structure),
- <literal>t</literal> (truncation) or <literal>c</literal> (completeness).
- The <replaceable>qualifier-name</replaceable> <literal>term</literal>
- has a special meaning.
- The types and values for this definition is used when
- <emphasis>no</emphasis> qualifiers are present.
- </para>
-
- <para>
- Consider the following definition:
- </para>
-
- <screen>
- ti u=4 s=1
- au u=1 s=1
- term s=105
- </screen>
- <para>
- Two qualifiers are defined, <literal>ti</literal> and
- <literal>au</literal>.
- They both set the structure-attribute to phrase (1).
- <literal>ti</literal>
- sets the use-attribute to 4. <literal>au</literal> sets the
- use-attribute to 1.
- When no qualifiers are used in the query the structure-attribute is
- set to free-form-text (105).
+ <replaceable>type</replaceable> is the attribute type in the attribute
+ set (Bib-1 is used if no attribute set is given) and
+ <replaceable>val</replaceable> is the attribute value.
+ The <replaceable>type</replaceable> can be specified either as an
+ integer or as a single letter:
+ <literal>u</literal> for use,
+ <literal>r</literal> for relation, <literal>p</literal> for position,
+ <literal>s</literal> for structure, <literal>t</literal> for truncation
+ or <literal>c</literal> for completeness.
+ The attributes for the special qualifier name <literal>term</literal>
+ are used when no CCL qualifier is given in a query.
</para>
+ <example><title>CCL profile</title>
+ <para>
+ Consider the following definition:
+ </para>
+
+ <screen>
+ ti u=4 s=1
+ au u=1 s=1
+ term s=105
+ ranked r=102
+ </screen>
+ <para>
+ Three qualifiers are defined, <literal>ti</literal>,
+ <literal>au</literal> and <literal>ranked</literal>.
+ <literal>ti</literal> and <literal>au</literal> both set the
+ structure attribute to phrase (s=1).
+ <literal>ti</literal>
+ sets the use-attribute to 4. <literal>au</literal> sets the
+ use-attribute to 1.
+ When no qualifiers are used in the query the structure-attribute is
+ set to free-form-text (105).
+ </para>
+ <para>
+ You can combine attributes. To search for "ranked title" you
+ can do
+ <screen>
+ ti,ranked=knuth computer
+ </screen>
+ which will use "relation is ranked", "use is title", "structure is
+ phrase".
+ </para>
+ </example>
+
</sect3>
<sect3><title>CCL API</title>
<para>
</para>
</sect3>
</sect2>
+ <sect2 id="tools.cql"><title>CQL</title>
+ <para>
+ <ulink url="http://www.loc.gov/z3950/agency/zing/cql/">CQL</ulink>
+ (Common Query Language) was defined for the
+ <ulink url="http://www.loc.gov/z3950/agency/zing/srw/">SRW</ulink>
+ protocol.
+ In many ways CQL has a similar syntax to CCL, but its objective
+ is different: where CCL aims to be
+ an end-user language, CQL is <emphasis>the</emphasis> protocol
+ query language for SRW.
+ </para>
+ <tip>
+ <para>
+ If you are new to CQL, read the
+ <ulink url="http://zing.z3950.org/cql/intro.html">Gentle
+ Introduction</ulink>.
+ </para>
+ </tip>
+ <para>
+ The CQL parser in &yaz; provides the following:
+ <itemizedlist>
+ <listitem>
+ <para>
+ It parses and validates a CQL query.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ It generates a C structure that allows you to convert
+ a CQL query to some other query language, such as SQL.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The parser converts a valid CQL query to PQF, thus providing a
+ way to use CQL for both SRW/SRU servers and Z39.50 targets at the
+ same time.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The parser converts CQL to
+ <ulink url="http://www.loc.gov/z3950/agency/zing/cql/xcql.html">
+ XCQL</ulink>.
+ XCQL is an XML representation of CQL.
+ XCQL is part of the SRW specification. However, since SRU
+ supports CQL only, we don't expect XCQL to be widely used.
+ Furthermore, CQL has the advantage over XCQL that it is
+ easy to read.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ <sect3 id="tools.cql.parsing"><title>CQL parsing</title>
+ <para>
+ A CQL parser is represented by the <literal>CQL_parser</literal>
+ handle. Its contents should be considered &yaz; internal (private).
+ <synopsis>
+#include <yaz/cql.h>
+
+typedef struct cql_parser *CQL_parser;
+
+CQL_parser cql_parser_create(void);
+void cql_parser_destroy(CQL_parser cp);
+ </synopsis>
+ A parser is created by <function>cql_parser_create</function> and
+ is destroyed by <function>cql_parser_destroy</function>.
+ </para>
+ <para>
+ To parse a CQL query string, the following function
+ is provided:
+ <synopsis>
+int cql_parser_string(CQL_parser cp, const char *str);
+ </synopsis>
+ A CQL query is parsed by <function>cql_parser_string</function>,
+ which takes a query string <parameter>str</parameter>.
+ If the query was valid (no syntax errors), then zero is returned;
+ otherwise a non-zero error code is returned.
+ </para>
+ <para>
+ <synopsis>
+int cql_parser_stream(CQL_parser cp,
+ int (*getbyte)(void *client_data),
+ void (*ungetbyte)(int b, void *client_data),
+ void *client_data);
+
+int cql_parser_stdio(CQL_parser cp, FILE *f);
+ </synopsis>
+ The functions <function>cql_parser_stream</function> and
+ <function>cql_parser_stdio</function> parse a CQL query
+ just like <function>cql_parser_string</function>.
+ The only difference is the way in which the CQL query is
+ fed to the parser.
+ <function>cql_parser_stream</function> reads from a generic
+ byte stream. <function>cql_parser_stdio</function>
+ reads from a <literal>FILE</literal> handle that is open for reading.
+ </para>
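+ <para>
+ The two callbacks expected by <function>cql_parser_stream</function>
+ can be sketched for an in-memory string as follows. This is only an
+ illustration: <literal>struct mem_source</literal>,
+ <function>mem_getbyte</function> and <function>mem_ungetbyte</function>
+ are hypothetical names that are not part of &yaz;, and returning 0
+ at the end of the input is an assumption, since the end-of-input
+ convention is not stated in this section.
+ </para>
+ <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory byte source; not part of YAZ. */
struct mem_source {
    const char *buf;  /* NUL-terminated query text */
    size_t pos;       /* index of the next byte to hand out */
};

/* Has the shape of the getbyte parameter: returns the next byte,
   or 0 once the end of the text is reached (assumed convention). */
int mem_getbyte(void *client_data)
{
    struct mem_source *src = client_data;
    unsigned char c;

    assert(src != NULL && src->buf != NULL);
    c = (unsigned char) src->buf[src->pos];
    if (c == 0)
        return 0;
    src->pos++;
    return c;
}

/* Has the shape of the ungetbyte parameter: steps back over the byte
   most recently read. The end marker (0) is never pushed back, and
   nothing happens at the start of the text. */
void mem_ungetbyte(int b, void *client_data)
{
    struct mem_source *src = client_data;

    assert(src != NULL);
    if (b != 0 && src->pos > 0)
        src->pos--;
}
```
+]]></programlisting>
+ <para>
+ Given a parser handle <literal>cp</literal> and a
+ <literal>struct mem_source src</literal>, the pair would be passed as
+ <literal>cql_parser_stream(cp, mem_getbyte, mem_ungetbyte, &amp;src)</literal>,
+ following the prototype shown above.
+ </para>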
+ </sect3>
+
+ <sect3 id="tools.cql.tree"><title>CQL tree</title>
+ <para>
+ If the query string is valid, the CQL parser
+ generates a tree representing the structure of the
+ CQL query.
+ </para>
+ <para>
+ <synopsis>
+struct cql_node *cql_parser_result(CQL_parser cp);
+ </synopsis>
+ <function>cql_parser_result</function> returns
+ a pointer to the root node of the resulting tree.
+ </para>
+ <para>
+ Each node in a CQL tree is represented by a
+ <literal>struct cql_node</literal>.
+ It is defined as follows:
+ <synopsis>
+#define CQL_NODE_ST 1
+#define CQL_NODE_BOOL 2
+#define CQL_NODE_MOD 3
+struct cql_node {
+ int which;
+ union {
+ struct {
+ char *index;
+ char *term;
+ char *relation;
+ struct cql_node *modifiers;
+ struct cql_node *prefixes;
+ } st;
+ struct {
+ char *value;
+ struct cql_node *left;
+ struct cql_node *right;
+ struct cql_node *modifiers;
+ struct cql_node *prefixes;
+ } boolean;
+ struct {
+ char *name;
+ char *value;
+ struct cql_node *next;
+ } mod;
+ } u;
+};
+ </synopsis>
+ There are three kinds of nodes, search term (ST), boolean (BOOL),
+ and modifier (MOD).
+ </para>
+ <para>
+ The search term node has five members:
+ <itemizedlist>
+ <listitem>
+ <para>
+ <literal>index</literal>: index for search term.
+ If an index is unspecified for a search term,
+ <literal>index</literal> will be NULL.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>term</literal>: the search term itself.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>relation</literal>: relation for search term.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>modifiers</literal>: relation modifiers for search
+ term. The <literal>modifiers</literal> is a simple linked
+ list (NULL for last entry). Each relation modifier node
+ is of type <literal>MOD</literal>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>prefixes</literal>: index prefixes for search
+ term. The <literal>prefixes</literal> is a simple linked
+ list (NULL for last entry). Each prefix node
+ is of type <literal>MOD</literal>.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ The boolean node represents <literal>and</literal>,
+ <literal>or</literal> and <literal>not</literal>, as well as
+ proximity.
+ <itemizedlist>
+ <listitem>
+ <para>
+ <literal>left</literal> and <literal>right</literal>: the left
+ and right operands respectively.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>modifiers</literal>: proximity arguments.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>prefixes</literal>: index prefixes.
+ The <literal>prefixes</literal> is a simple linked
+ list (NULL for last entry). Each prefix node
+ is of type <literal>MOD</literal>.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ The modifier node is a "utility" node used for name-value pairs,
+ such as prefixes, proximity arguments, etc.
+ <itemizedlist>
+ <listitem>
+ <para>
+ <literal>name</literal>: name of mod node.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>value</literal>: value of mod node.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>next</literal>: pointer to next node which is
+ always a mod node (NULL for last entry).
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
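+ <para>
+ Walking a tree of this shape can be sketched as below. The structure
+ is repeated locally only so that the fragment compiles without &yaz;
+ (real code would include <filename>yaz/cql.h</filename> instead), and
+ <function>mod_list_length</function> and
+ <function>count_search_terms</function> are hypothetical helpers,
+ not library functions.
+ </para>
+ <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the node layout shown above (assumption: kept here
   only to make the sketch self-contained). */
#define CQL_NODE_ST 1
#define CQL_NODE_BOOL 2
#define CQL_NODE_MOD 3
struct cql_node {
    int which;
    union {
        struct {
            char *index;
            char *term;
            char *relation;
            struct cql_node *modifiers;
            struct cql_node *prefixes;
        } st;
        struct {
            char *value;
            struct cql_node *left;
            struct cql_node *right;
            struct cql_node *modifiers;
            struct cql_node *prefixes;
        } boolean;
        struct {
            char *name;
            char *value;
            struct cql_node *next;
        } mod;
    } u;
};

/* Hypothetical helper: number of entries in a modifiers or prefixes
   list, which is linked through u.mod.next and ends with NULL. */
int mod_list_length(const struct cql_node *mod)
{
    int n = 0;

    for (; mod != NULL; mod = mod->u.mod.next) {
        assert(mod->which == CQL_NODE_MOD);
        n++;
    }
    return n;
}

/* Hypothetical helper: number of search-term nodes at or below node.
   Boolean nodes recurse into both operands; a NULL node counts as 0. */
int count_search_terms(const struct cql_node *node)
{
    if (node == NULL)
        return 0;
    switch (node->which) {
    case CQL_NODE_ST:
        return 1;
    case CQL_NODE_BOOL:
        return count_search_terms(node->u.boolean.left)
             + count_search_terms(node->u.boolean.right);
    default:
        return 0;
    }
}
```
+]]></programlisting>
+ <para>
+ The same recursion pattern (switch on <literal>which</literal>,
+ descend into <literal>left</literal> and <literal>right</literal>)
+ is what a converter to another query language would typically build on.
+ </para>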
+
+ </sect3>
+ <sect3 id="tools.cql.pqf"><title>CQL to PQF conversion</title>
+ <para>
+ Conversion to PQF (and Z39.50 RPN) is complicated by the fact
+ that the resulting RPN depends on the Z39.50 target
+ capabilities (combinations of supported attributes).
+ In addition, CQL and SRW operate on index prefixes
+ (URIs or strings), whereas RPN uses Object Identifiers
+ for attribute sets.
+ </para>
+ <para>
+ The CQL library of &yaz; defines a <literal>cql_transform_t</literal>
+ type. It represents a particular mapping between CQL and RPN.
+ This handle is created and destroyed by the functions:
+ <synopsis>
+cql_transform_t cql_transform_open_FILE (FILE *f);
+cql_transform_t cql_transform_open_fname(const char *fname);
+void cql_transform_close(cql_transform_t ct);
+ </synopsis>
+ The first two functions create a transformation handle from
+ either an already open FILE or from a filename respectively.
+ </para>
+ <para>
+ The handle is destroyed by <function>cql_transform_close</function>,
+ after which it must not be referenced again.
+ </para>
+ <para>
+ When a <literal>cql_transform_t</literal> handle has been created
+ you can convert to RPN.
+ <synopsis>
+int cql_transform_buf(cql_transform_t ct,
+ struct cql_node *cn, char *out, int max);
+ </synopsis>
+ This function converts the CQL tree <literal>cn</literal>
+ using handle <literal>ct</literal>.
+ For the resulting PQF, you supply a buffer <literal>out</literal>
+ which must be able to hold at least <literal>max</literal>
+ characters.
+ </para>
+ <para>
+ If conversion failed, <function>cql_transform_buf</function>
+ returns a non-zero SRW error code; otherwise zero is returned
+ (conversion successful). The meanings of the numeric error
+ codes are listed in the SRW specifications at
+ <ulink url="http://www.loc.gov/srw/diagnostic-list.html"/>
+ </para>
+ <para>
+ If conversion fails, more information can be obtained by calling
+ <synopsis>
+int cql_transform_error(cql_transform_t ct, char **addinfop);
+ </synopsis>
+ This function returns the most recently returned numeric
+ error-code and sets the string-pointer at
+ <literal>*addinfop</literal> to point to a string containing
+ additional information about the error that occurred: for
+ example, if the error code is 15 (``Illegal or unsupported index
+ set''), the additional information is the name of the requested
+ index set that was not recognised.
+ </para>
+ <para>
+ If you wish to be able to produce a PQF result in a different
+ way, there are two alternatives.
+ <synopsis>
+void cql_transform_pr(cql_transform_t ct,
+ struct cql_node *cn,
+ void (*pr)(const char *buf, void *client_data),
+ void *client_data);
+
+int cql_transform_FILE(cql_transform_t ct,
+ struct cql_node *cn, FILE *f);
+ </synopsis>
+ The former function produces output to a user-defined
+ output stream. The latter writes the result to an already
+ open <literal>FILE</literal>.
+ </para>
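+ <para>
+ A callback with the shape of the <literal>pr</literal> parameter
+ can be sketched as follows; it collects the pieces it is given into
+ a fixed-size buffer. <literal>struct out_buf</literal> and
+ <function>out_buf_pr</function> are hypothetical names, and the
+ sketch assumes that each <literal>buf</literal> is a NUL-terminated
+ string and that the output may arrive in several pieces.
+ </para>
+ <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical accumulator; not part of YAZ. */
struct out_buf {
    char *data;      /* caller-supplied storage */
    size_t size;     /* total size of data, including the final NUL */
    size_t used;     /* bytes written so far, excluding the final NUL */
    int truncated;   /* set to 1 if any output did not fit */
};

/* Appends buf to the accumulator, keeping it NUL-terminated.
   Output that does not fit is dropped and flagged as truncated. */
void out_buf_pr(const char *buf, void *client_data)
{
    struct out_buf *ob = client_data;
    size_t len = strlen(buf);
    size_t room;

    assert(ob != NULL && ob->size > 0 && ob->used < ob->size);
    room = ob->size - 1 - ob->used;
    if (len > room) {
        len = room;
        ob->truncated = 1;
    }
    memcpy(ob->data + ob->used, buf, len);
    ob->used += len;
    ob->data[ob->used] = '\0';
}
```
+]]></programlisting>
+ <para>
+ Following the prototype above, such a callback would be passed as
+ <literal>cql_transform_pr(ct, cn, out_buf_pr, &amp;ob)</literal>,
+ with <literal>ob</literal> initialised to point at the caller's storage.
+ </para>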
+ </sect3>
+ <sect3 id="tools.cql.map">
+ <title>Specification of CQL to RPN mapping</title>
+ <para>
+ The file supplied to the functions
+ <function>cql_transform_open_FILE</function> and
+ <function>cql_transform_open_fname</function> follows
+ a structure found in many Unix utilities.
+ It consists of mapping specifications - one per line.
+ Lines starting with <literal>#</literal> are ignored (comments).
+ </para>
+ <para>
+ Each line is of the form
+ <literallayout>
+ <replaceable>CQL pattern</replaceable><literal> = </literal> <replaceable> RPN equivalent</replaceable>
+ </literallayout>
+ </para>
+ <para>
+ An RPN pattern is a simple attribute list. Each attribute pair
+ takes the form:
+ <literallayout>
+ [<replaceable>set</replaceable>] <replaceable>type</replaceable><literal>=</literal><replaceable>value</replaceable>
+ </literallayout>
+ The attribute <replaceable>set</replaceable> is optional.
+ The <replaceable>type</replaceable> is the attribute type,
+ <replaceable>value</replaceable> the attribute value.
+ </para>
+ <para>
+ The following CQL patterns are recognized:
+ <variablelist>
+ <varlistentry><term>
+ <literal>qualifier.</literal><replaceable>set</replaceable><literal>.</literal><replaceable>name</replaceable>
+ </term>
+ <listitem>
+ <para>
+ This pattern is invoked when a CQL qualifier, such as
+ dc.title is converted. <replaceable>set</replaceable>
+ and <replaceable>name</replaceable> are the index set and qualifier
+ name respectively.
+ Typically, the RPN specifies an equivalent use attribute.
+ </para>
+ <para>
+ For terms not bound by a qualifier the pattern
+ <literal>qualifier.srw.serverChoice</literal> is used.
+ Here, the prefix <literal>srw</literal> is defined as
+ <literal>http://www.loc.gov/zing/cql/srw-indexes/v1.0/</literal>.
+ If this pattern is not defined, the mapping will fail.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry><term>
+ <literal>relation.</literal><replaceable>relation</replaceable>
+ </term>
+ <listitem>
+ <para>
+ This pattern specifies how a CQL relation is mapped to RPN.
+ <replaceable>relation</replaceable> is the name of the relation
+ operator. Since <literal>=</literal> is used as the
+ separator between the CQL pattern and the RPN equivalent, CQL relations
+ that include <literal>=</literal> cannot be
+ used directly. To avoid a conflict, the names
+ <literal>ge</literal>,
+ <literal>eq</literal> and
+ <literal>le</literal>
+ must be used for the CQL operators greater-than-or-equal,
+ equal and less-than-or-equal respectively.
+ The RPN pattern is expected to include a relation attribute.
+ </para>
+ <para>
+ For terms not bound by a relation, the pattern
+ <literal>relation.scr</literal> is used. If the pattern
+ is not defined, the mapping will fail.
+ </para>
+ <para>
+ The special pattern, <literal>relation.*</literal> is used
+ when no other relation pattern is matched.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>
+ <literal>relationModifier.</literal><replaceable>mod</replaceable>
+ </term>
+ <listitem>
+ <para>
+ This pattern specifies how a CQL relation modifier is mapped to RPN.
+ The RPN pattern is usually a relation attribute.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>
+ <literal>structure.</literal><replaceable>type</replaceable>
+ </term>
+ <listitem>
+ <para>
+ This pattern specifies how a CQL structure is mapped to RPN.
+ Note that this CQL pattern is somewhat similar to
+ CQL pattern <literal>relation</literal>.
+ The <replaceable>type</replaceable> is a CQL relation.
+ </para>
+ <para>
+ The pattern, <literal>structure.*</literal> is used
+ when no other structure pattern is matched.
+ Usually, the RPN equivalent specifies a structure attribute.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>
+ <literal>position.</literal><replaceable>type</replaceable>
+ </term>
+ <listitem>
+ <para>
+ This pattern specifies how the anchor (position) of
+ CQL is mapped to RPN.
+ The <replaceable>type</replaceable> is one
+ of <literal>first</literal>, <literal>any</literal>,
+ <literal>last</literal>, <literal>firstAndLast</literal>.
+ </para>
+ <para>
+ The pattern, <literal>position.*</literal> is used
+ when no other position pattern is matched.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>
+ <literal>set.</literal><replaceable>prefix</replaceable>
+ </term>
+ <listitem>
+ <para>
+ This specification defines a CQL index set for a given prefix.
+ The value on the right hand side is the URI for the set -
+ <emphasis>not</emphasis> RPN. All prefixes used in
+ qualifier patterns must be defined this way.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ <example><title>CQL to RPN mapping file</title>
+ <para>
+ This simple file defines two index sets, three qualifiers and three
+ relations, a position pattern and a default structure.
+ </para>
+ <programlisting><![CDATA[
+ set.srw = http://www.loc.gov/zing/cql/srw-indexes/v1.0/
+ set.dc = http://www.loc.gov/zing/cql/dc-indexes/v1.0/
+
+ qualifier.srw.serverChoice = 1=1016
+ qualifier.dc.title = 1=4
+ qualifier.dc.subject = 1=21
+
+ relation.< = 2=1
+ relation.eq = 2=3
+ relation.scr = 2=3
+
+ position.any = 3=3 6=1
+
+ structure.* = 4=1
+]]>
+ </programlisting>
+ <para>
+ With the mappings above, the CQL query
+ <screen>
+ computer
+ </screen>
+ is converted to the PQF:
+ <screen>
+ @attr 1=1016 @attr 2=3 @attr 4=1 @attr 3=3 @attr 6=1 "computer"
+ </screen>
+ by rules <literal>qualifier.srw.serverChoice</literal>,
+ <literal>relation.scr</literal>, <literal>structure.*</literal>,
+ <literal>position.any</literal>.
+ </para>
+ <para>
+ CQL query
+ <screen>
+ computer^
+ </screen>
+ is rejected, since <literal>position.right</literal> is
+ undefined.
+ </para>
+ <para>
+ CQL query
+ <screen>
+ >my = "http://www.loc.gov/zing/cql/dc-indexes/v1.0/" my.title = x
+ </screen>
+ is converted to
+ <screen>
+ @attr 1=4 @attr 2=3 @attr 4=1 @attr 3=3 @attr 6=1 "x"
+ </screen>
+ </para>
+ </example>
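+ <para>
+ As a sketch of how such a file might be extended, the lines below
+ add further relation and position patterns. The right-hand sides
+ assume the Bib-1 attribute set, in which relation attribute (type 2)
+ values 2, 4 and 5 mean less-than-or-equal, greater-than-or-equal and
+ greater-than, and position attribute (type 3) value 1 means first in
+ field; whether a given target supports these combinations is a
+ separate question.
+ <screen>
+ relation.le = 2=2
+ relation.ge = 2=4
+ relation.> = 2=5
+
+ position.first = 3=1 6=1
+ </screen>
+ </para>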
+ </sect3>
+ <sect3 id="tools.cql.xcql"><title>CQL to XCQL conversion</title>
+ <para>
+ Conversion from CQL to XCQL is trivial and does not
+ require a mapping to be defined.
+ There are three functions to choose from depending on the
+ way you wish to store the resulting output (XML buffer
+ containing XCQL).
+ <synopsis>
+int cql_to_xml_buf(struct cql_node *cn, char *out, int max);
+void cql_to_xml(struct cql_node *cn,
+ void (*pr)(const char *buf, void *client_data),
+ void *client_data);
+void cql_to_xml_stdio(struct cql_node *cn, FILE *f);
+ </synopsis>
+ The function <function>cql_to_xml_buf</function> converts
+ to XCQL and stores the result in a user-supplied buffer of a given
+ maximum size.
+ </para>
+ <para>
+ <function>cql_to_xml</function> writes the result to
+ a user-defined output stream.
+ <function>cql_to_xml_stdio</function> writes to
+ a file.
+ </para>
+ </sect3>
+ </sect2>
</sect1>
<sect1 id="tools.oid"><title>Object Identifiers</title>