<!ENTITY % common SYSTEM "common/common.ent">
%common;
]>
-<!-- $Id: book.xml,v 1.4 2007-01-13 05:48:41 quinn Exp $ -->
+<!-- $Id: book.xml,v 1.5 2007-01-19 18:28:08 quinn Exp $ -->
<book id="book">
- <bookinfo>
- <title>Pazpar2 - User's Guide and Reference</title>
- <author>
- <firstname>Sebastian</firstname><surname>Hammer</surname>
- </author>
- <copyright>
- <year>&copyright-year;</year>
- <holder>Index Data</holder>
- </copyright>
- <abstract>
- <simpara>
- Pazpar2 - High-performance, user-interface independent, metasearching
- middleware featuring record merging, relevance ranking, and faceted search
- results.
- </simpara>
- <simpara>
- This document is a guide and reference to Pazpar version &version;.
- </simpara>
- <simpara>
- <inlinemediaobject>
- <imageobject>
- <imagedata fileref="common/id.png" format="PNG"/>
- </imageobject>
- <imageobject>
- <imagedata fileref="common/id.eps" format="EPS"/>
- </imageobject>
- </inlinemediaobject>
- </simpara>
- </abstract>
- </bookinfo>
-
- <chapter id="introduction">
- <title>Introduction</title>
- <para>
- Pazpar2 is a stand-alone package which implements
- the best we know to do in terms of the core metasearching
- functionality; that is, searching a number of databases in parallel,
- merging, and analyzing the results. Additional functionality such as
- user management, attractive displays are expected to be implemented by
- applications that use pazpar2. Pazpar2 is user interface independent.
- Its functionality is exposed through a simple REST-style webservice API,
- designed to be simple to use from an Ajax-anbled browser, from a
- higher-level server-side language like PHP or Java, or even from a Flash
- application.
- </para>
- <para>
- Once you launch a search in pazpar2, the operation continues behind the
- scenes. Pazpar2 connects to servers, carries out searches, and
- retrieves, deduplicates, and stores results internally. Your application
- code may periodically inquire about the status of an ongoing operation,
- and ask to see records or other result set facets.
- </para>
- <para>
- Pazpar2 is designed to be highly configurable. Incoming records are
- normalized to XML/UTF-8, and then further normalized using XSLT to a
- simple internal representation that is suitable for analysis. By
- providing XSLT stylesheets for different kinds of result records, you
- can tune pazpar2 to work against different kinds of information
- retrieval servers. Finally, metadata is extracted, in a configurable
- way, from this internal record, to support display, merging, ranking,
- result set facets, and sorting. Pazpar2 is not bound to a specific model
- of metadata, such as DublinCore or MARC -- by providing the right
- configuration, it can work with a number of different kinds of data in
- support of many different applications.
- </para>
- <para>
- Pazpar2 is designed to be efficient and scalable. You can set it up to
- search several hundred targets in parallel, or you can use it to support
- hundreds of concurrent users. It is implemented with the same attention
- to performance and economy that we use in our indexing engines, so that
- you can focus on building your application. You can devote all of your
- attention to usability and let pazpar2 do what it does best -- search.
- </para>
- </chapter>
-
- <chapter id="license">
- <title>Pazpar2 License</title>
- <para>To be decided and written.</para>
- </chapter>
-
- <chapter id="installation">
- <title>Installation</title>
- <para>
- Pazpar2 depends on the following tools/libraries:
- <variablelist>
- <varlistentry><term><ulink url="&url.yaz;">YAZ</ulink></term>
- <listitem>
- <para>
- The popular Z39.50 toolkit for the C language. YAZ must be
- compiled with Libxml2/Libxslt support.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </para>
- <para>
- In order to compile Pazpar2 an ANSI C compiler is
- required. The requirements should be the same as for YAZ.
- </para>
-
- <section id="installation.unix">
- <title>Installation on Unix (from Source)</title>
+ <bookinfo>
+ <title>Pazpar2 - User's Guide and Reference</title>
+ <author>
+ <firstname>Sebastian</firstname><surname>Hammer</surname>
+ </author>
+ <copyright>
+ <year>&copyright-year;</year>
+ <holder>Index Data</holder>
+ </copyright>
+ <abstract>
+ <simpara>
+ Pazpar2 is a high-performance, user interface-independent, data
+ model-independent metasearching
+ middleware featuring merging, relevance ranking, record sorting,
+ and faceted results.
+ </simpara>
+ <simpara>
+ This document is a guide and reference to Pazpar2 version &version;.
+ </simpara>
+ <simpara>
+ <inlinemediaobject>
+ <imageobject>
+ <imagedata fileref="common/id.png" format="PNG"/>
+ </imageobject>
+ <imageobject>
+ <imagedata fileref="common/id.eps" format="EPS"/>
+ </imageobject>
+ </inlinemediaobject>
+ </simpara>
+ </abstract>
+ </bookinfo>
+
+ <chapter id="introduction">
+ <title>Introduction</title>
<para>
- Here is a quick step-by-step guide on how to compile the
- tools that Pazpar2 uses. Only few systems have none of the required
- tools binary packages. If, for example, Libxml2/libxslt are already
- installed as development packages use these.
+ Pazpar2 is a stand-alone metasearch client with a webservice API, designed
+ to be used either from a browser-based client (JavaScript, Flash, Java,
+ etc.), from server-side code, or from any combination of the two.
+ Pazpar2 is a highly optimized client designed to
+ search many resources in parallel. It implements record merging,
+ relevance-ranking and sorting by arbitrary data content, and facet
+ analysis for browsing purposes. It is designed to be data model
+ independent, and is capable of working with MARC, DublinCore, or any
+ other XML-structured response format -- XSLT is used to normalize and extract
+ data from retrieval records for display and analysis. It can be used
+ against any server which supports the Z39.50 protocol. Proprietary
+ backend modules can be used to support a large number of other protocols
+ (please contact Index Data for further information about this).
</para>
-
<para>
- Ensure that the development libraries + header files are
- available on your system before compiling Pazpar2. For installation
- of YAZ, refer to the YAZ installation chapter.
+ Additional functionality, such as user management and attractive
+ displays, is expected to be implemented by
+ applications that use pazpar2. Pazpar2 is user interface independent.
+ Its functionality is exposed through a simple REST-style webservice API,
+ designed to be simple to use from an Ajax-enabled browser, Flash
+ animation, Java applet, etc., or from a higher-level server-side language
+ like PHP or Java. Because session information can be shared between
+ browser-based logic and your server-side scripting, there is tremendous
+ flexibility in how you implement your business logic on top of pazpar2.
</para>
- <screen>
- gunzip -c pazpar2-version.tar.gz|tar xf -
- cd pazpar2-version
- ./configure
- make
- su
- make install
- </screen>
- </section>
-
- <section id="installation.debian">
- <title>Installation on Debian GNU/Linux</title>
<para>
- All dependencies for Pazpar2 are available as
- <ulink url="&url.debian;">Debian</ulink>
- packages for the sarge (stable in 2005) and etch (testing in 2005)
- distributions.
+ Once you launch a search in pazpar2, the operation continues behind the
+ scenes. Pazpar2 connects to servers, carries out searches, and
+ retrieves, deduplicates, and stores results internally. Your application
+ code may periodically inquire about the status of an ongoing operation,
+ and ask to see records or other result set facets. Results become
+ available immediately, and it is easy to build end-user interfaces which
+ feel extremely responsive, even when searching more than 100 servers
+ concurrently.
</para>
<para>
- The procedures for Debian based systems, such as
- <ulink url="&url.ubuntu;">Ubuntu</ulink> is probably similar
+ Pazpar2 is designed to be highly configurable. Incoming records are
+ normalized to XML/UTF-8, and then further normalized using XSLT to a
+ simple internal representation that is suitable for analysis. By
+ providing XSLT stylesheets for different kinds of result records, you
+ can tune pazpar2 to work against different kinds of information
+ retrieval servers. Finally, metadata is extracted, in a configurable
+ way, from this internal record, to support display, merging, ranking,
+ result set facets, and sorting. Pazpar2 is not bound to a specific model
+ of metadata, such as DublinCore or MARC -- by providing the right
+ configuration, it can work with a number of different kinds of data in
+ support of many different applications.
</para>
- <screen>
- apt-get install libyaz-dev
- </screen>
<para>
- With these packages installed, the usual configure + make
- procedure can be used for Pazpar2 as outlined in
- <xref linkend="installation.unix"/>.
+ Pazpar2 is designed to be efficient and scalable. You can set it up to
+ search several hundred targets in parallel, or you can use it to support
+ hundreds of concurrent users. It is implemented with the same attention
+ to performance and economy that we use in our indexing engines, so that
+ you can focus on building your application, without worrying about the
+ details of metasearch logic. You can devote all of your attention to
+ usability and let pazpar2 do what it does best -- metasearch.
+ </para>
+ <para>
+ If you wish to connect to commercial or other databases which do not
+ support open standards, please contact Index Data. We have a licensing
+ agreement with a third party vendor which will enable pazpar2 to access
+ thousands of online databases, in addition to the vast number of catalogs
+ and online services that support the Z39.50 protocol.
+ </para>
+ <para>
+ Pazpar2 is our attempt to re-think the traditional paradigms for
+ implementing and deploying metasearch logic, with an uncompromising
+ approach to performance, and attempting to make maximum use of the
+ capabilities of modern browsers. The demo user interface that
+ accompanies the distribution is but one example. If you think of new
+ ways of using pazpar2, we hope you'll share them with us, and if we
+ can provide assistance with regard to training, design, programming,
+ integration with different backends, hosting, or support, please don't
+ hesitate to contact us. If you'd like to see functionality in pazpar2
+ that is not there today, please don't hesitate to contact us. It may
+ already be in our development pipeline, or there might be a
+ possibility for you to help out by sponsoring development time or
+ code. Either way, get in touch and we will give you straight answers.
+ </para>
+ <para>
+ Enjoy!
+ </para>
+ </chapter>
+
+
+ <chapter id="license">
+ <title>Pazpar2 License</title>
+ <para>To be decided and written.</para>
+ </chapter>
+
+ <chapter id="installation">
+ <title>Installation</title>
+ <para>
+ Pazpar2 depends on the following tools/libraries:
+ <variablelist>
+ <varlistentry><term><ulink url="&url.yaz;">YAZ</ulink></term>
+ <listitem>
+ <para>
+ The popular Z39.50 toolkit for the C language. YAZ must be
+ compiled with Libxml2/Libxslt support.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
- </section>
- </chapter>
-
- <reference id="reference">
- <title>Reference</title>
- <partintro>
<para>
- The material in this chapter is drawn directly from the individual
- manual entries.
+ In order to compile Pazpar2 an ANSI C compiler is
+ required. The requirements should be the same as for YAZ.
</para>
- </partintro>
- &manref;
- </reference>
+
+ <section id="installation.unix">
+ <title>Installation on Unix (from Source)</title>
+ <para>
+ Here is a quick step-by-step guide on how to compile the
+ tools that Pazpar2 uses. Most systems provide binary packages for at
+ least some of the required tools. If, for example, Libxml2/libxslt are
+ already installed as development packages, use these.
+ </para>
+
+ <para>
+ Ensure that the development libraries + header files are
+ available on your system before compiling Pazpar2. For installation
+ of YAZ, refer to the YAZ installation chapter.
+ </para>
+ <screen>
+ gunzip -c pazpar2-version.tar.gz|tar xf -
+ cd pazpar2-version
+ ./configure
+ make
+ su
+ make install
+ </screen>
+ </section>
+
+ <section id="installation.debian">
+ <title>Installation on Debian GNU/Linux</title>
+ <para>
+ All dependencies for Pazpar2 are available as
+ <ulink url="&url.debian;">Debian</ulink>
+ packages for the sarge (stable in 2005) and etch (testing in 2005)
+ distributions.
+ </para>
+ <para>
+ The procedure for Debian-based systems, such as
+ <ulink url="&url.ubuntu;">Ubuntu</ulink>, is probably similar.
+ </para>
+ <screen>
+ apt-get install libyaz-dev
+ </screen>
+ <para>
+ With these packages installed, the usual configure + make
+ procedure can be used for Pazpar2 as outlined in
+ <xref linkend="installation.unix"/>.
+ </para>
+ </section>
+ </chapter>
+
+ <chapter id="using">
+ <title>Using pazpar2</title>
+ <para>
+ This chapter provides a general introduction to the use and deployment of pazpar2.
+ </para>
+
+ <section id="architecture">
+ <title>Pazpar2 and your systems architecture</title>
+ <para>
+ Pazpar2 is designed to provide asynchronous, behind-the-scenes
+ metasearching functionality to your application, exposing this
+ functionality using a simple webservice API that can be accessed
+ from any number of development environments. In particular, it is
+ possible to combine pazpar2 either with your server-side dynamic
+ website scripting, with scripting or code running in the browser, or
+ with any combination of the two. Pazpar2 is an excellent tool for
+ building advanced, Ajax-based user interfaces for metasearch
+ functionality, but Ajax isn't a requirement -- you can choose to use
+ pazpar2 entirely as a backend to your regular server-side scripting.
+ When you do use pazpar2 in conjunction
+ with browser scripting (JavaScript/Ajax, Flash, applets, etc.), there are
+ special considerations.
+ </para>
+
+ <para>
+ Pazpar2 implements a simple but efficient HTTP server, and it is
+ designed to interact directly with scripting running in the browser
+ for the best possible performance, and to limit overhead when
+ several browser clients generate numerous webservice requests.
+ However, it is still desirable to use a conventional webserver,
+ such as Apache, to serve up graphics, HTML documents, and
+ server-side scripting. Because the security sandbox environment of
+ most browser-side programming environments only allows communication
+ with the server from which the enclosing HTML page or object
+ originated, pazpar2 is designed so that it can act as a transparent
+ proxy in front of an existing webserver (see <xref
+ linkend="pazpar2_conf"/> for details). In this mode, all regular
+ HTTP requests are transparently passed through to your webserver,
+ while pazpar2 only intercepts search-related webservice requests.
+ </para>
+
+ <para>
+ If you want to expose your combined service on port 80, you can
+ run your regular webserver on a different port, on a different
+ server, or on a different IP address associated with the same server.
+ </para>
+
+ <para>
+ Sometimes, it may be necessary to implement functionality on your
+ regular webserver that makes use of search results, for example to
+ implement data import functionality, emailing results, history
+ lists, personal citation lists, interlibrary loan functionality,
+ etc. Fortunately, it is simple to exchange information between
+ pazpar2, your browser scripting, and backend server-side scripting.
+ You can send a session ID and possibly a record ID from your browser
+ code to your server code, and from there use pazpar2's webservice API
+ to access result sets or individual records. You could even 'hide'
+ all of pazpar2's functionality behind your own API implemented on
+ the server-side, and access that from the browser or elsewhere. The
+ possibilities are just about endless.
+ </para>
+ </section>
+
+ <section id="data_model">
+ <title>Your data model</title>
+ <para>
+ Pazpar2 has no preconceived notion of what makes up a data
+ model. There is no assumption that records have specific fields or
+ that they are organized in any particular way. The only assumption
+ is that data comes packaged in a form that the software can work
+ with (presently, that means XML or MARC), and that you can provide
+ the necessary information to massage it into pazpar2's internal
+ record abstraction.
+ </para>
+
+ <para>
+ Handling retrieval records in pazpar2 is a two-step process. First,
+ you decide which data elements of the source record you are
+ interested in, and you specify any desired massaging or combining of
+ elements using an XSLT stylesheet (MARC records are automatically
+ normalized to MARCXML before this step). If desired, you can run
+ multiple XSLT stylesheets in series to accomplish this, but the
+ output of the last one should be a representation of the record in a
+ schema that pazpar2 understands.
+ </para>
+
+ <para>
+ The intermediate, internal representation of the record looks like
+ this:
+ <screen><![CDATA[
+<record xmlns="http://www.indexdata.com/pazpar2/1.0"
+ mergekey="title The Shining author King, Stephen">
+
+ <metadata type="title">The Shining</metadata>
+
+ <metadata type="author">King, Stephen</metadata>
+
+ <metadata type="kind">ebook</metadata>
+
+ <!-- ... and so on -->
+</record>
+]]></screen>
+
+ As you can see, there isn't much to it. There are really only a few
+ important elements to this file.
+ </para>
+
+ <para>
+ Elements should belong to the namespace
+ http://www.indexdata.com/pazpar2/1.0. If the root node contains the
+ attribute 'mergekey', then every record that generates the same
+ merge key (normalized for case differences, white space, and
+ truncation) will be joined into a cluster. In other words, you
+ decide how records are merged. If you don't include a merge key,
+ records are never merged. The 'metadata' elements carry the actual
+ content of the record. The 'type' attribute is used to
+ match each element against processing rules that determine what
+ happens to the data element next.
+ </para>
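+
+ <para>
+ As an illustration only, a normalization stylesheet for MARCXML
+ input might look roughly like the sketch below. The choice of MARC
+ fields (245$a for the title, 100$a for the author) and the way the
+ merge key is assembled are assumptions made for this example; they
+ are not taken from the stylesheets shipped with pazpar2.
+ <screen><![CDATA[
+<xsl:stylesheet version="1.0"
+    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
+    xmlns:marc="http://www.loc.gov/MARC21/slim"
+    xmlns:pz="http://www.indexdata.com/pazpar2/1.0"
+    exclude-result-prefixes="marc">
+
+  <xsl:output method="xml" indent="yes"/>
+
+  <xsl:template match="marc:record">
+    <!-- Assumed field choices: first 245$a and first 100$a -->
+    <xsl:variable name="title"
+        select="marc:datafield[@tag='245']/marc:subfield[@code='a']"/>
+    <xsl:variable name="author"
+        select="marc:datafield[@tag='100']/marc:subfield[@code='a']"/>
+
+    <pz:record>
+      <!-- Records producing the same merge key end up in one cluster -->
+      <xsl:attribute name="mergekey">
+        <xsl:value-of
+            select="concat('title ', $title, ' author ', $author)"/>
+      </xsl:attribute>
+
+      <pz:metadata type="title">
+        <xsl:value-of select="$title"/>
+      </pz:metadata>
+
+      <!-- Emit the author element only when the record has one -->
+      <xsl:if test="$author">
+        <pz:metadata type="author">
+          <xsl:value-of select="$author"/>
+        </pz:metadata>
+      </xsl:if>
+    </pz:record>
+  </xsl:template>
+
+  <!-- Suppress any text outside the record template -->
+  <xsl:template match="text()"/>
+
+</xsl:stylesheet>
+]]></screen>
+ Leaving out the 'mergekey' attribute in such a stylesheet would
+ disable merging altogether, as described above.
+ </para>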
+
+ <para>
+ The next processing step is the extraction of metadata from the
+ intermediate representation of the record. This is governed by the
+ 'metadata' elements in the 'service' section of the configuration
+ file. See <xref linkend="config-server"/> for details. The metadata
+ in the retrieval record ultimately drives merging, sorting, ranking,
+ the extraction of browse facets, and display, all configurable.
+ </para>
+ </section>
+
+ <section id="client">
+ <title>Client development</title>
+ <para>
+ You can use pazpar2 from any environment that allows you to use
+ webservices. The initial goal of the software was to support
+ Ajax-based applications, but there literally are no limits to what
+ you can do. You can use pazpar2 from Javascript, Flash, Java, etc.,
+ on the browser side, and from any development environment on the
+ server side, and you can pass session tokens and record IDs freely
+ around between these environments to build sophisticated applications.
+ Use your imagination.
+ </para>
+
+ <para>
+ The webservice API of pazpar2 is described in detail in <xref
+ linkend="pazpar2_protocol"/>.
+ </para>
+
+ <para>
+ In brief, you use the 'init' command to create a session, a
+ temporary workspace which carries information about the current
+ search. You start a new search using the 'search' command. Once the
+ search has been started, you can follow its progress using the
+ 'stat', 'bytarget', 'termlist', or 'show' commands. Detailed records
+ can be fetched using the 'record' command.
+ </para>
+ </section>
+ </chapter> <!-- Using pazpar2 -->
+
+ <reference id="reference">
+ <title>Reference</title>
+ <partintro>
+ <para>
+ The material in this chapter is drawn directly from the individual
+ manual entries.
+ </para>
+ </partintro>
+ &manref;
+ </reference>
</book>
<!-- Keep this comment at the end of the file
<!ENTITY % common SYSTEM "common/common.ent">
%common;
]>
-<!-- $Id: pazpar2_conf.xml,v 1.2 2007-01-12 15:31:30 adam Exp $ -->
+<!-- $Id: pazpar2_conf.xml,v 1.3 2007-01-19 18:28:08 quinn Exp $ -->
<refentry id="pazpar2_conf">
<refentryinfo>
<productname>Pazpar2</productname>
</refsynopsisdiv>
<refsect1><title>DESCRIPTION</title>
- <para></para>
+ <para>
+ The pazpar2 configuration file, together with any referenced XSLT files,
+ governs pazpar2's behavior as a client, and controls the normalization and
+ extraction of data elements from incoming result records, for the
+ purposes of merging, sorting, facet analysis, and display.
+ </para>
+
+ <para>
+ The file is specified using the option -f on the pazpar2 command line.
+ There is not presently a way to reload the configuration file without
+ restarting pazpar2, although this will most likely be added some time
+ in the future.
+ </para>
</refsect1>
+
+ <refsect1><title>FORMAT</title>
+ <para>
+ The configuration file is XML-structured. It must be valid XML. All
+ elements specific to pazpar2 should belong to the namespace
+ "http://www.indexdata.com/pazpar2/1.0" (this is assumed in the
+ following examples). The root element is named 'pazpar2'. Under the
+ root element are a number of elements which group categories of
+ information. The categories are described below.
+ </para>
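+
+ <para>
+ The overall shape of the file can be sketched as follows. The
+ three groups correspond to the sections described below; their
+ order, and writing 'queryprofile' as an empty element, are
+ assumptions made for this illustration.
+ <screen><![CDATA[
+<pazpar2 xmlns="http://www.indexdata.com/pazpar2/1.0">
+  <server>
+    <!-- listen, proxy, and service settings -->
+  </server>
+  <queryprofile/>
+  <retrievalprofile>
+    <!-- requestsyntax and nativesyntax settings -->
+  </retrievalprofile>
+</pazpar2>
+]]></screen>
+ </para>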
+
+ <refsect2 id="config-server"><title>server</title>
+ <para>
+ This section governs overall behavior of the client. The data
+ elements are described below.
+ </para>
+ <variablelist> <!-- level 1 -->
+ <varlistentry>
+ <term>listen</term>
+ <listitem>
+ <para>
+ Configures the webservice -- this controls how you can connect
+ to pazpar2 from your browser or server-side code. The
+ attributes 'host' and 'port' control the binding of the
+ server. The 'host' attribute can be used to bind the server to
+ a secondary IP address of your system, enabling you to run
+ pazpar2 on port 80 alongside a conventional web server. You
+ can override this setting on the command line using the option -h.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>proxy</term>
+ <listitem>
+ <para>
+ If this item is given, pazpar2 will forward all incoming HTTP
+ requests that do not contain the filename 'search.pz2' to the
+ host and port specified using the 'host' and 'port'
+ attributes. This functionality is crucial if you wish to use
+ pazpar2 in conjunction with browser-based code (JS, Flash,
+ applets, etc.) which operates in a security sandbox. Such code
+ can only connect to the same server from which the enclosing
+ HTML page originated. Pazpar2's proxy functionality enables you
+ to host all of the main pages (plus images, CSS, etc) of your
+ application on a conventional webserver, while efficiently
+ processing webservice requests for metasearch status, results,
+ etc.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>service</term>
+ <listitem>
+ <para>
+ This nested element controls the behavior of pazpar2 with
+ respect to your data model. In pazpar2, incoming records are
+ normalized, using XSLT, into an internal representation (see
+ the <link
+ linkend="config-retrievalprofile">retrievalprofile</link> section).
+ The 'service' section controls the further processing and
+ extraction of data from the internal representation, primarily
+ through the 'metadata' sub-element.
+ </para>
+
+ <variablelist> <!-- Level 2 -->
+ <varlistentry><term>metadata</term>
+ <listitem>
+ <para>
+ One of these elements is required for every data element in
+ the internal representation of the record (see
+ <xref linkend="data_model"/>). It governs
+ subsequent processing as pertains to sorting, relevance
+ ranking, merging, and display of data elements. It supports
+ the following attributes:
+ </para>
+
+ <variablelist> <!-- level 3 -->
+ <varlistentry><term>name</term>
+ <listitem>
+ <para>
+ This is the name of the data element. It is matched
+ against the 'type' attribute of the 'metadata' element
+ in the normalized record. A warning is produced if
+ metadata elements with an unknown name are found in the
+ normalized record. This name is also used to represent
+ data elements in the records returned by the
+ webservice API, and to name sort lists and browse
+ facets.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>type</term>
+ <listitem>
+ <para>
+ The type of data element. This value governs any
+ normalization or special processing that might take
+ place on an element. Possible values are 'generic'
+ (basic string) and 'year' (a range is computed if
+ multiple years are found in the record). Note: This
+ list is likely to grow in the future.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>brief</term>
+ <listitem>
+ <para>
+ If this is set to 'yes', then the data element is
+ included in brief records in the webservice API. Note
+ that this only makes sense for metadata elements that
+ are merged (see below). The default value is 'no'.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>sortkey</term>
+ <listitem>
+ <para>
+ Specifies that this data element is to be used for
+ sorting. The possible values are 'numeric' (numeric
+ value), 'skiparticle' (string; skip common, leading
+ articles), and 'no' (no sorting). The default value is
+ 'no'.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>rank</term>
+ <listitem>
+ <para>
+ Specifies that this element is to be used to help rank
+ records against the user's query (when ranking is
+ requested). The value is an integer, used as a
+ multiplier against the basic TF*IDF score. A value of
+ 1 is the base; higher values give additional weight to
+ elements of this type. The default is '0', which
+ excludes this element from the rank calculation.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>termlist</term>
+ <listitem>
+ <para>
+ Specifies that this element is to be used as a
+ termlist, or browse facet. Values are tabulated from
+ incoming records, and a highscore list of values (with
+ their associated frequency) is made available to the
+ client through the webservice API. The possible values
+ are 'yes' and 'no' (default).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>merge</term>
+ <listitem>
+ <para>
+ This governs whether, and how, elements are extracted
+ from individual records and merged into cluster
+ records. The possible values are: 'unique' (include
+ all unique elements), 'longest' (include only the
+ longest element, as measured by strlen), 'range'
+ (calculate a range of values across all matching
+ records), 'all' (include all elements), or 'no'
+ (don't merge; this is the default).
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist> <!-- attributes to metadata -->
+ </listitem>
+ </varlistentry>
+ </variablelist> <!-- Data elements in service directive -->
+ </listitem>
+ </varlistentry>
+ </variablelist> <!-- Data elements in server directive -->
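+
+ <para>
+ Putting the elements above together, a 'server' section might
+ look like the following sketch. The host names, port numbers,
+ and the particular metadata fields are placeholders chosen for
+ this example, and it is assumed here that 'listen', 'proxy', and
+ 'metadata' are written as empty elements carrying the attributes
+ described above.
+ <screen><![CDATA[
+<pazpar2 xmlns="http://www.indexdata.com/pazpar2/1.0">
+  <server>
+    <!-- Bind the webservice to a secondary IP address (placeholder) -->
+    <listen host="192.0.2.10" port="80"/>
+
+    <!-- Requests other than search.pz2 go to the regular webserver -->
+    <proxy host="localhost" port="8080"/>
+
+    <service>
+      <!-- Title: shown in brief records, sortable, weighted for ranking -->
+      <metadata name="title" type="generic" brief="yes"
+                sortkey="skiparticle" rank="6" merge="longest"/>
+
+      <!-- Author: shown in brief records and offered as a browse facet -->
+      <metadata name="author" type="generic" brief="yes"
+                termlist="yes" rank="2" merge="unique"/>
+
+      <!-- Date: merged into a range, sortable numerically -->
+      <metadata name="date" type="year" brief="yes"
+                sortkey="numeric" termlist="yes" merge="range"/>
+    </service>
+  </server>
+</pazpar2>
+]]></screen>
+ Each 'name' value here has to match the 'type' attribute of a
+ 'metadata' element in the normalized record.
+ </para>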
+ </refsect2>
+
+ <refsect2 id="config-queryprofile">
+ <title>queryprofile</title>
+ <para>
+ At the moment, this directive is ignored; there is one global
+ CCL-mapping file which governs the mapping of queries to Z39.50
+ type-1. This file is located in etc/default.bib. This will change
+ shortly.
+ </para>
+ </refsect2>
+
+ <refsect2 id="config-retrievalprofile">
+ <title>retrievalprofile</title>
+ <para>
+ Note: In the present version, there is a single retrieval
+ profile. However, in a future release, it will be possible to
+ associate unique retrieval profiles with different targets, or to
+ generate retrieval profiles using XSLT from the ZeeRex description of
+ a target.
+ </para>
+
+ <para>
+ The following data elements are recognized for the retrievalprofile
+ directive:
+ </para>
+
+ <variablelist>
+ <varlistentry><term>requestsyntax</term>
+ <listitem>
+ <para>
+ This element specifies the request syntax to be used in queries. It only
+ makes sense for Z39.50-type targets.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>nativesyntax</term>
+ <listitem>
+ <para>
+ This element specifies the native syntax and encoding of the
+ result records. The default is XML. The following attributes
+ are defined:
+ </para>
+ <variablelist>
+ <varlistentry><term>name</term>
+ <listitem>
+ <para>
+ The name of the syntax. Currently recognized values are
+ 'iso2709' (MARC), and 'xml'.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>format</term>
+ <listitem>
+ <para>
+ The format, or schema, to be expected. Default is
+ 'marc21'.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>encoding</term>
+ <listitem>
+ <para>
+ The encoding of the response record. Typical values for
+ MARC records are 'marc8' (general MARC-8), 'marc8s'
+ (MARC-8, but maps to precomposed UTF-8 characters, more
+ suitable for use in web browsers), and 'latin1'.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>mapto</term>
+ <listitem>
+ <para>
+ Specifies the flavor of MARCXML to map results to.
+ Default is 'marcxml'. 'marcxchange' is also possible, and
+ useful for Danish DANMARC records.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist> <!-- parameters to nativesyntax directive -->
+ </listitem>
+ </varlistentry>
+ </variablelist> <!-- sub-elements in retrievalprofile -->
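+
+ <para>
+ For a Z39.50 target returning MARC21 records in MARC-8, a
+ retrieval profile might be sketched as follows. The attribute
+ values are examples only, and the use of a 'name' attribute on
+ 'requestsyntax' is an assumption; the description above does not
+ say how the request syntax is written.
+ <screen><![CDATA[
+<retrievalprofile>
+  <!-- Assumed form: syntax requested from the target -->
+  <requestsyntax name="marc21"/>
+
+  <!-- Records arrive as ISO2709 MARC21 in MARC-8, mapped to MARCXML -->
+  <nativesyntax name="iso2709" format="marc21"
+                encoding="marc8" mapto="marcxml"/>
+</retrievalprofile>
+]]></screen>
+ </para>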
+ </refsect2>
+
+ </refsect1>
<refsect1><title>OPTIONS</title>
<para></para>
<!ENTITY % common SYSTEM "common/common.ent">
%common;
]>
-<!-- $Id: pazpar2_protocol.xml,v 1.2 2007-01-12 15:21:04 adam Exp $ -->
+<!-- $Id: pazpar2_protocol.xml,v 1.3 2007-01-19 18:28:08 quinn Exp $ -->
<refentry id="pazpar2_protocol">
<refentryinfo>
<productname>Pazpar2</productname>
<refsect1><title>DESCRIPTION</title>
<para>
Webservice requests are any that refer to filename "search.pz2". Arguments
- are GET-style parameters. Argument 'command' is required and specifies
- command. Any request not recognized as a webservice request as described,
- are forwarded to the HTTP server specified in configuration.
- This way, the webserver can host the user interface (itself dynamic
- or static HTML), and AJAX-style calls can be used from JS to interact
- with the search logic.
+ are GET-style parameters. Argument 'command' is always required and specifies
+ the operation to perform. Any request not recognized as a webservice
+ request is forwarded to the HTTP server specified in the configuration
+ using the proxy setting.
+ This way, a regular webserver can host the user interface (itself dynamic
+ or static HTML), and AJAX-style calls can be used from JS (or any other client-based
+ scripting environment) to interact with the search logic in pazpar2.
</para>
<para>
Each command is described in sub sections to follow.
<para>
Example:
<screen><![CDATA[
-search.pz2?session=2044502273&command=search&query=computer
+search.pz2?session=2044502273&command=search&query=computer+science
]]>
</screen>
Response:
<refsect2 id="command-stat">
<title>stat</title>
<para>
- Provides status of ongoing search. Parameters:
+ Provides status information about an ongoing search. Parameters:
<variablelist>
<varlistentry>
<stat>
<activeclients>3</activeclients>
<hits>7</hits> -- Total hitcount
- <records>7</records> -- Total number of records fetched
+ <records>7</records> -- Total number of records fetched in last query
<clients>1</clients> -- Total number of associated clients
<unconnected>0</unconnected> -- Number of disconnected clients
<connecting>0</connecting> -- Number of clients in connecting state
<term>start</term>
<listitem>
<para>First record to show - 0-indexed.</para>
- </listitem>
+ </listitem>
</varlistentry>
<varlistentry>
<term>block</term>
<listitem>
<para>
- If block is set, the command will hang until there are records ready
+ If block is set to 1, the command will hang until there are records ready
to display. Use this to show first records rapidly without
requiring rapid polling.
</para>
</listitem>
</varlistentry>
+ <varlistentry>
+ <term>sort</term>
+ <listitem>
+ <para>
+ Specifies sort criteria. The argument is a comma-separated list
+ (no whitespace allowed) of sort fields, with the highest-priority
+ field first. A sort field may be followed by a colon and
+ the number '0' or '1', indicating whether results should be sorted in
+ decreasing (0) or increasing (1) order according to that field.
+ Decreasing order (0) is the default.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</para>
<para>
Example:
<screen><![CDATA[
-search.pz2?session=2044502273&command=show&start=0&num=2
+search.pz2?session=2044502273&command=show&start=0&num=2&sort=title:1
]]></screen>
Output:
<screen><![CDATA[
<show>
<status>OK</status>
- <activeclients>3</activeclients>
- <merged>6</merged>
- <total>7</total>
- <start>0</start>
- <num>2</num>
+ <activeclients>3</activeclients> -- How many clients are still working
+ <merged>6</merged> -- Number of merged records
+ <total>7</total> -- Total of all hitcounts
+ <start>0</start> -- The start number you requested
+ <num>2</num> -- Number of records retrieved
<hit>
<md-title>How to program a computer, by Jack Collins</md-title>
- <count>2</count> <!-- Number of merged records -->
- <recid>6</recid>
+ <count>2</count> -- Number of merged records
+ <recid>6</recid> -- Record ID for this record
</hit>
<hit>
<md-title>
<variablelist>
<varlistentry>
+ <term>session</term>
+ <listitem>
+ <para>
+ Session ID
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term>id</term>
<listitem>
<para>
<screen><![CDATA[
<term>
<name>library2.mcmaster.ca</name>
- <frequency>11734</frequency>
- <state>Client_Idle</state>
- <diagnostic>0</diagnostic>
+ <frequency>11734</frequency> -- Number of hits
+ <state>Client_Idle</state> -- See the description of 'bytarget' below
+ <diagnostic>0</diagnostic> -- Z39.50 diagnostic codes
</term>
]]></screen>
</para>
</refsect2>
+
+ <refsect2 id="command-bytarget">
+ <title>bytarget</title>
+ <para>
+ Returns information about the status of each active client. Parameters:
+
+ <variablelist>
+ <varlistentry>
+ <term>session</term>
+ <listitem>
+ <para>
+ Session ID.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ <para>
+ Example:
+ <screen><![CDATA[
+search.pz2?session=605047297&command=bytarget
+]]></screen>
+
+ Example output:
+
+ <screen><![CDATA[
+<bytarget>
+ <status>OK</status>
+ <target>
+ <id>z3950.loc.gov/voyager/</id>
+ <hits>10000</hits>
+ <diagnostic>0</diagnostic>
+ <records>65</records>
+ <state>Client_Presenting</state>
+ </target>
+ <!-- ... more target nodes below as necessary -->
+</bytarget>
+]]></screen>
+
+ The following client states are defined: Client_Connecting,
+ Client_Connected, Client_Idle, Client_Initializing, Client_Searching,
+ Client_Presenting, Client_Error, Client_Failed,
+ Client_Disconnected, Client_Stopped.
+ </para>
+ </refsect2>
+
</refsect1>
</refentry>