<chapter id="examples">
- <!-- $Id: examples.xml,v 1.2 2002-08-29 16:30:22 mike Exp $ -->
+ <!-- $Id: examples.xml,v 1.6 2002-09-20 09:58:04 mike Exp $ -->
<title>Example Configurations</title>
<sect1>
</para>
</sect1>
- <sect1>
- <title>First Example: Minimal Configuration</title>
+ <sect1 id="example1">
+ <title>Example 1: Minimal Configuration</title>
<para>
- This example shows how Zebra can be used, with absolutely minimal
- configuration, to index a body of XML documents, and search them
+ This example shows how Zebra can be used with absolutely minimal
+ configuration to index a body of XML documents, and search them
using XPath expressions to specify access points.
</para>
<para>
- Go to the
- <literal>zebra/examples/dinosauricon</literal>
- directory. There you will find two significant files:
+ Go to the <literal>zebra/examples/dinosauricon</literal> directory.
+ There you will find a <literal>records</literal> subdirectory,
+ which contains some raw XML data to be added to the database: in
+ this case, two files, <literal>genera.xml</literal> and
+ <literal>taxa.xml</literal>, which contain information about all
+ the known dinosaur genera as of August 2002.
+ </para>
+ <para>
+ Now we need to create the Zebra database, which we do with the
+ Zebra indexer, <literal>zebraidx</literal>. This program's
+ behaviour is driven by a configuration file, generally called
+ <literal>zebra.cfg</literal>, although this can be changed with the
+ <literal>-c</literal> option. For our purposes, we don't need any
+ special behaviour - we can use the defaults - so an empty
+ configuration will do just fine. We can either create an empty
+ <literal>zebra.cfg</literal> or specify the name of an existing
+ empty file using, for example, <literal>-c /dev/null</literal>.
+ </para>
+ <para>
+ In this case, we'll use an empty <literal>zebra.cfg</literal> so
+ we can add more configuration to it later.
</para>
-
- <itemizedlist>
- <listitem>
- <para>
- The <literal>records</literal> subdirectory, which contains the
- raw XML data to be added to the database: in this case, just one
- file, <literal>genera.xml</literal>, which contains information
- about all the known dinosaur genera as of October 2000.
- <!-- ### Get more recent data -->
- </para>
- </listitem>
-
- <listitem>
- <para>
- The master configuration file, <literal>zebra.cfg</literal>,
- which is as short and simple as it can be:
- <!-- ### Keep this up to date -->
- <screen>
- # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.2 2002-08-29 16:30:22 mike Exp $
- # Bare-bones master configuration file for Zebra
- profilePath: .:../../tab:../../../yaz/tab
- </screen>
- Apart from the comments, which are ignored, all this specifies is
- that the server should recognise the attribute set described in
- the file called
- <literal>bib1.att</literal>.
- </para>
- <!-- ### What is an attribute set? -->
- </listitem>
-
-<!--
- <listitem>
- <para>
- The BIB-1 attribute set configuration file,
- <literal>bib1.att</literal>, which is also as short as possible:
- <screen>
- # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.2 2002-08-29 16:30:22 mike Exp $
- # Bare-bones BIB-1 attribute set file for Zebra
- reference Bib-1
- </screen>
- Apart from the comments, all this specifies is that reference of
- the attribute set described by this file is
- <literal>Bib-1</literal>, a name recognised by the system as
- referring to a well-known opaque identifier that is transmitted
- by clients as part of their searches.
- ### Yeuch! Surely we can say that better!
- </para>
- <para>
- ### Can't we somehow say this trivial thing in the main
- configuration file?
- </para>
- </listitem>
--->
- </itemizedlist>
-
<para>
That's all you need for a minimal Zebra configuration. Now you can
roll the XML records into the database and build the indexes:
<screen>
zebraidx -t grs.sgml update records
</screen>
- <!-- ### What does "grs.sgml" actually mean? -->
- and start the server which, by default listens on port 9999:
+ (### What does "grs.sgml" actually mean?)
+ </para>
+ <para>
+ Now start the server. Like the indexer, its behaviour is
+ controlled by a configuration file, generally
+ <literal>zebra.cfg</literal>; and like the indexer, it works just
+ fine with an empty configuration.
<screen>
zebrasrv
</screen>
+ By default, the server listens on TCP port number 9999, although
+ this can easily be changed.
</para>
<para>
Now you can use the Z39.50 client program of your choice to execute
<idzebra:size>359</idzebra:size><idzebra:localnumber>447</idzebra:localnumber><idzebra:filename>records/genera.xml</idzebra:filename></GENUS>
</screen>
</para>
+ <para>
+ Now wasn't that easy?
+ </para>
</sect1>
+ <sect1 id="example2">
+ <title>Example 2: Adding Some Configuration</title>
+
+ <para>
+ You may have noticed as <literal>zebraidx</literal> was building
+ the database that it issued several warnings, which we ignored at
+ the time:
+ <screen>
+zebraidx -t grs.sgml update records
+02:12:32-30/08: zebraidx(18151) [warn] default.idx [No such file or directory]
+02:12:32-30/08: zebraidx(18151) [warn] Couldn't open explain.abs [No such file or directory]
+02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Couldn't open GENUS.abs [No such file or directory]
+02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Unknown register type: 0
+02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Unknown register type: w
+02:12:35-30/08: zebraidx(18151) [warn] records/taxa.xml:0 Couldn't open TAXON.abs [No such file or directory]
+ </screen>
+ And the server issued several more as the client connected to it,
+ then searched for and retrieved a record:
+ <screen>
+02:17:10-30/08: zebrasrv(18165) [warn] default.idx [No such file or directory]
+02:17:10-30/08: zebrasrv(18165) [warn] Couldn't open explain.abs [No such file or directory]
+02:17:57-30/08: zebrasrv(18165) [warn] Unknown register type: w
+02:18:42-30/08: zebrasrv(18165) [warn] Couldn't open GENUS.abs [No such file or directory]
+ </screen>
+ </para>
+ </sect1>
</chapter>
+<!--
+
+ <listitem>
+ <para>
+ The master configuration file, <literal>zebra.cfg</literal>,
+ which is as short and simple as it can be:
+ <screen>
+ # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.6 2002-09-20 09:58:04 mike Exp $
+ # Bare-bones master configuration file for Zebra
+ profilePath: .:../../tab:../../../yaz/tab
+ </screen>
+ Apart from the comments, which are ignored, all this specifies is
+ that the server should recognise the attribute set described in
+ the file called
+ <literal>bib1.att</literal>.
+ ### What is an attribute set?
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The BIB-1 attribute set configuration file,
+ <literal>bib1.att</literal>, which is also as short as possible:
+ <screen>
+ # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.6 2002-09-20 09:58:04 mike Exp $
+ # Bare-bones BIB-1 attribute set file for Zebra
+ reference Bib-1
+ </screen>
+ Apart from the comments, all this specifies is that the reference of
+ the attribute set described by this file is
+ <literal>Bib-1</literal>, a name recognised by the system as
+ referring to a well-known opaque identifier that is transmitted
+ by clients as part of their searches.
+ ### Yeuch! Surely we can say that better!
+ </para>
+ <para>
+ ### Can't we somehow say this trivial thing in the main
+ configuration file?
+ </para>
+ </listitem>
+-->
+
+<!--
+ The simplest hello-world example could go like this:
+
+ Index the document
+
+ <book>
+ <title>The art of motorcycle maintenance</title>
+ <subject scheme="Dewey">zen</subject>
+ </book>
+
+ And search it like
+
+ f @attr 1=/book/title motorcycle
+
+ f @attr 1=/book/subject[@scheme=Dewey] zen
+
+ If you suddenly decide you want broader interop, you can add
+ an abs file (more or less like this):
+
+ attset bib1.att
+ tagset tagsetg.tag
+
+ elm (2,1) title title
+ elm (2,21) subject subject
+-->
+
+<!--
+How to include images:
+
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="system.eps" format="eps">
+ </imageobject>
+ <imageobject>
+ <imagedata fileref="system.gif" format="gif">
+ </imageobject>
+ <textobject>
+ <phrase>The Multi-Lingual Search System Architecture</phrase>
+ </textobject>
+ <caption>
+ <para>
+ <emphasis role="strong">
+ The Multi-Lingual Search System Architecture.
+ </emphasis>
+ <para>
+ Network connections across local area networks are
+ represented by straight lines, and those over the
+ internet by jagged lines.
+ </caption>
+ </mediaobject>
+
+Where the three <*object> thingies inside the top-level <mediaobject>
+are decreasingly preferred versions to include depending on what the
+rendering engine can handle. I generated the EPS version of the image
+by exporting a line-drawing done in TGIF, then converted that to the
+GIF using a shell-script called "epstogif" which used an appallingly
+baroque sequence of conversions, which I would prefer not to pollute
+the Zebra build environment with:
+
+ #!/bin/sh
+
+ # Yes, what follows is stupidly convoluted, but I can't find a
+ # more straightforward path from the EPS generated by tgif's
+ # "Print" command into a browser-friendly format.
+
+ file=`echo "$1" | sed 's/\.eps//'`
+ ps2pdf "$1" "$file".pdf
+ pdftopbm "$file".pdf "$file"
+ pnmscale 0.50 < "$file"-000001.pbm | pnmcrop | ppmtogif
+ rm -f "$file".pdf "$file"-000001.pbm
+
+-->
+
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml