<chapter id="examples">
- <!-- $Id: examples.xml,v 1.2 2002-08-29 16:30:22 mike Exp $ -->
+ <!-- $Id: examples.xml,v 1.3 2002-08-30 01:17:10 mike Exp $ -->
<title>Example Configurations</title>
<sect1>
</sect1>
<sect1>
- <title>First Example: Minimal Configuration</title>
+ <title>Example 1: Minimal Configuration</title>
<para>
- This example shows how Zebra can be used, with absolutely minimal
- configuration, to index a body of XML documents, and search them
+ This example shows how Zebra can be used with absolutely minimal
+ configuration to index a body of XML documents, and search them
using XPath expressions to specify access points.
</para>
<para>
- Go to the
- <literal>zebra/examples/dinosauricon</literal>
- directory. There you will find two significant files:
+ Go to the <literal>zebra/examples/dinosauricon</literal> directory.
+ There you will find a <literal>records</literal> subdirectory,
+ which contains some raw XML data to be added to the database: in
+ this case, two files, <literal>genera.xml</literal> and
+ <literal>taxa.xml</literal>, which contain information about all
+ the known dinosaur genera as of August 2002.
+ </para>
+ <para>
+ Now we need to create the Zebra database, which we do with the
+ Zebra indexer, <literal>zebraidx</literal>. This program's
+ behaviour is driven by a configuration file, generally called
+ <literal>zebra.cfg</literal>, although this can be changed with the
+ <literal>-c</literal> option. For our purposes, we don't need any
+ special behaviour - we can use the defaults - so an empty
+ configuration will do just fine. We can either create an empty
+ <literal>zebra.cfg</literal> or specify the name of an existing
+ empty file using, for example, <literal>-c /dev/null</literal>.
+ </para>
+ <para>
+ In this case, we'll use an empty <literal>zebra.cfg</literal> so
+ we can add more configuration to it later.
</para>
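+ <para>
+ As a minimal sketch, assuming a POSIX shell and that no
+ <literal>zebra.cfg</literal> exists yet in the current directory,
+ the empty configuration file can be created like this:
+ <screen>
+ touch zebra.cfg
+ </screen>
+ </para>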
-
- <itemizedlist>
- <listitem>
- <para>
- The <literal>records</literal> subdirectory, which contains the
- raw XML data to be added to the database: in this case, just one
- file, <literal>genera.xml</literal>, which contains information
- about all the known dinosaur genera as of October 2000.
- <!-- ### Get more recent data -->
- </para>
- </listitem>
-
- <listitem>
- <para>
- The master configuration file, <literal>zebra.cfg</literal>,
- which is as short and simple as it can be:
- <!-- ### Keep this up to date -->
- <screen>
- # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.2 2002-08-29 16:30:22 mike Exp $
- # Bare-bones master configuration file for Zebra
- profilePath: .:../../tab:../../../yaz/tab
- </screen>
- Apart from the comments, which are ignored, all this specifies is
- that the server should recognise the attribute set described in
- the file called
- <literal>bib1.att</literal>.
- </para>
- <!-- ### What is an attribute set? -->
- </listitem>
-
-<!--
- <listitem>
- <para>
- The BIB-1 attribute set configuration file,
- <literal>bib1.att</literal>, which is also as short as possible:
- <screen>
- # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.2 2002-08-29 16:30:22 mike Exp $
- # Bare-bones BIB-1 attribute set file for Zebra
- reference Bib-1
- </screen>
- Apart from the comments, all this specifies is that reference of
- the attribute set described by this file is
- <literal>Bib-1</literal>, a name recognised by the system as
- referring to a well-known opaque identifier that is transmitted
- by clients as part of their searches.
- ### Yeuch! Surely we can say that better!
- </para>
- <para>
- ### Can't we somehow say this trivial thing in the main
- configuration file?
- </para>
- </listitem>
--->
- </itemizedlist>
-
<para>
That's all you need for a minimal Zebra configuration. Now you can
roll the XML records into the database and build the indexes:
<screen>
zebraidx -t grs.sgml update records
</screen>
- <!-- ### What does "grs.sgml" actually mean? -->
- and start the server which, by default listens on port 9999:
+ (### What does "grs.sgml" actually mean?)
+ </para>
+ <para>
+ Now start the server. Like the indexer, its behaviour is
+ controlled by a configuration file, generally
+ <literal>zebra.cfg</literal>; and like the indexer, it works just
+ fine with an empty configuration.
<screen>
zebrasrv
</screen>
+ By default, the server listens on TCP port 9999, although
+ this can easily be changed.
</para>
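+ <para>
+ As a hedged sketch of changing the port:
+ <literal>zebrasrv</literal> accepts a listener address on its
+ command line, so something like the following should make it
+ listen on port 2100 instead. The port number is an arbitrary
+ choice for illustration, and the exact address syntax may differ
+ between versions, so check the server documentation for your
+ release.
+ <screen>
+ zebrasrv tcp:@:2100
+ </screen>
+ </para>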
<para>
Now you can use the Z39.50 client program of your choice to execute
<idzebra:size>359</idzebra:size><idzebra:localnumber>447</idzebra:localnumber><idzebra:filename>records/genera.xml</idzebra:filename></GENUS>
</screen>
</para>
+ <para>
+ Now wasn't that easy?
+ </para>
</sect1>
+ <sect1>
+ <title>Example 2: Adding Some Configuration</title>
+
+ <para>
+ You may have noticed as <literal>zebraidx</literal> was building
+ the database that it issued several warnings, which we ignored at
+ the time:
+ <screen>
+zebraidx -t grs.sgml update records
+02:12:32-30/08: zebraidx(18151) [warn] default.idx [No such file or directory]
+02:12:32-30/08: zebraidx(18151) [warn] Couldn't open explain.abs [No such file or directory]
+02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Couldn't open GENUS.abs [No such file or directory]
+02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Unknown register type: 0
+02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Unknown register type: w
+02:12:35-30/08: zebraidx(18151) [warn] records/taxa.xml:0 Couldn't open TAXON.abs [No such file or directory]
+ </screen>
+ And the server issued several more as the client connected to it,
+ then searched for and retrieved a record:
+ <screen>
+02:17:10-30/08: zebrasrv(18165) [warn] default.idx [No such file or directory]
+02:17:10-30/08: zebrasrv(18165) [warn] Couldn't open explain.abs [No such file or directory]
+02:17:57-30/08: zebrasrv(18165) [warn] Unknown register type: w
+02:18:42-30/08: zebrasrv(18165) [warn] Couldn't open GENUS.abs [No such file or directory]
+ </screen>
+ </para>
+ </sect1>
</chapter>
+<!--
+
+ <listitem>
+ <para>
+ The master configuration file, <literal>zebra.cfg</literal>,
+ which is as short and simple as it can be:
+ <screen>
+ # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.3 2002-08-30 01:17:10 mike Exp $
+ # Bare-bones master configuration file for Zebra
+ profilePath: .:../../tab:../../../yaz/tab
+ </screen>
+ Apart from the comments, which are ignored, all this specifies is
+ that the server should recognise the attribute set described in
+ the file called
+ <literal>bib1.att</literal>.
+ ### What is an attribute set?
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The BIB-1 attribute set configuration file,
+ <literal>bib1.att</literal>, which is also as short as possible:
+ <screen>
+ # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.3 2002-08-30 01:17:10 mike Exp $
+ # Bare-bones BIB-1 attribute set file for Zebra
+ reference Bib-1
+ </screen>
+ Apart from the comments, all this specifies is that reference of
+ the attribute set described by this file is
+ <literal>Bib-1</literal>, a name recognised by the system as
+ referring to a well-known opaque identifier that is transmitted
+ by clients as part of their searches.
+ ### Yeuch! Surely we can say that better!
+ </para>
+ <para>
+ ### Can't we somehow say this trivial thing in the main
+ configuration file?
+ </para>
+ </listitem>
+-->
+
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
<chapter id="introduction">
- <!-- $Id: introduction.xml,v 1.11 2002-08-29 14:05:11 mike Exp $ -->
+ <!-- $Id: introduction.xml,v 1.12 2002-08-30 01:17:10 mike Exp $ -->
<title>Introduction</title>
<sect1>
</para>
</sect2>
+<!--
+Envelope-to: zebra@miketaylor.org.uk
+From: Johannes Leveling <Johannes.Leveling@FernUni-Hagen.de>
+Content-Type: text/plain; charset=iso-8859-1
+Date: Thu, 29 Aug 2002 19:19:55 +0200
+To: zebra@miketaylor.org.uk
+Subject: [Zebralist] Looking for Deployment Stories
+In-Reply-To: <200208281002.LAA16526@seatbooker.net>
+X-Virus-Scanned: by AMaViS perl-11
+X-MIME-Autoconverted: from quoted-printable to 8bit by localhost.localdomain id g7TLWR905724
+
+Mike Taylor writes:
+ > People,
+ >
+ > In collaboration with Sebastian, Adam and Heikki, I am reworking some
+ > parts of the Zebra documentation in preparation for the forthcoming
+ > release. One area I am keen to expand on is (briefly) describing
+ > interesting applications of Zebra. If you've deployed it in a way
+ > that you consider interesting, I'd love to hear from you, however
+ > briefly. Think of this as a chance to get some free publicity for
+ > your application in the Zebra documentation.
+ >
+ > Replies off-list to <zebra@miketaylor.org.uk>, please.
+ >
+ > _/|_ _______________________________________________________________
+ > /o ) \/ Mike Taylor <mike@miketaylor.org.uk> www.miketaylor.org.uk
+ > )_v__/\ There are some good things you can never have too much of.
+ >
+ >
+ > _______________________________________________
+ > Zebralist mailing list
+ > Zebralist@indexdata.dk
+ > http://www.indexdata.dk/mailman/listinfo/zebralist
+ >
+Intersting?
+We have developed a natural language interface (NLI-Z39.50) for access
+to library databases at the Fernuniversität Hagen, Germany
+(http://ki212.fernuni-hagen.de/nli/NLI.html).
+To prepare formal information retrieval evaluation,
+we chose the Zebra server as the basis for
+evaluating retrieval effectiveness (measuring recall
+and precision for the GIRT database). The Zebra database
+consists of more than 76000 records in SGML format (bibliographic
+records from social science), which are mapped to MARC for presentation.
+Evaluation will take place as part of the TREC/CLEF campaign 2003
+(see http://clef.iei.pi.cnr.it or http://www4.eurospider.ch/CLEF/).
+
+
+Johannes Leveling Praktische Informatik VII/KI
+ FernUniversität Hagen
+
+Email : Johannes.Leveling@FernUni-Hagen.De
+Tel. : +49 2331 987-4525
+
+-->
+
<sect2>
<title>Various web indexes</title>
<para>