Revise and expand examples.xml based on experiments with minimal

author Mike Taylor <mike@indexdata.com>

Fri, 30 Aug 2002 01:17:10 +0000 (01:17 +0000)

committer Mike Taylor <mike@indexdata.com>

Fri, 30 Aug 2002 01:17:10 +0000 (01:17 +0000)
diff --git a/doc/Makefile.am b/doc/Makefile.am

index c59e78f..3950d94 100644 (file)
--- a/doc/Makefile.am
+++ b/doc/Makefile.am
@@ -1,4 +1,4 @@
-## $Id: Makefile.am,v 1.10 2002-06-02 19:30:07 adam Exp $
+## $Id: Makefile.am,v 1.11 2002-08-30 01:17:10 mike Exp $
  docdir=$(datadir)/doc/@PACKAGE@
  
  doc_DATA = zebra.html zebra.pdf
@@ -13,6 +13,7 @@ XMLFILES = \
   introduction.xml \
   installation.xml \
   quickstart.xml \
+ examples.xml \
   administration.xml \
   zebraidx.xml \
   server.xml \
@@ -53,4 +54,4 @@ dist-hook: zebra.html
  
  clean-data-hook:
         rm -f [0-9]* *.bak
-       
+
diff --git a/doc/examples.xml b/doc/examples.xml

index 3a49b6c..2735cb2 100644 (file)
--- a/doc/examples.xml
+++ b/doc/examples.xml
@@ -1,5 +1,5 @@
  <chapter id="examples">
- <!-- $Id: examples.xml,v 1.2 2002-08-29 16:30:22 mike Exp $ -->
+ <!-- $Id: examples.xml,v 1.3 2002-08-30 01:17:10 mike Exp $ -->
   <title>Example Configurations</title>
  
   <sect1>
@@ -44,84 +44,54 @@
   </sect1>
  
   <sect1>
-  <title>First Example: Minimal Configuration</title>
+  <title>Example 1: Minimal Configuration</title>
  
    <para>
-   This example shows how Zebra can be used, with absolutely minimal
-   configuration, to index a body of XML documents, and search them
+   This example shows how Zebra can be used with absolutely minimal
+   configuration to index a body of XML documents, and search them
     using XPath expressions to specify access points.
    </para>
    <para>
-   Go to the
-   <literal>zebra/examples/dinosauricon</literal>
-   directory.  There you will find two significant files:
+   Go to the <literal>zebra/examples/dinosauricon</literal> directory.
+   There you will find a <literal>records</literal> subdirectory,
+   which contains some raw XML data to be added to the database: in
+   this case, two files, <literal>genera.xml</literal> and
+   <literal>taxa.xml</literal>, which contain information about all
+   the known dinosaur genera as of August 2002.
+  </para>
+  <para>
+   Now we need to create the Zebra database, which we do with the
+   Zebra indexer, <literal>zebraidx</literal>.  This program's
+   behaviour is driven by a configuration file, generally called
+   <literal>zebra.cfg</literal>, although this can be changed with the
+   <literal>-c</literal> option.  For our purposes, we don't need any
+   special behaviour - we can use the defaults - so an empty
+   configuration will do just fine.  We can either create an empty
+   <literal>zebra.cfg</literal> or specify the name of an existing
+   empty file using, for example, <literal>-c /dev/null</literal>.
+  </para>
+  <para>
+   In this case, we'll use an empty <literal>zebra.cfg</literal> so
+   we can add more configuration to it later.
    </para>
-
-  <itemizedlist>
-   <listitem>
-    <para>
-     The <literal>records</literal> subdirectory, which contains the
-     raw XML data to be added to the database: in this case, just one
-     file, <literal>genera.xml</literal>, which contains information
-     about all the known dinosaur genera as of October 2000.
-     <!-- ### Get more recent data -->
-    </para>
-   </listitem>
-
-   <listitem>
-    <para>
-     The master configuration file, <literal>zebra.cfg</literal>,
-     which is as short and simple as it can be:
-     <!-- ### Keep this up to date -->
-     <screen>
-       # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.2 2002-08-29 16:30:22 mike Exp $
-       # Bare-bones master configuration file for Zebra
-       profilePath: .:../../tab:../../../yaz/tab
-     </screen>
-     Apart from the comments, which are ignored, all this specifies is
-     that the server should recognise the attribute set described in
-     the file called
-     <literal>bib1.att</literal>.
-    </para>
-    <!-- ### What is an attribute set? -->
-   </listitem>
-
-<!--
-   <listitem>
-    <para>
-     The BIB-1 attribute set configuration file,
-     <literal>bib1.att</literal>, which is also as short as possible:
-     <screen>
-       # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.2 2002-08-29 16:30:22 mike Exp $
-       # Bare-bones BIB-1 attribute set file for Zebra
-       reference Bib-1
-     </screen>
-     Apart from the comments, all this specifies is that reference of
-     the attribute set described by this file is
-     <literal>Bib-1</literal>, a name recognised by the system as
-     referring to a well-known opaque identifier that is transmitted
-     by clients as part of their searches.
-     ### Yeuch!  Surely we can say that better!
-    </para>
-    <para>
-     ### Can't we somehow say this trivial thing in the main
-     configuration file?
-    </para>
-   </listitem>
--->
-  </itemizedlist>
-
    <para>
     That's all you need for a minimal Zebra configuration.  Now you can
     roll the XML records into the database and build the indexes:
     <screen>
         zebraidx -t grs.sgml update records
     </screen>
-   <!-- ### What does "grs.sgml" actually mean? -->
-   and start the server which, by default listens on port 9999:
+   (### What does "grs.sgml" actually mean?)
+  </para>
+  <para>
+   Now start the server.  Like the indexer, its behaviour is
+   controlled by a configuration file, generally
+   <literal>zebra.cfg</literal>; and like the indexer, it works just
+   fine with an empty configuration.
     <screen>
         zebrasrv
     </screen>
+   By default, the server listens on IP port number 9999, although
+   this can easily be changed.
    </para>
    <para>
     Now you can use the Z39.50 client program of your choice to execute
@@ -151,10 +121,81 @@
         &lt;idzebra:size&gt;359&lt;/idzebra:size&gt;&lt;idzebra:localnumber&gt;447&lt;/idzebra:localnumber&gt;&lt;idzebra:filename&gt;records/genera.xml&lt;/idzebra:filename&gt;&lt;/GENUS&gt;
     </screen>
    </para>
+  <para>
+   Now wasn't that easy?
+  </para>
   </sect1>
  
+ <sect1>
+  <title>Example 2: Adding Some Configuration</title>
+
+  <para>
+   You may have noticed as <literal>zebraidx</literal> was building
+   the database that it issued several warnings, which we ignored at
+   the time:
+   <screen>
+zebraidx -t grs.sgml update records
+02:12:32-30/08: zebraidx(18151) [warn] default.idx [No such file or directory]
+02:12:32-30/08: zebraidx(18151) [warn] Couldn't open explain.abs [No such file or directory]
+02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Couldn't open GENUS.abs [No such file or directory]
+02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Unknown register type: 0
+02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Unknown register type: w
+02:12:35-30/08: zebraidx(18151) [warn] records/taxa.xml:0 Couldn't open TAXON.abs [No such file or directory]
+   </screen>
+   And the server issued several more as the client connected to it,
+   then searched for and retrieved a record:
+   <screen>
+02:17:10-30/08: zebrasrv(18165) [warn] default.idx [No such file or directory]
+02:17:10-30/08: zebrasrv(18165) [warn] Couldn't open explain.abs [No such file or directory]
+02:17:57-30/08: zebrasrv(18165) [warn] Unknown register type: w
+02:18:42-30/08: zebrasrv(18165) [warn] Couldn't open GENUS.abs [No such file or directory]
+   </screen>
+  </para>
+ </sect1>
  </chapter>
  
+<!--
+
+   <listitem>
+    <para>
+     The master configuration file, <literal>zebra.cfg</literal>,
+     which is as short and simple as it can be:
+     <screen>
+       # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.3 2002-08-30 01:17:10 mike Exp $
+       # Bare-bones master configuration file for Zebra
+       profilePath: .:../../tab:../../../yaz/tab
+     </screen>
+     Apart from the comments, which are ignored, all this specifies is
+     that the server should recognise the attribute set described in
+     the file called
+     <literal>bib1.att</literal>.
+     ### What is an attribute set?
+    </para>
+   </listitem>
+
+   <listitem>
+    <para>
+     The BIB-1 attribute set configuration file,
+     <literal>bib1.att</literal>, which is also as short as possible:
+     <screen>
+       # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.3 2002-08-30 01:17:10 mike Exp $
+       # Bare-bones BIB-1 attribute set file for Zebra
+       reference Bib-1
+     </screen>
+     Apart from the comments, all this specifies is that reference of
+     the attribute set described by this file is
+     <literal>Bib-1</literal>, a name recognised by the system as
+     referring to a well-known opaque identifier that is transmitted
+     by clients as part of their searches.
+     ### Yeuch!  Surely we can say that better!
+    </para>
+    <para>
+     ### Can't we somehow say this trivial thing in the main
+     configuration file?
+    </para>
+   </listitem>
+-->
+
   <!-- Keep this comment at the end of the file
   Local variables:
   mode: sgml
diff --git a/doc/introduction.xml b/doc/introduction.xml

index 8976fa6..03401df 100644 (file)
--- a/doc/introduction.xml
+++ b/doc/introduction.xml
@@ -1,5 +1,5 @@
  <chapter id="introduction">
- <!-- $Id: introduction.xml,v 1.11 2002-08-29 14:05:11 mike Exp $ -->
+ <!-- $Id: introduction.xml,v 1.12 2002-08-30 01:17:10 mike Exp $ -->
   <title>Introduction</title>
   
   <sect1>
@@ -222,6 +222,62 @@
     </para>
    </sect2>
  
+<!--
+Envelope-to: zebra@miketaylor.org.uk
+From: Johannes Leveling <Johannes.Leveling@FernUni-Hagen.de>
+Content-Type: text/plain; charset=iso-8859-1
+Date: Thu, 29 Aug 2002 19:19:55 +0200
+To: zebra@miketaylor.org.uk
+Subject: [Zebralist] Looking for Deployment Stories
+In-Reply-To: <200208281002.LAA16526@seatbooker.net>
+X-Virus-Scanned: by AMaViS perl-11
+X-MIME-Autoconverted: from quoted-printable to 8bit by localhost.localdomain id g7TLWR905724
+
+Mike Taylor writes:
+ > People,
+ > 
+ > In collaboration with Sebastian, Adam and Heikki, I am reworking some
+ > parts of the Zebra documentation in preparation for the forthcoming
+ > release.  One area I am keen to expand on is (briefly) describing
+ > interesting applications of Zebra.  If you've deployed it in a way
+ > that you consider interesting, I'd love to hear from you, however
+ > briefly.  Think of this as a chance to get some free publicity for
+ > your application in the Zebra documentation.
+ > 
+ > Replies off-list to <zebra@miketaylor.org.uk>, please.
+ > 
+ >  _/|_        _______________________________________________________________
+ > /o ) \/  Mike Taylor   <mike@miketaylor.org.uk>   www.miketaylor.org.uk
+ > )_v__/\  There are some good things you can never have too much of.
+ > 
+ > 
+ > _______________________________________________
+ > Zebralist mailing list
+ > Zebralist@indexdata.dk
+ > http://www.indexdata.dk/mailman/listinfo/zebralist
+ > 
+Intersting?
+We have developed a natural language interface (NLI-Z39.50) for access
+to library databases at the Fernuniversität Hagen, Germany
+(http://ki212.fernuni-hagen.de/nli/NLI.html).
+To prepare formal information retrieval evaluation,
+we chose the Zebra server as the basis for
+evaluating retrieval effectiveness (measuring recall 
+and precision for the GIRT database). The Zebra database 
+consists of more than 76000 records in SGML format (bibliographic 
+records from social science), which are mapped to MARC for presentation. 
+Evaluation will take place as part of the TREC/CLEF campaign 2003 
+(see http://clef.iei.pi.cnr.it or http://www4.eurospider.ch/CLEF/).
+
+
+Johannes Leveling        Praktische Informatik VII/KI           
+                         FernUniversität Hagen
+
+Email : Johannes.Leveling@FernUni-Hagen.De  
+Tel.  : +49 2331 987-4525
+
+-->
+
    <sect2>
     <title>Various web indexes</title>
     <para>
diff --git a/doc/zebra.xml.in b/doc/zebra.xml.in

index e3032ce..d08de6f 100644 (file)
--- a/doc/zebra.xml.in
+++ b/doc/zebra.xml.in
@@ -1,10 +1,10 @@
  <?xml version="1.0" standalone="no"?>
  <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
                      "@DTD_DIR@/docbookx.dtd" [
-        <!ENTITY chap-examples SYSTEM "examples.xml">
          <!ENTITY chap-introduction SYSTEM "introduction.xml">
          <!ENTITY chap-installation SYSTEM "installation.xml">
          <!ENTITY chap-quickstart SYSTEM "quickstart.xml">
+        <!ENTITY chap-examples SYSTEM "examples.xml">
          <!ENTITY chap-administration SYSTEM "administration.xml">
          <!ENTITY chap-zebraidx SYSTEM "zebraidx.xml">
          <!ENTITY chap-server SYSTEM "server.xml">
@@ -12,7 +12,7 @@
          <!ENTITY app-license SYSTEM "license.xml">
          <!ENTITY app-indexdata SYSTEM "indexdata.xml">
  ]>
-<!-- $Id: zebra.xml.in,v 1.8 2002-08-29 16:30:22 mike Exp $ -->
+<!-- $Id: zebra.xml.in,v 1.9 2002-08-30 01:17:10 mike Exp $ -->
  <book id="zebra">
   <bookinfo>
    <title>Zebra - User's Guide and Reference</title>
@@ -50,10 +50,10 @@
    </abstract>
   </bookinfo>
   
-  &chap-examples;
    &chap-introduction;
    &chap-installation;
    &chap-quickstart;
+  &chap-examples;
    &chap-administration;
    &chap-zebraidx;
    &chap-server;
diff --git a/examples/dinosauricon/README b/examples/dinosauricon/README

index 269c057..5bd1a48 100644 (file)
--- a/examples/dinosauricon/README
+++ b/examples/dinosauricon/README
@@ -12,3 +12,6 @@ always get the up-to-date version from
  http://dinosauricon.com/data/
  
  (These were current at Thu Aug 29 17:11:27 BST 2002)
+
+Search in this database with XPath queries like:
+       @attr 1=/GENUS/MEANING bird
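
The minimal workflow described in the examples.xml changes above can be sketched as a short shell script. This is only an illustration under stated assumptions: the `ZEBRA_EXAMPLE_DIR` variable is hypothetical (it is not part of Zebra), the script assumes it is pointed at a directory such as `zebra/examples/dinosauricon` that contains a `records` subdirectory, and it skips indexing when `zebraidx` is not on the PATH. The server and client steps are left as comments because they run interactively; the client lines assume the YAZ command-line client is installed.

```shell
#!/bin/sh
# Sketch of the "Example 1: Minimal Configuration" steps.
# ZEBRA_EXAMPLE_DIR is a hypothetical variable used only in this sketch;
# when it is unset, a scratch directory is used instead.
set -eu

workdir="${ZEBRA_EXAMPLE_DIR:-$(mktemp -d)}"
cd "$workdir"

# An empty zebra.cfg is enough for the defaults; more can be added later.
: > zebra.cfg

# Index the XML records, but only if the indexer and the data are present.
if command -v zebraidx >/dev/null 2>&1 && [ -d records ]; then
    zebraidx -t grs.sgml update records
else
    echo "skipping indexing: zebraidx or records/ not found in $workdir"
fi

# Next steps, run by hand in separate terminals:
#   zebrasrv                      # listens on port 9999 by default
#   yaz-client localhost:9999     # assumes the YAZ client is installed
#   then, at the client prompt:
#     find @attr 1=/GENUS/MEANING bird
#     show 1
```

Creating a real (empty) `zebra.cfg` rather than passing `-c /dev/null` follows the choice made in the text, so that configuration can be added to the same file in the second example.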