<chapter id="tutorial">
<!-- $Id: tutorial.xml,v 1.1 2008-02-01 13:54:39 marc Exp $ -->
<title>Tutorial</title>
<sect1 id="tutorial-oai">
<title>A first &acro.oai; indexing example</title>
In this section, we will test the system by indexing a small set of
sample &acro.oai; records that are included with the &zebra; distribution,
running a &zebra; server against the newly created database, and
searching the indexes with a client that connects to that server.
Go to the <literal>examples/oai-pmh</literal> subdirectory of the
distribution archive, or make a deep copy of the Debian installation
directory <literal>/usr/share/idzebra-2.0-examples/oai-pmh</literal>.
An &acro.xml; file containing multiple &acro.oai; records is located in the
subdirectory <literal>examples/oai-pmh/data</literal>. To index these, type:
zebraidx -c conf/zebra.cfg init
zebraidx -c conf/zebra.cfg update data/oai-caltech.xml
zebraidx -c conf/zebra.cfg commit
In case you have not installed &zebra; yet but have compiled the
binaries from this tarball, use the following command form:
../../index/zebraidx -c conf/zebra.cfg this and that
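<para>
The indexing steps above can also be driven from a script. The following
Python sketch is not part of the &zebra; distribution: the helper names
<literal>zebraidx_commands</literal> and <literal>run_indexing</literal>
are invented for illustration, and the only thing assumed about
<literal>zebraidx</literal> is the three invocations already shown.
</para>
<programlisting language="python"><![CDATA[
import subprocess


def zebraidx_commands(source, config="conf/zebra.cfg", binary="zebraidx"):
    """Return the init, update and commit invocations as argument lists.

    'binary' may be a bare command name or a path such as
    "../../index/zebraidx" when working from an uninstalled source tree.
    """
    base = [binary, "-c", config]
    return [
        base + ["init"],
        base + ["update", source],
        base + ["commit"],
    ]


def run_indexing(source, config="conf/zebra.cfg", binary="zebraidx"):
    """Run the three steps in order, stopping at the first failure."""
    for argv in zebraidx_commands(source, config, binary):
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(argv, check=True)
]]></programlisting>
<para>
Calling <literal>run_indexing("data/oai-caltech.xml")</literal> from the
<literal>examples/oai-pmh</literal> directory would then be equivalent to
typing the three commands by hand.
</para>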
In this command, the word <literal>update</literal> is followed
by the name of a file or directory; when given a directory,
<literal>zebraidx</literal> updates all files in the hierarchy rooted
at that directory. The command option
<literal>-c conf/zebra.cfg</literal> points to the proper
configuration file.
You might ask yourself how &acro.xml; content is indexed using &acro.xslt;
stylesheets. To satisfy your curiosity, you can run the
indexing transformation on an example debugging &acro.oai; record:
xsltproc conf/oai2index.xsl data/debug-record.xml
Here you see the &acro.oai; record transformed into the indexing
&acro.xml; format. &zebra; creates several inverted indexes,
and their names and types are clearly visible in the indexing
&acro.xml; format.
If your indexing command was successful, you are now ready to
fire up a server. To start a server on port 9999, type:
zebrasrv -c conf/zebra.cfg @:9999
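<para>
Before connecting a client, it can be handy to confirm that something is
actually accepting connections on the chosen port. The Python sketch below
is an illustration only: the function name
<literal>server_is_listening</literal> is hypothetical, and the check merely
verifies that a TCP connection can be opened, not that the &zebra; database
behind it is usable.
</para>
<programlisting language="python"><![CDATA[
import socket


def server_is_listening(host="localhost", port=9999, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # The connection is closed again as soon as the 'with' block ends.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
]]></programlisting>
<para>
The default arguments match the port used in the
<literal>zebrasrv</literal> command above.
</para>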
The &zebra; index that you have just created has a single database
named <literal>Default</literal>.
The database contains several &acro.oai; records, and the server will
return records in the &acro.xml; format only. The indexing machine
did the splitting into individual records behind the scenes.
To test the server, you can use any &acro.z3950; client.
For instance, you can use the demo command-line client that comes
with &yaz;; we start the SRU/SRW/Z39.50 server in PQF mode only:
yaz-client localhost:9999
When the client has connected, you can type:
Z39.50 presents using presentation stylesheets:
Z39.50 built-in Zebra presents (in this configuration only if
started without yaz-frontendserver):
Z> elements zebra::meta
Z> elements zebra::meta::sysno
Z> elements zebra::index
Z> elements zebra::snippet
Z> elements zebra::facet::any:w
Z> elements zebra::facet::any:w,dc_title:w
Z39.50 searches targeted at specific indexes
Z> find @attr 1=oai_identifier @attr 4=3 oai:caltechcstr.library.caltech.edu:4
Z> find @attr 1=oai_datestamp @attr 4=3 2001-04-20
Z> find @attr 1=oai_setspec @attr 4=3 7374617475733D756E707562
Z> find @attr 1=dc_title communication
Z> find @attr 1=dc_identifier @attr 4=3
http://resolver.caltech.edu/CaltechCSTR:1986.5228-tr-86
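<para>
The term used in the <literal>oai_setspec</literal> search above looks like
hexadecimal-encoded ASCII text. Assuming that reading is correct, the
following Python sketch decodes it into a readable set specification and
encodes it back. The helper names are illustrative and not part of &zebra;
or &yaz;.
</para>
<programlisting language="python"><![CDATA[
def decode_setspec(term):
    """Interpret a string of hexadecimal digits as ASCII text."""
    return bytes.fromhex(term).decode("ascii")


def encode_setspec(text):
    """Encode ASCII text as upper-case hexadecimal digits."""
    return text.encode("ascii").hex().upper()


print(decode_setspec("7374617475733D756E707562"))  # prints: status=unpub
]]></programlisting>
<para>
Encoding the decoded text again yields the original search term, which
makes it easy to build such terms for other set specifications.
</para>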
Notice that all indexes defined by 'type="0"' in the
indexing stylesheet must be searched using the '@attr 4=3'
structure attribute instruction.
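<para>
The rule above can be captured in a small query-building helper. This is
a minimal Python sketch under stated assumptions: the function name
<literal>pqf_query</literal> and its <literal>exact</literal> parameter are
hypothetical, and the caller has to know which indexes are declared with
'type="0"' in the indexing stylesheet.
</para>
<programlisting language="python"><![CDATA[
def pqf_query(index, term, exact=False):
    """Build a PQF query string that targets a single index.

    Pass exact=True for indexes declared with type="0"; this adds the
    '@attr 4=3' structure attribute in front of the term.
    """
    parts = ["@attr 1=" + index]
    if exact:
        parts.append("@attr 4=3")
    # Quote terms containing whitespace so they stay one PQF term.
    if any(ch.isspace() for ch in term):
        term = '"' + term.replace('"', '\\"') + '"'
    parts.append(term)
    return " ".join(parts)


print(pqf_query("oai_datestamp", "2001-04-20", exact=True))
print(pqf_query("dc_title", "communication"))
]]></programlisting>
<para>
The two printed strings correspond to the arguments of the
<literal>find</literal> commands for <literal>oai_datestamp</literal> and
<literal>dc_title</literal> shown earlier.
</para>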
Notice also that searching and scanning on the indexes
'dc_contributor', 'dc_language', 'dc_rights', and 'dc_source'
fail, simply because none of the records in this example set
have these fields set, and consequently, these indexes are
yaz-client localhost:9999
Z> scan @attr 1=oai_identifier @attr 4=3 oai
Z> scan @attr 1=oai_datestamp @attr 4=3 1
Z> scan @attr 1=oai_setspec @attr 4=3 2000
Z> scan @attr 1=dc_title communication
Z> scan @attr 1=dc_identifier @attr 4=3 a
Z39.50 search using server-side CQL conversion:
Z> find creator = the
Z> find dc.creator = the
Z> find description < the
Z> find title le some
Z> find title ge some
Z> find identifier eq
"http://resolver.caltech.edu/CaltechCSTR:1978.2276-tr-78"
Z> find relation eq something
etc. Notice that all indexes defined by 'type="0"' in the
indexing stylesheet must be searched using the 'eq' relation.
Z39.50 scan using server-side CQL conversion:
Unfortunately, this will _never_ work as it is not supported by the
If you want to scan using server-side CQL conversion, you need to
make an SRW connection using yaz-client, or an
SRU connection using REST Web Services; any browser will do.
SRU Explain ZeeRex response:
http://localhost:9999/
http://localhost:9999/?version=1.1&operation=explain
SRU Search Retrieve records:
http://localhost:9999/?version=1.1&operation=searchRetrieve
http://localhost:9999/?version=1.1&operation=searchRetrieve
&query=date=1978-01-01
&startRecord=1&maximumRecords=1&recordSchema=dc
http://localhost:9999/?version=1.1&operation=searchRetrieve
http://localhost:9999/?version=1.1&operation=searchRetrieve
&query=description=the
http://localhost:9999/?version=1.1&operation=searchRetrieve
http://localhost:9999/?version=1.1&operation=scan&scanClause=title=a
http://localhost:9999/?version=1.1&operation=scan
&scanClause=identifier%20eq%20a
Notice: you need to use the 'eq' relation for all @attr 4=3 indexes
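<para>
SRU request URLs like the ones above are tedious to escape by hand. The
following Python sketch builds them with the standard
<literal>urllib.parse</literal> module. The function names
<literal>sru_url</literal>, <literal>sru_search</literal> and
<literal>sru_scan</literal> are hypothetical, and the base URL and parameter
values are simply the ones used in this tutorial.
</para>
<programlisting language="python"><![CDATA[
from urllib.parse import quote, urlencode

BASE = "http://localhost:9999/"


def sru_url(base=BASE, **params):
    """Return an SRU 1.1 request URL with percent-encoded parameters."""
    query = {"version": "1.1"}
    query.update(params)
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return base + "?" + urlencode(query, quote_via=quote)


def sru_search(cql, start=1, maximum=1, schema="dc", base=BASE):
    """Build a searchRetrieve request for a CQL query."""
    return sru_url(base, operation="searchRetrieve", query=cql,
                   startRecord=start, maximumRecords=maximum,
                   recordSchema=schema)


def sru_scan(clause, base=BASE):
    """Build a scan request for a CQL scan clause."""
    return sru_url(base, operation="scan", scanClause=clause)


print(sru_scan("identifier eq a"))
print(sru_search("date=1978-01-01"))
]]></programlisting>
<para>
Note that the '=' sign inside a query value is emitted in its
percent-encoded form; it decodes to the same query on the server side.
</para>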
SRW explain with CQL index points:
Z> open http://localhost:9999
Notice: when opening a connection using the 'http://' prefix, yaz-client
uses SRW SOAP connections, and 'form xml' and 'querytype cql' are
SRW search using implicit server-side CQL:
Z> open http://localhost:9999
Z> find identifier eq
"http://resolver.caltech.edu/CaltechCSTR:1978.2276-tr-78"
Z> find description < the
In SRW connection mode, the following fails due to a problem in yaz-client:
SRW scan using implicit server-side CQL:
yaz-client http://localhost:9999
Z> scan title = communication
Z> scan identifier eq a
Notice: you need to use the 'eq' relation for all @attr 4=3 indexes
The default retrieval syntax for the client is &acro.usmarc;, and the
default element set is <literal>F</literal> (``full record''). To
try other formats and element sets for the same record, try:
<para>You may notice that more fields are returned when your
client requests &acro.sutrs;, &acro.grs1; or &acro.xml; records.
This is normal - not all of the GILS data elements have mappings in
the &acro.usmarc; record format.
If you've made it this far, you know that your installation is
working, but there's a certain amount of voodoo going on - for
example, the mysterious incantations in the
<literal>zebra.cfg</literal> file. In order to help us understand
these fully, the next chapter will work through a series of
increasingly complex example configurations.
<sect1 id="tutorial-oai-zebra">
<title>Requesting &acro.oai; records in &zebra; specific formats</title>
<!-- Keep this comment at the end of the file
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-parent-document: "zebra.xml"
sgml-local-catalogs: nil
sgml-namecase-general:t