X-Git-Url: http://jsfdemo.indexdata.com/?a=blobdiff_plain;f=doc%2Fbook.xml;h=0b861da498d4b1f388290d3f51d35f44e5fe772d;hb=f5628ec48d43245fd435b9ef78b8a37bf1b42544;hp=9764aecedbfa4c8960da0c819e5a4815fe612a6f;hpb=a5ad4b6e9fc82d294b1903e8b8c9439e33cfffd4;p=metaproxy-moved-to-github.git diff --git a/doc/book.xml b/doc/book.xml index 9764aec..0b861da 100644 --- a/doc/book.xml +++ b/doc/book.xml @@ -1,4 +1,4 @@ - + Metaproxy - User's Guide and Reference @@ -74,7 +74,7 @@ Anything goes in! Anything goes out! - Cold bananas, fish, pyjamas, + Fish, bananas, cold pyjamas, Mutton, beef and trout! - attributed to Cole Porter. @@ -128,7 +128,7 @@ Installation - Metaproxy depends on the folloing tools/libraries: + Metaproxy depends on the following tools/libraries: YAZ++ @@ -141,14 +141,16 @@ This is an XSLT processor - based on Libxml2. Both Libxml2 and - Libxslt must be installed with the development components. + Libxslt must be installed with the development components + (header files, etc.) as well as the run-time libraries. Boost - The popular C++ library. + The popular C++ library. Initial versions of Metaproxy + were built with 1.33.0. Version 1.33.1 works too. @@ -248,9 +250,92 @@
Installation on Windows - ### To be written + Compilation of Metaproxy can be done using + Microsoft Visual Studio. + We know Version 2003 works. We expect Version 2005 to + work as well. +
+ Boost + + Get Boost from its home page. + You also need Boost Jam (an alternative to make). + That's also available from this + home page. The downloaded files are called something like: + boost_1_33-1.exe + and + boost-jam-3.1.12-1-ntx86.zip. + Unpack Boost Jam first. Put bjam.exe + in your system path. Open a command prompt and ensure + it can be found automatically. If not, check the PATH. + The Boost .exe is a self-extracting exe with + complete source for Boost. Compile that source with + Boost Jam (an alternative to Make). + The compilation takes a while. + By default, the Boost build process puts the resulting + libraries + header files in + \boost\lib, \boost\include. + + + For more information about installing Boost refer to the + getting started + pages.
+ +
+ Libxslt + + Libxslt can be downloaded + for Windows from + here. + + + Libxslt has other dependencies, but these can all be downloaded + from the same site. Get the following: + iconv, zlib, libxml2, libxslt. + +
+ +
+ YAZ + + YAZ can be downloaded + for Windows from + here. + +
+ +
+ YAZ++ + + Get YAZ++ as well. + Version 1.0 or later is required. For now, get it from + Index Data's + Snapshot area. + + + YAZ++ includes NMAKE makefiles, similar to those found in the + YAZ package. + +
+ +
+ Metaproxy + + Metaproxy is shipped with NMAKE makefiles as well - similar + to those found in the YAZ++/YAZ packages. Adjust this Makefile + to point to the proper locations of Boost, Libxslt, Libxml2, + zlib, iconv, yaz and yazpp. + + + After successful compilation you'll find + metaproxy.exe in the + bin directory. + +
+ +
@@ -493,7 +578,7 @@ <literal>multi</literal> (mp::filter::Multi) - Performs multicast searching. + Performs multi-database searching. See the extended discussion of virtual databases and multi-database searching below. @@ -740,12 +825,11 @@ file (included in the distribution as metaproxy/etc/config0.xml). This file defines a very simple configuration that simply proxies - to whatever backend server the client requests, but logs each + to whatever back-end server the client requests, but logs each request and response. This can be useful for debugging complex client-server dialogues. - + @@ -780,7 +864,7 @@ a log filter that emits a message for each request; they are then fed into a z3950_client filter, which forwards the requests to the client-specified - backend Z39.509 server. When the response arrives, it is handed + back-end Z39.50 server. When the response arrives, it is handed back to the log filter, which emits another message; and then to the front-end filter, which returns the response to the client. @@ -796,10 +880,222 @@
Introductory notes + + Two of Metaproxy's filters are concerned with multiple-database + operations. Of these, virt_db can work alone + to control the routing of searches to one of a number of servers, + while multi can work together with + virt_db to perform multi-database searching, merging + the results into a unified result-set - ``metasearch in a box''. + + + The interaction between + these two filters is necessarily complex: it reflects the real, + irreducible complexity of multi-database searching in a protocol such + as Z39.50 that separates initialisation from searching, and in + which the database to be searched is not known at initialisation + time. + + + It's possible to use these filters without understanding the + details of their functioning and the interaction between them; the + next two sections of this chapter are ``HOWTO'' guides for doing + just that. However, debugging complex configurations will require + a deeper understanding, which the last two sections of this + chapter attempt to provide. + +
+ + +
+ Virtual databases with the <literal>virt_db</literal> filter + + Working alone, the + virt_db + filter routes search requests to one of a selection of + back-end databases. In this way, a single Z39.50 endpoint + (running Metaproxy) can provide access to several different + underlying services, including those that would otherwise be + inaccessible due to firewalls. In many useful configurations, the + back-end databases are local to the Metaproxy installation, but + the software does not enforce this, and any valid Z39.50 servers + may be used as back-ends. + + + For example, a virt_db + filter could be set up so that searches in the virtual database + ``lc'' are forwarded to the Library of Congress bibliographic + catalogue server, and searches in the virtual database ``marc'' + are forwarded to the toy database of MARC records that Index Data + hosts for testing purposes. A virt_db + configuration to make this switch would look like this: + + + + lc + z3950.loc.gov:7090/voyager + + + marc + indexdata.dk/marc + +]]> + + As well as being useful in its own right, this filter also provides + the foundation for multi-database searching. + +
+ + +
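The routing described in the virt_db section can be pictured as a plain lookup from virtual database name to back-end target. The following is only an illustrative sketch in Python, not Metaproxy's implementation: the names `VIRTUAL_DATABASES` and `route` are hypothetical, and only the database names and target addresses are taken from the example above.

```python
# Hypothetical model of virt_db routing: virtual database name -> targets.
# The names and addresses come from the example configuration above;
# the data structure and function are assumptions made for illustration.
VIRTUAL_DATABASES = {
    "lc": ["z3950.loc.gov:7090/voyager"],
    "marc": ["indexdata.dk/marc"],
}


def route(database):
    """Return the back-end targets configured for a virtual database."""
    try:
        return VIRTUAL_DATABASES[database]
    except KeyError:
        # An unknown name has no target to forward the search to.
        raise LookupError(f"no such virtual database: {database}") from None


lc_targets = route("lc")
marc_targets = route("marc")
```

A virtual database with more than one target, such as the ``all'' database in the next section, would simply map to a longer list in this model.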
+ Multi-database search with the <literal>multi</literal> filter + + To arrange for Metaproxy to broadcast searches to multiple back-end + servers, the configuration needs to include two components: a + virt_db + filter that specifies multiple + <target> + elements, and a subsequent + multi + filter. Here, for example, is a complete configuration that + broadcasts searches to both the Library of Congress catalogue and + Index Data's tiny testing database of MARC records: + + + + + + + + 10 + @:9000 + + + + lc + z3950.loc.gov:7090/voyager + + + marc + indexdata.dk/marc + + + all + z3950.loc.gov:7090/voyager + indexdata.dk/marc + + + + + 30 + + + +]]> + + (Using a + virt_db + filter that specifies multiple + <target> + elements but without a subsequent + multi + filter yields surprising and undesirable results, as will be + described below. Don't do that.) + + + Metaproxy can be invoked with this configuration as follows: + + ../src/metaproxy --config config-simple-multi.xml + + And thereafter, Z39.50 clients can connect to the running server + (on port 9000, as specified in the configuration) and search in + any of the databases + lc (the Library of Congress catalogue), + marc (Index Data's test database of MARC records) + or + all (both of these). As an example, a session + using the YAZ command-line client yaz-client is + here included (edited for brevity and clarity): + + base lc +Z> find computer +Search was a success. +Number of hits: 10000, setno 1 +Elapsed: 5.521070 +Z> base marc +Z> find computer +Search was a success. +Number of hits: 10, setno 3 +Elapsed: 0.060187 +Z> base all +Z> find computer +Search was a success. +Number of hits: 10010, setno 4 +Elapsed: 2.237648 +Z> show 1 +[marc]Record type: USmarc +001 11224466 +003 DLC +005 00000000000000.0 +008 910710c19910701nju 00010 eng +010 $a 11224466 +040 $a DLC $c DLC +050 00 $a 123-xyz +100 10 $a Jack Collins +245 10 $a How to program a computer +260 1 $a Penguin +263 $a 8710 +300 $a p. cm. 
+Elapsed: 0.119612 +Z> show 2 +[VOYAGER]Record type: USmarc +001 13339105 +005 20041229102447.0 +008 030910s2004 caua 000 0 eng +035 $a (DLC) 2003112666 +906 $a 7 $b cbc $c orignew $d 4 $e epcn $f 20 $g y-gencatlg +925 0 $a acquire $b 1 shelf copy $x policy default +955 $a pc10 2003-09-10 $a pv12 2004-06-23 to SSCD; $h sj05 2004-11-30 $e sj05 2004-11-30 to Shelf. +010 $a 2003112666 +020 $a 0761542892 +040 $a DLC $c DLC $d DLC +050 00 $a MLCM 2004/03312 (G) +245 10 $a 007, everything or nothing : $b Prima's official strategy guide / $c created by Kaizen Media Group. +246 3 $a Double-O-seven, everything or nothing +246 30 $a Prima's official strategy guide +260 $a Roseville, CA : $b Prima Games, $c c2004. +300 $a 161 p. : $b col. ill. ; $c 28 cm. +500 $a "Platforms: Nintendo GameCube, Macintosh, PC, PlayStation 2 computer entertainment system, Xbox"--P. [4] of cover. +650 0 $a Video games. +710 2 $a Kaizen Media Group. +856 42 $3 Publisher description $u http://www.loc.gov/catdir/description/random052/2003112666.html +Elapsed: 0.150623 +Z> +]]> + + As can be seen, the first record in the result set is from the + Index Data test database, and the second from the Library of + Congress database. The result-set continues alternating records + round-robin style until the point where one of the databases' + records are exhausted. + + + This example uses only two back-end databases; more may be used. + There is no limitation imposed on the number of databases that may + be metasearched in this way: issues of resource usage and + administrative complexity dictate the practical limits. + +
+ + +
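The round-robin ordering seen in the session above can be sketched as follows. This is an illustrative Python model, not the code of the multi filter: the function name `round_robin_merge` and the record labels are hypothetical. It assumes that once the shorter result set runs out, the remaining records of the longer one simply follow, which is consistent with the combined hit count in the session but is not stated explicitly in the text.

```python
from itertools import zip_longest

# Sentinel marking "this result set has no more records".
_EXHAUSTED = object()


def round_robin_merge(*result_sets):
    """Interleave records, taking one from each result set in turn."""
    merged = []
    for group in zip_longest(*result_sets, fillvalue=_EXHAUSTED):
        # Skip the slots of result sets that have already run out.
        merged.extend(rec for rec in group if rec is not _EXHAUSTED)
    return merged


# Hypothetical record labels standing in for real MARC records.
marc_records = ["marc-1", "marc-2"]
lc_records = ["lc-1", "lc-2", "lc-3", "lc-4"]
merged = round_robin_merge(marc_records, lc_records)
```

In this model the merged result set alternates between the two sources while both still have records, then continues with the larger set alone.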
+ What's going on? Lark's vomit - This chapter goes into a level of technical detail that is + This section goes into a level of technical detail that is probably not necessary in order to configure and use Metaproxy. It is provided only for those who like to know how things work. You should feel free to skip on to the next section if this one @@ -807,19 +1103,6 @@ - Two of Metaproxy's filters are concerned with multiple-database - operations. Of these, virt_db can work alone - to control the routing of searches to one of a number of servers, - while multi can work with the output of - virt_db to perform multicast searching, merging - the results into a unified result-set. The interaction between - these two filters is necessarily complex: it reflecting the real, - irreducible complexity of multicast searching in a protocol such - as Z39.50 that separates initialisation from searching, and in - which the database to be searched is not known at initialisation - time. - - Hold on tight - this may get a little hairy. @@ -843,25 +1126,8 @@ The role of the virt_db filter is to rewrite this otherInfo packet dependent on the virtual database that the - client wants to search. For example, a virt_db - filter could be set up so that searches in the virtual database - ``lc'' are forwarded to the Library of Congress server, and - searches in the virtual database ``id'' are forwarded to the toy - GILS database that Index Data hosts for testing purposes. A - virt_db configuration to make this switch would - look like this: + client wants to search. - - - lc - z3950.loc.gov:7090/Voyager - - - id - indexdata.dk/gils - - ]]> When Metaproxy receives a Z39.50 Init request from a client, it doesn't immediately forward that request to the back-end server. @@ -882,7 +1148,7 @@ frontend_net filter. 
The virt_db filter knows nothing about it; in fact, because the Init request that is received from the client - doesn't get forwarded until a Search reqeust is received, the + doesn't get forwarded until a Search request is received, the virt_db filter (and the z3950_client filter behind it) doesn't even get invoked at Init time. The only thing that a @@ -890,6 +1156,42 @@ VAL_PROXY otherInfo in the requests that pass through it. + + ### Describe the use of multiple VAL_PROXY + otherInfos, added by virt_db and used by + multi. + +
+ + +
+ A picture is worth a thousand words (but only five hundred on 64-bit architectures) + + + + + + + + + + + + [Here there should be a diagram showing the progress of + packages through the filters during a simple virtual-database + search and a multi-database search, but it seems that your + toolchain has not been able to include the diagram in this + document. This is because of LaTeX suckage. Time to move to + OpenOffice. Yes, really.] + + + + +
@@ -1204,6 +1506,5 @@ sgml-parent-document: "main.xml" sgml-local-catalogs: nil sgml-namecase-general:t - nxml-child-indent: 1 End: -->