X-Git-Url: http://jsfdemo.indexdata.com/?a=blobdiff_plain;f=doc%2Fbook.xml;h=4dcb2f47e3928c9950c1a8580472379f7ad63edf;hb=271eaaa60ec419d64669cf0e9b5753d05365b798;hp=915f69d69f61289420a3d2756791fa6790c5f1a2;hpb=4f8873c1f52ae189e60d4bc2f0ec30fa90f02ce5;p=metaproxy-moved-to-github.git diff --git a/doc/book.xml b/doc/book.xml index 915f69d..4dcb2f4 100644 --- a/doc/book.xml +++ b/doc/book.xml @@ -1,4 +1,4 @@ - + Metaproxy - User's Guide and Reference @@ -9,16 +9,20 @@ 2006 - Index Data + Index Data ApS Metaproxy is a universal router, proxy and encapsulated metasearcher for information retrieval protocols. It accepts, processes, interprets and redirects requests from IR clients using - standard protocols such as ANSI/NISO Z39.50 (and in the future SRU - and SRW), as well as functioning as a limited - HTTP server. Metaproxy is configured by an XML file which + standard protocols such as + ANSI/NISO Z39.50 + (and in the future SRU + and SRW), as + well as functioning as a limited + HTTP server. + Metaproxy is configured by an XML file which specifies how the software should function in terms of routes that the request packets can take through the proxy, each step on a route being an instantiation of a filter. Filters come in many @@ -33,6 +37,16 @@ should not at this stage redistribute the code without explicit written permission from the copyright holders, Index Data ApS. + + + + + + + + + + @@ -40,39 +54,55 @@ Introduction - - Metaproxy - is a standalone program that acts as a universal router, proxy and - encapsulated metasearcher for information retrieval protocols such - as Z39.50, and in the future SRU and SRW. To clients, it acts as a - server of these - protocols: it can be searched, records can be retrieved from it, - etc. To servers, it acts as a client: it searches in them, - retrieves records from them, etc. 
it satisfies its clients' - requests by transforming them, multiplexing them, forwarding them - on to zero or more servers, merging the results, transforming - them, and delivering them back to the client. In addition, it - acts as a simple HTTP server; support for further protocols can be - added in a modular fashion, through the creation of new filters. - - - Anything goes in! - Anything goes out! - Cold bananas, fish, pyjamas, - Mutton, beef and trout! + + Metaproxy + is a standalone program that acts as a universal router, proxy and + encapsulated metasearcher for information retrieval protocols such + as Z39.50, and in the future + SRU and SRW. + To clients, it acts as a server of these protocols: it can be searched, + records can be retrieved from it, etc. + To servers, it acts as a client: it searches in them, + retrieves records from them, etc. it satisfies its clients' + requests by transforming them, multiplexing them, forwarding them + on to zero or more servers, merging the results, transforming + them, and delivering them back to the client. In addition, it + acts as a simple HTTP server; support + for further protocols can be added in a modular fashion, through the + creation of new filters. + + + Anything goes in! + Anything goes out! + Cold bananas, fish, pyjamas, + Mutton, beef and trout! - attributed to Cole Porter. - - - Metaproxy is a more capable alternative to - YAZ Proxy, - being more powerful, flexible, configurable and extensible. Among - its many advantages over the older, more pedestrian work are - support for multiplexing (encapsulated metasearching), routing by - database name, authentication and authorisation and serving local - files via HTTP. Equally significant, its modular architecture - facilitites the creation of pluggable modules implementing further - functionality. - + + + Metaproxy is a more capable alternative to + YAZ Proxy, + being more powerful, flexible, configurable and extensible. 
Among
+ its many advantages over the older, more pedestrian work are
+ support for multiplexing (encapsulated metasearching), routing by
+ database name, authentication and authorisation and serving local
+ files via HTTP. Equally significant, its modular architecture
+ facilitates the creation of pluggable modules implementing further
+ functionality.
+
+ This manual will briefly describe Metaproxy's licensing situation
+ before giving an overview of its architecture, then discussing the
+ key concept of a filter in some depth and giving an overview of
+ the various filter types, then discussing the configuration file
+ format. After this come several optional chapters which may be
+ freely skipped: a detailed discussion of virtual databases and
+ multi-database searching, some notes on writing extensions
+ (additional filter types) and a high-level description of the
+ source code. Finally comes the reference guide, which contains
+ instructions for invoking the metaproxy
+ program, and detailed information on each type of filter,
+ including examples.
+
@@ -81,8 +111,8 @@
 The Metaproxy Licence
- No decision has yet been made on the terms under which
- Metaproxy will be distributed.
+ No decision has yet been made on the terms under which
+ Metaproxy will be distributed.
 It is possible that, unlike other Index Data products, metaproxy may
 not be released under a
@@ -95,8 +125,134 @@
+
+ Installation
+
+ Metaproxy depends on the following tools/libraries:
+
+ YAZ++
+
+ This is a C++ library based on YAZ.
+
+ Libxslt
+
+ This is an XSLT processor - based on
+ Libxml2. Both Libxml2 and
+ Libxslt must be installed with the development components.
+
+ Boost
+
+ The popular C++ library.
+
+ In order to compile Metaproxy a modern C++ compiler is
+ required. Boost, in particular, requires a C++ compiler
+ that supports recent language features. Refer to Boost
+ Compiler Status
+ for more information.
+
+ We have successfully used Metaproxy with Boost using the compilers
+ GCC version 4.0 and
+ Microsoft Visual Studio 2003/2005.
+
+ Installation on Unix (from Source)
+
+ Here is a quick step-by-step guide on how to compile all the
+ tools that Metaproxy uses. Only a few systems have none of the required
+ tools available as binary packages. If, for example, Libxml2/libxslt are
+ already installed as development packages, use those (and omit compilation).
+
+ Libxml2/libxslt:
+
+ gunzip -c libxml2-version.tar.gz|tar xf -
+ cd libxml2-version
+ ./configure
+ make
+ su
+ make install
+
+ gunzip -c libxslt-version.tar.gz|tar xf -
+ cd libxslt-version
+ ./configure
+ make
+ su
+ make install
+
+ YAZ/YAZ++:
+
+ gunzip -c yaz-version.tar.gz|tar xf -
+ cd yaz-version
+ ./configure
+ make
+ su
+ make install
+
+ gunzip -c yazpp-version.tar.gz|tar xf -
+ cd yazpp-version
+ ./configure
+ make
+ su
+ make install
+
+ Boost:
+
+ gunzip -c boost-version.tar.gz|tar xf -
+ cd boost-version
+ ./configure
+ make
+ su
+ make install
+
+ Metaproxy:
+
+ gunzip -c metaproxy-version.tar.gz|tar xf -
+ cd metaproxy-version
+ ./configure
+ make
+ su
+ make install
+
+
+ Installation on Debian + + ### To be written + +
+ +
+ Installation on Windows + + ### To be written + +
+
+ The Metaproxy Architecture @@ -248,7 +404,7 @@ -
+
Overview of filter types We now briefly consider each of the types of filter supported by @@ -419,7 +575,7 @@
-
+
Future directions Some other filters that do not yet exist, but which would be @@ -521,7 +677,7 @@
-
+
Overview of XML structure All elements and attributes are in the namespace @@ -577,7 +733,7 @@
-
+
An example configuration The following is a small, but complete, Metaproxy configuration @@ -640,6 +796,16 @@
 Introductory notes
+
+ Lark's vomit
+
+ This chapter goes into a level of technical detail that is
+ probably not necessary in order to configure and use Metaproxy.
+ It is provided only for those who like to know how things work.
+ You should feel free to skip on to the next section if this one
+ doesn't seem like fun.
+
 Two of Metaproxy's filters are concerned with multiple-database
 operations. Of these, virt_db can work alone
@@ -647,13 +813,82 @@
 while multi can work with the output of virt_db to perform
 multicast searching, merging the results into a unified
 result-set. The interaction between
- these two filters is necessarily complex, reflecting the real
- complexity of multicast searching in a protocol such as Z39.50
- that separates initialisation from searching, with the database to
- search known only during the latter operation.
+ these two filters is necessarily complex: it reflects the real,
+ irreducible complexity of multicast searching in a protocol such
+ as Z39.50 that separates initialisation from searching, and in
+ which the database to be searched is not known at initialisation
+ time.
+
+ Hold on tight - this may get a little hairy.
+
+ In the general course of things, a Z39.50 Init request may carry
+ with it an otherInfo packet of type VAL_PROXY,
+ whose value indicates the address of a Z39.50 server to which the
+ ultimate connection is to be made. (This otherInfo packet is
+ supported by YAZ-based Z39.50 clients and servers, but has not yet
+ been ratified by the Maintenance Agency and so is not widely used
+ in non-Index Data software. We're working on it.)
+ The VAL_PROXY packet functions
+ analogously to the absoluteURI-style Request-URI used with the GET
+ method when a web browser asks a proxy to forward its request: see
+ the
+ Request-URI
+ section of
+ the HTTP 1.1 specification.
+
+ The role of the virt_db filter is to rewrite
+ this otherInfo packet depending on the virtual database that the
+ client wants to search.
For example, a virt_db
+ filter could be set up so that searches in the virtual database
+ ``lc'' are forwarded to the Library of Congress server, and
+ searches in the virtual database ``id'' are forwarded to the toy
+ GILS database that Index Data hosts for testing purposes. A
+ virt_db configuration to make this switch would
+ look like this:
+
+ <filter type="virt_db">
+   <virtual>
+     <database>lc</database>
+     <target>z3950.loc.gov:7090/Voyager</target>
+   </virtual>
+   <virtual>
+     <database>id</database>
+     <target>indexdata.dk/gils</target>
+   </virtual>
+ </filter>
+
+ When Metaproxy receives a Z39.50 Init request from a client, it
+ doesn't immediately forward that request to the back-end server.
+ Why not? Because it doesn't know which
+ back-end server to forward it to until the client sends a search
+ request that specifies the database that it wants to search in.
+ Instead, it just treasures the Init request up in its heart; and,
+ later, the first time the client does a search on one of the
+ specified virtual databases, a connection is forged to the
+ appropriate server and the Init request is forwarded to it. If,
+ later in the session, the same client searches in a different
+ virtual database, then a connection is forged to the server that
+ hosts it, and the same cached Init request is forwarded there,
+ too.
- ### Much, much more to say!
+ All of this clever Init-delaying is done by the
+ frontend_net filter. The
+ virt_db filter knows nothing about it; in
+ fact, because the Init request that is received from the client
+ doesn't get forwarded until a Search request is received, the
+ virt_db filter (and the
+ z3950_client filter behind it) doesn't even get
+ invoked at Init time. The only thing that a
+ virt_db filter ever does is rewrite the
+ VAL_PROXY otherInfo in the requests that pass
+ through it.
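The deferred-Init behaviour described in this hunk can be modelled in a few lines. The following is only an illustrative Python sketch, not Metaproxy's C++ implementation: one object stands in for the whole filter chain, and the names DeferredInitSession, RecordingBackend, VIRTUAL_DBS and the connect callback are hypothetical, introduced purely for the example.

```python
# Illustrative model only. Metaproxy itself is written in C++, and every
# name here (DeferredInitSession, RecordingBackend, VIRTUAL_DBS, connect)
# is hypothetical.

# Virtual database name -> back-end target, taken from the virt_db example.
VIRTUAL_DBS = {
    "lc": "z3950.loc.gov:7090/Voyager",
    "id": "indexdata.dk/gils",
}


class RecordingBackend:
    """Stand-in for a back-end connection; records what it is sent."""

    def __init__(self, target):
        self.target = target
        self.sent = []

    def send(self, message):
        self.sent.append(message)
        return (self.target, message)


class DeferredInitSession:
    """Holds a client's Init request until a search names a database."""

    def __init__(self, connect):
        self._connect = connect    # callable(target) -> backend object
        self._cached_init = None   # Init request not yet forwarded
        self._backends = {}        # target -> backend already initialised

    def handle_init(self, init_request):
        # Do not forward yet: the target is unknown until the first search.
        self._cached_init = init_request

    def handle_search(self, database, query):
        if self._cached_init is None:
            raise RuntimeError("search received before Init")
        target = VIRTUAL_DBS[database]
        backend = self._backends.get(target)
        if backend is None:
            # First search on this target: connect, then replay the Init.
            backend = self._connect(target)
            backend.send(self._cached_init)
            self._backends[target] = backend
        return backend.send(("search", query))
```

Passing RecordingBackend as the connect callback shows the ordering: no connection is opened by handle_init, the first search on "lc" opens one connection and sends the cached Init before the search, and a later search on "id" opens a second connection that receives the same cached Init.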
@@ -710,7 +945,7 @@
-
+
Individual classes The classes making up the Metaproxy application are here listed by @@ -887,7 +1122,7 @@
-
+
Other Source Files In addition to the Metaproxy source files that define the classes @@ -954,21 +1189,8 @@ &manref;
- - - - + \ No newline at end of file