X-Git-Url: http://jsfdemo.indexdata.com/?a=blobdiff_plain;f=doc%2Fpazpar2_conf.xml;h=158b3beaade4a3e74645a51291cdb5f07b731f5b;hb=a58a0ed09301d1a773ad5489fba90b9dddfb1bfd;hp=8e6e6a5e7929c8830460ebfcf9933c88bb20845b;hpb=f660a2730edb626b909b8c4d0c8dfeb6b2590170;p=pazpar2-moved-to-github.git diff --git a/doc/pazpar2_conf.xml b/doc/pazpar2_conf.xml index 8e6e6a5..158b3be 100644 --- a/doc/pazpar2_conf.xml +++ b/doc/pazpar2_conf.xml @@ -5,10 +5,10 @@ %local; %entities; - - %common; + + %idcommon; ]> - + Pazpar2 @@ -101,17 +101,68 @@ - zproxy + icu_chain - If this item is given, pazpar2 will send all Z39.50 - packages through this Z39.50 proxy server. - At least one of the 'host' and 'post' attributes is required. - The 'host' attribute may contain both host name and port - number, seperated by a colon ':', or only the host name. - An empty 'host' attribute sets the Z39.50 host address - to 'localhost'. + Definition of the ICU tokenization and normalization rules + that are used if ICU support is compiled in. The 'id' + attribute is currently not used, and the 'locale' + attribute must be set to one of the locale strings + defined in ICU. The child elements listed below can be + in any order, except the 'index' element, which logically + belongs at the end of the list. The stated tokenization, + normalization and charmapping instructions are performed + in order from top to bottom. + + casemap + + + The attribute 'rule' defines the direction of the + per-character casemapping; allowed values are "l" + (lower), "u" (upper), "t" (title). + + + + normalize + + + Normalization and transformation of tokens follows + the rules defined in the 'rule' attribute. For + possible values we refer to the extensive ICU + documentation found at the + ICU + transformation home page. Set filtering + principles are explained at the + ICU set and + filtering page. + + + + tokenize + + + Tokenization is the only rule in the ICU chain + which splits one token into multiple tokens.
The + 'rule' attribute may have the following values: + "s" (sentence), "l" (line-break), "w" (word), and + "c" (character), the latter probably not being + very useful in a running pazpar2 installation. + + + + index + + + Finally, the 'index' element instruction - without + any 'rule' attribute - is used to store the tokens + after chain processing in the relevance ranking + unit of Pazpar2. It will always be the last + instruction in the chain. + + + + @@ -144,10 +195,13 @@ This is the name of the data element. It is matched - against the 'type' attribute of the 'metadata' element + against the 'type' attribute of the + 'metadata' element in the normalized record. A warning is produced if - metadata elements with an unknown name are found in the - normalized record. This name is also used to represent + metadata elements with an unknown name are + found in the + normalized record. This name is also used to + represent data elements in the records returned by the webservice API, and to name sort lists and browse facets. @@ -194,11 +248,13 @@ rank - Specifies that this element is to be used to help rank + Specifies that this element is to be used to + help rank records against the user's query (when ranking is requested). The value is an integer, used as a multiplier against the basic TF*IDF score. A value of - 1 is the base, higher values give additional weight to + 1 is the base, higher values give additional + weight to elements of this type. The default is '0', which excludes this element from the rank calculation. @@ -212,7 +268,8 @@ termlist, or browse facet. Values are tabulated from incoming records, and a highscore of values (with their associated frequency) is made available to the - client through the webservice API. The possible values + client through the webservice API. + The possible values are 'yes' and 'no' (default).
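Reviewer note on the icu_chain hunk above: the element descriptions may be easier to follow with a concrete chain in hand. The fragment below is only a sketch assembled from the attributes named in the added text ('id', 'locale', 'rule'). The id value "example", the locale "en", and the particular normalize rule are assumptions chosen for illustration, not values taken from this commit; the normalize rule uses ICU's filter-plus-transform ID syntax with the standard Any-Remove transform.

```xml
<!-- Illustrative sketch only: id, locale and rule values are assumed -->
<icu_chain id="example" locale="en">
  <!-- drop control characters before any tokenization takes place -->
  <normalize rule="[:Control:] Any-Remove"/>
  <!-- split the input into word tokens ("w" = word) -->
  <tokenize rule="w"/>
  <!-- lowercase every token ("l" = lower) -->
  <casemap rule="l"/>
  <!-- hand the resulting tokens to the relevance ranking unit; always last -->
  <index/>
</icu_chain>
```

Because the instructions run from top to bottom, moving 'casemap' above 'tokenize' would lowercase the input before it is split, while 'index' stays last in either ordering.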
@@ -254,9 +311,16 @@ - - - + + @@ -277,10 +341,10 @@ TARGET SETTINGS Pazpar2 features a cunning scheme by which you can associate various - kinds of attributes, or settings with search targets. This is done - through XML files; each file can associate one or more settings - with one or more targets. The file format is generic in nature, - designed to support a wide range of application requirements. The + kinds of attributes, or settings with search targets. This can be done + through XML files which are read at startup; each file can associate + one or more settings with one or more targets. The file format is generic + in nature, designed to support a wide range of application requirements. The settings can be purely technical things, like, how to perform a title search against a given target, or it can associate arbitrary name=value pairs with groups of targets -- for instance, if you would like to @@ -306,7 +370,73 @@ overridden, to allow use of pazpar2 in a consortial or multi-library environment, where different end-users may need to be represented to some search targets in different ways. This, again, can be managed - using an external database or other lookup mechanism. + using an external database or other lookup mechanism. Setting overrides + can be performed using either the 'init' or the 'settings' webservice + command (see XXX ref to pazpar2 protocol). + + + + In fact, every setting that applies to a database (except pz:id, which + can only be used for filtering targets to use for a search) can be overridden + on a per-session basis. This allows the client to override specific CCL fields + for searching, etc., to meet the needs of a session or user. + + + + Finally, as an extreme case of this, the webservice client can + introduce entirely new targets, on the fly, as part of the init or + settings command. This is useful if you desire to manage information + about your search targets in a separate application such as a database.
+ You do not need any static settings file whatsoever to run pazpar2 -- as + long as the webservice client is prepared to supply the necessary + information at the beginning of every session. + + + + NOTE: The following discussion of practical issues related to session and settings + management is cast in terms of a user interface based on Ajax/Javascript + technology. It would apply equally well to many other kinds of browser-based logic. + + + + Typically, a Javascript client is not allowed to directly alter the parameters + of a session. There are two reasons for this. One has to do with access + to information; typically, information about a user will be stored in a + system on the server side, or it will be accessible in some way from the server. + The other has to do with trust; since the Javascript client cannot be entirely trusted (some hostile + agent might in fact 'pretend' to be a regular ws client), it is more robust + to control session settings from scripting that you run as part of your + webserver. Typically, this can be handled during the session initialization, + as follows: + + + + Step 1: The Javascript client loads, and asks the webserver for a new pazpar2 + session ID. This can be done using a Javascript call, for instance. Note that + it is possible to submit Ajax XMLHttpRequest calls either to pazpar2 or to the + webserver that pazpar2 is proxying for. See (XXX Insert link to pazpar2 protocol). + + + + Step 2: Code on the webserver authenticates the user, by database lookup, + LDAP access, NCIP, etc., and determines which resources the user has access to, + and any user-specific parameters that are to be applied during this session. + + + + Step 3: The webserver initializes a new pazpar2 session, and sets user-specific + parameters as necessary, using the init webservice command. A new session ID is + returned. + + + + Step 4: The webserver returns this session ID to the Javascript client, which then + uses the session ID to submit searches, show results, etc.
+ + + + Step 5: When the Javascript client ceases to use the session, pazpar2 destroys + any session-specific information. SETTINGS FILE FORMAT @@ -407,7 +537,7 @@ - + @@ -502,7 +632,7 @@ The element set name to be used when retrieving records from a - server. + server (not yet implemented). @@ -566,7 +696,7 @@ Controls the maximum number of records to be retrieved from a - server. The default is 100. + server. The default is 100 (not yet implemented). @@ -581,6 +711,16 @@ + + pz:zproxy + + + The 'pz:zproxy' setting has the value syntax + 'host.internet.address:port'; it is used to tunnel Z39.50 + requests through the named Z39.50 proxy. + + +