X-Git-Url: http://jsfdemo.indexdata.com/?a=blobdiff_plain;f=doc%2Fadministration.xml;h=eee315e1de6912fe8e955d22d33395c59c0bfbea;hb=a0e0e6201b34e05f715a65d19a9b5667b734a58f;hp=28e480247e9ad15cd0f2baf2799ec8e60a0a3631;hpb=d2e692248eac6469ef7a3a3f8044010cb5cc1da7;p=idzebra-moved-to-github.git diff --git a/doc/administration.xml b/doc/administration.xml index 28e4802..eee315e 100644 --- a/doc/administration.xml +++ b/doc/administration.xml @@ -1,9 +1,9 @@ - + Administrating Zebra @@ -276,7 +276,7 @@ - profilePath: path + profilePath: path Specifies a path of profile specification files. @@ -305,6 +305,19 @@ Specifies size of internal memory to use for the zebraidx program. The amount is given in megabytes - default is 4 (4 MB). + The more memory, the faster large updates happen, up to about + half the free memory available on the computer. + + + + + tempfiles: Yes/Auto/No + + + Tells Zebra whether it should use temporary files when indexing. The + default is Auto, in which case Zebra uses temporary files only + if it would need more than memMax + megabytes of memory. This should be good for most uses. @@ -322,6 +335,62 @@ + + passwd: file + + + Specifies a file with description of user accounts for Zebra. + The format is similar to that of Apache's htpasswd files + and UNIX passwd files. Non-empty lines not beginning with + # are considered account lines. There is one account per line. + A line consists of fields separated by a single colon character. + The first field is the username, the second is the password. + + + + + + passwd.c: file + + + Specifies a file with description of user accounts for Zebra. + The file format is similar to that used by the passwd directive except + that the passwords are encrypted. Use Apache's htpasswd or similar + for maintenance. + + + + + + perm.user: + permstring + + + Specifies permissions (privileges) for a user who is allowed + to access Zebra via the passwd system. There are two kinds + of permissions currently: read (r) and write (w). 
By default + users not listed in a permission directive are given the read + privilege. To specify permissions for a user with no + username, or Z39.50 anonymous style, use + anonymous. The permstring consists of + a sequence of characters. Include character w + for write/update access, r for read access. + + + + + + dbaccess accessfile + + + Names a file which lists database subscriptions for individual users. + The access file should consist of lines of the form username: + dbnames, where dbnames is a list of database names, separated by + '+'. No whitespace is allowed in the database list. + + + + @@ -384,7 +453,7 @@ - profilePath: /usr/local/yaz + profilePath: /usr/local/idzebra/tab attset: bib1.att simple.recordType: text simple.database: textbase @@ -600,7 +669,7 @@ - (see + (see for details of how the mapping between elements of your records and searchable attributes is established). @@ -774,7 +843,6 @@ register: /d1:500M - shadow: /scratch1:100M /scratch2:200M @@ -852,8 +920,112 @@ + + + + Static and Dynamic Ranking + + + Zebra internally uses inverted indexes to look up term occurrences + in documents. Multiple queries from different indexes can be + combined by the binary boolean operations AND, + OR and/or NOT (which + is in fact a binary AND NOT operation). + To ensure fast query execution + speed, all indexes have to be sorted in the same order. + + + The indexes are normally sorted according to document + ID in + ascending order, and any query which does not invoke a special + re-ranking function will therefore retrieve the result set in + document + ID + order. + + + If one defines the + + staticrank: 1 + + directive in the main core Zebra config file, the internal document + keys used for ordering are augmented by a preceding integer, which + contains the static rank of a given document, and the index lists + are ordered + first by ascending static rank, + then by ascending document ID. 
+ + + This implies that the default rank 0 + is the best rank at the + beginning of the list, and max int + is the worst static rank. + + + The experimental alvis filter provides a + directive to fetch static rank information out of the indexed XML + records, thus making all hit sets ordered + by ascending static + rank, and for those docs which have the same static rank, ordered + by ascending doc ID. + See for the gory details. + + + If one wants to do a little fiddling with the static rank order, + one has to invoke additional re-ranking/re-ordering using dynamic + reranking or score functions. These functions return positive + integer scores, where the highest score is + best, which means that the + hit sets will be sorted according to + descending + scores (in contrast + to the index lists which are sorted according to + ascending rank number and document ID). + + + + These are enabled in the Zebra config file by a directive like (use + only one of these at a time!): + + rank: rank-1 # default + rank: rank-static # dummy + rank: zvrank # TF-IDF like + + Notice that the rank-1 and + zvrank do not use the static rank + information in the list keys, and will produce the same ordering + with or without static ranking enabled. + + + The dummy rank-static reranking/scoring + function returns just + score = max int - staticrank + in order to preserve the ordering of hit sets with and without its + call. + Obviously, to combine static and dynamic ranking usefully, one wants + to make a new ranking + function, which is left + as an exercise for the reader. + + + + +
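The relationship between the index-list order and the rank-static score described in the patch above can be illustrated with a small sketch. It is not Zebra code: the function names, the (doc_id, staticrank) tuple layout, and the use of sys.maxsize as a stand-in for "max int" are all assumptions made for illustration. It only shows why sorting by descending score = max int - staticrank reproduces the ascending static rank, ascending document ID order.

```python
import sys

# Stand-in for Zebra's "max int"; the actual constant used by Zebra is an assumption here.
MAX_INT = sys.maxsize


def index_order(docs):
    """Order (doc_id, staticrank) pairs the way the index lists are described:
    first by ascending static rank, then by ascending document ID."""
    return sorted(docs, key=lambda d: (d[1], d[0]))


def rank_static_score(staticrank):
    """Hypothetical model of the dummy scoring function: higher score is better."""
    return MAX_INT - staticrank


def rank_static_order(docs):
    """Sort a hit set by descending score. Python's sort is stable, so documents
    with equal scores keep their index-list order (ascending document ID)."""
    hits = index_order(docs)
    return sorted(hits, key=lambda d: rank_static_score(d[1]), reverse=True)


if __name__ == "__main__":
    docs = [(7, 2), (3, 0), (5, 2), (1, 1)]  # (doc_id, staticrank), made-up values
    print(index_order(docs))
    print(rank_static_order(docs))
```

Both calls print the same list, which is the property the patch attributes to rank-static: enabling it leaves the hit set order unchanged.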