X-Git-Url: http://jsfdemo.indexdata.com/?a=blobdiff_plain;f=doc%2Fadministration.xml;h=eee315e1de6912fe8e955d22d33395c59c0bfbea;hb=a0e0e6201b34e05f715a65d19a9b5667b734a58f;hp=b9e88fe3abf46c9ed62ed372bdeaaf453f630c4f;hpb=dbfe7852bd2cc333d4ab7582bbffbcc7c05fb091;p=idzebra-moved-to-github.git

diff --git a/doc/administration.xml b/doc/administration.xml
index b9e88fe..eee315e 100644
--- a/doc/administration.xml
+++ b/doc/administration.xml
@@ -1,9 +1,9 @@
- + Administrating Zebra
@@ -669,7 +669,7 @@
- (see + (see for details of how the mapping between elements of your records and searchable attributes is established).
@@ -843,7 +843,6 @@
register: /d1:500M - shadow: /scratch1:100M /scratch2:200M
@@ -921,8 +920,112 @@
+
+ Static and Dynamic Ranking
+
+ Zebra internally uses inverted indexes to look up term occurrences
+ in documents. Multiple queries from different indexes can be
+ combined by the binary boolean operations AND, OR, and NOT (which
+ is in fact a binary AND NOT operation).
+ To ensure fast query execution, all indexes have to be sorted in
+ the same order.
+
+ The indexes are normally sorted by document ID in ascending order,
+ and any query that does not invoke a special re-ranking function
+ will therefore retrieve the result set in document ID order.
+
+ If one defines the
+
+   staticrank: 1
+
+ directive in the main core Zebra config file, the internal document
+ keys used for ordering are augmented by a preceding integer, which
+ contains the static rank of a given document, and the index lists
+ are ordered first by ascending static rank, then by ascending
+ document ID.
+
+ This implies that the default rank 0 is the best rank at the
+ beginning of the list, and max int is the worst static rank.
+
+ The experimental alvis filter provides a directive to fetch static
+ rank information out of the indexed XML records, thus making all
+ hit sets ordered by ascending static rank and, for those documents
+ which have the same static rank, by ascending document ID.
+ See for the gory details.
+
+ If one wants to adjust the static rank order, one has to invoke
+ additional re-ranking/re-ordering using dynamic reranking or score
+ functions. These functions return positive integer scores, where
+ the highest score is best, which means that the hit sets will be
+ sorted by descending score (in contrast to the index lists, which
+ are sorted by ascending rank number and document ID).
+
+ These functions are enabled in the Zebra config file by a directive
+ like one of the following (use only one of these at a time!):
+
+   rank: rank-1        # default
+   rank: rank-static   # dummy
+   rank: zvrank        # TF-IDF like
+
+ Notice that rank-1 and zvrank do not use the static rank
+ information in the list keys, and will produce the same ordering
+ with or without static ranking enabled.
+
+ The dummy rank-static reranking/scoring function returns just
+
+   score = max int - staticrank
+
+ in order to preserve the ordering of hit sets with and without its
+ call.
+ Obviously, to combine static and dynamic ranking usefully, one
+ wants to make a new ranking function, which is left as an exercise
+ for the reader.
+
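
The ordering rules described in the added section can be illustrated with a small, self-contained Python sketch. It is not Zebra code: the function names, the sample documents, and the choice of a signed 32-bit value for "max int" are all assumptions made for illustration, since the text does not state the integer width. The sketch shows index-list order with and without `staticrank: 1`, and why a score of `max int - staticrank`, sorted in descending order, leaves that order unchanged.

```python
from typing import List, Tuple

# Assumption: "max int" is modelled as a signed 32-bit maximum; the source
# text does not say which integer width is actually used.
MAX_INT = 2**31 - 1

# Each document is a (doc_id, static_rank) pair (hypothetical sample layout).
Doc = Tuple[int, int]


def index_order(docs: List[Doc], staticrank: bool) -> List[int]:
    """Return document IDs in the order of the index lists."""
    if staticrank:
        # Keys are (static rank, document ID), both ascending.
        ordered = sorted(docs, key=lambda d: (d[1], d[0]))
    else:
        # Without static ranking, lists are ordered by ascending document ID.
        ordered = sorted(docs, key=lambda d: d[0])
    return [doc_id for doc_id, _ in ordered]


def rank_static_score(static_rank: int) -> int:
    """Score as described for rank-static: higher score is better."""
    return MAX_INT - static_rank


def rerank_static(docs: List[Doc]) -> List[int]:
    """Sort a hit set by descending rank-static score.

    The input is first put in index order; Python's sort is stable, so
    documents with equal scores keep their ascending document ID order.
    """
    in_index_order = sorted(docs, key=lambda d: (d[1], d[0]))
    by_score = sorted(in_index_order, key=lambda d: -rank_static_score(d[1]))
    return [doc_id for doc_id, _ in by_score]


if __name__ == "__main__":
    sample = [(1, 7), (2, 0), (3, 7), (4, 2)]
    print(index_order(sample, staticrank=False))
    print(index_order(sample, staticrank=True))
    print(rerank_static(sample))
```

In this sketch the descending-score sort reproduces the static-rank index order exactly, which matches the description of rank-static as a dummy function. A ranking function that combines static and dynamic information would replace `rank_static_score` with something that also depends on the query.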