X-Git-Url: http://jsfdemo.indexdata.com/?a=blobdiff_plain;f=heikki%2FREADME-HEIKKI;h=0b547a1f93c0beaaaeafb416294f8c12a55bccb2;hb=refs%2Fheads%2Franking-h;hp=a6ca02d6e56851df64a805eccb53462ea5dd643f;hpb=b6b190610799a920163200fd5920406adcc3f6c0;p=pazpar2-moved-to-github.git

diff --git a/heikki/README-HEIKKI b/heikki/README-HEIKKI
index a6ca02d..0b547a1 100644
--- a/heikki/README-HEIKKI
+++ b/heikki/README-HEIKKI
@@ -38,3 +38,52 @@ Next: See if I can implement a round robin.
    - keep an array of structs with the pointer, and locate the client number that way
  - robin-score = pos * n_clients + client_num
 
+relevance_new_rec is called every time a new record pops up, followed by one or more
+calls to count_word and exactly one to done_rec. done_rec is where I can compare to the
+ranking of the previous record. struct_relevance is one structure I have for myself,
+global (for the user session), so I can keep my stuff there, possibly an array per target.
+
+I should also add stuff directly to the client, and to the record, as I need.
+
+Next: Plot the tf/idf scores against round-robin sorted order. Will be messy,
+but later when we get a target that returns sorted records, it will make sense.
+
+
+Wed 27-Nov
+Setting up multiple SOLR targets in the same pazpar2
+ - Add #999 to the z-urls, so pazpar2 won't merge them. Different number for each
+
+This URL shows the databases, with their numbers
+http://lui.indexdata.com/solr/select?q=database:*&facet=true&facet.method=fc&facet.field=author_exact&facet.field=subject_exact&facet.field=date&facet.field=medium_exact&facet.field=database&rows=0&facet.mincount=1
+
+Add this to the target defs
+<set name="pz:extra_args" value="fq=database:4902"/>
+
+After this, it should be possible to get records from different databases, some
+with many records, some with a few. This is a good testing ground for merging
+rankings! Test first with a round-robin, and plot the scores.
+
+Thu 28-Nov
+Ok, I can now merge a number of SOLR databases (harvest jobs), and plot their rankings
+as SOLR gives them, in the order produced by different merge strategies.
+Next: Add the normalizing merge strategy. Then plot different strategies against different queries.
+Write a conclusion, and consider this plotting job done.
+
+
+Fri 13-Dec-2013
+Adam is adding a float type to pazpar2. I have made a proof of concept of the normalizing
+by curve fitting. I think it is time to close this branch, and start (re)implementing
+things in the main branch. Keep the old branch around for reference!
+
+Need new config options:
+ - sort: native, native + position
+ - or per target: native score / fake score from position / use tf/idf
+ - per target: weight for combining rankings (cluster merge), so we can trust one
+   target more than others
+ - per target: boost rankings
+
+Start coding:
+ - in relevance-prepare-read, go through records, collect scores in arrays (per target),
+ - fit the curve, normalize the scores.
+ - cluster scoring
+
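Reviewer sketch (not part of the patch): the round-robin key and the "collect scores per target, fit the curve, normalize" steps from the notes could look roughly like the Python below. The notes do not say which curve is fitted, so a straight least-squares line of score against position is assumed here purely for illustration. All function names and the (client_num, pos, score) tuple layout are hypothetical and do not come from pazpar2.

```python
from collections import defaultdict


def robin_key(pos, client_num, n_clients):
    """Round-robin sort key as written in the notes: pos * n_clients + client_num."""
    return pos * n_clients + client_num


def fit_line(scores):
    """Least-squares fit of score ~ a + b * pos for pos = 0..n-1 (assumed curve)."""
    n = len(scores)
    if n < 2:
        # Not enough points to fit a slope; treat the single score as the intercept.
        return (scores[0] if scores else 0.0), 0.0
    mean_x = (n - 1) / 2.0
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(scores))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def normalize_per_target(records):
    """Scale each target's scores so its fitted score at position 0 becomes 1.0.

    records: iterable of (client_num, pos, native_score). Positions are assumed
    to be contiguous per target; they are re-derived from the sort order.
    """
    by_target = defaultdict(list)
    for client_num, _pos, score in sorted(records, key=lambda r: (r[0], r[1])):
        by_target[client_num].append(score)

    normalized = []
    for client_num, scores in by_target.items():
        intercept, _slope = fit_line(scores)
        scale = intercept if intercept > 0 else 1.0  # guard against a useless fit
        for pos, score in enumerate(scores):
            normalized.append((client_num, pos, score / scale))
    return normalized


if __name__ == "__main__":
    recs = [(0, 0, 10.0), (0, 1, 8.0), (1, 0, 100.0), (1, 1, 50.0)]
    n_clients = 2
    # Round-robin order: interleave targets by position.
    for rec in sorted(recs, key=lambda r: robin_key(r[1], r[0], n_clients)):
        print("robin", rec)
    # Normalized scores, best first, comparable across targets.
    for rec in sorted(normalize_per_target(recs), key=lambda r: -r[2]):
        print("normalized", rec)
```

Dividing by the fitted intercept rather than by the raw top score is one possible choice: it keeps a single outlier at position 0 from setting the scale for the whole target. Per-target weights or boosts, as listed under the config options, could be applied as an extra multiplier after this step.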