Adam Dickmeiss [Wed, 22 Aug 2007 08:01:32 +0000 (08:01 +0000)]
Fixed bug in snippet support where first char was missed
Adam Dickmeiss [Tue, 21 Aug 2007 13:27:04 +0000 (13:27 +0000)]
Scan now returns displayTerm, which is extracted from the original record.
Goodbye to @'s - for scan. Bug #1411.
Adam Dickmeiss [Tue, 21 Aug 2007 11:06:46 +0000 (11:06 +0000)]
Generic snippet support. Unlike previous versions of snippet
implementations for Zebra this is not tied to a specific filter. The
snippet(s) are returned as an XML record with one or more snippets
in it - for special element set name zebra::snippet.
Adam Dickmeiss [Tue, 21 Aug 2007 07:49:18 +0000 (07:49 +0000)]
Removed snippet code from alvis filter
Adam Dickmeiss [Mon, 13 Aug 2007 08:53:42 +0000 (08:53 +0000)]
Fixed magic NS for DOM XML indexing.
Adam Dickmeiss [Tue, 10 Jul 2007 09:43:33 +0000 (09:43 +0000)]
Removed / in doc install rule to avoid double-slash (cygwin)
Adam Dickmeiss [Tue, 10 Jul 2007 09:41:12 +0000 (09:41 +0000)]
Removed / in doc install rule to avoid double-slash (cygwin)
Adam Dickmeiss [Wed, 27 Jun 2007 22:17:20 +0000 (22:17 +0000)]
For data-1, do not chop text data in ISO2709 creation. The problem is
that in some cases the chop operation will remove essential content.
However, chop is needed in cases where input is XML/SGML. Therefore,
this operation is now performed in the data-1 map code instead, and chop is
enabled by default. The chop can be disabled with the 'nochop' parameter
in a map rule, e.g.
map title /(3,245)/(3,a) nochop
Adam Dickmeiss [Wed, 27 Jun 2007 22:04:45 +0000 (22:04 +0000)]
Added data1_chop_text which removes whitespace in cdata nodes
Adam Dickmeiss [Wed, 27 Jun 2007 13:46:43 +0000 (13:46 +0000)]
Remove comment about XML only being supported in record update
Adam Dickmeiss [Tue, 26 Jun 2007 21:01:33 +0000 (21:01 +0000)]
Fixed typo
Marc Cromme [Fri, 22 Jun 2007 12:59:23 +0000 (12:59 +0000)]
fixed typo
Adam Dickmeiss [Tue, 19 Jun 2007 19:42:17 +0000 (19:42 +0000)]
Fix bug number in NEWS
Adam Dickmeiss [Tue, 19 Jun 2007 19:39:54 +0000 (19:39 +0000)]
mod.dom now also sets additional info when it returns the diagnostic
'Specified element set name not valid for specified database'.
Adam Dickmeiss [Sun, 17 Jun 2007 07:06:03 +0000 (07:06 +0000)]
Removed definition of docdir. It is set by automake already
Adam Dickmeiss [Fri, 25 May 2007 14:05:52 +0000 (14:05 +0000)]
Flush the iconv sequence for each sequence in a .chr file. This fixes
nothing because the ICONV handles in use always flushed themselves. They
may not always do that, so we must do it.
Adam Dickmeiss [Fri, 25 May 2007 13:46:01 +0000 (13:46 +0000)]
Cosmetic
Adam Dickmeiss [Fri, 25 May 2007 12:17:11 +0000 (12:17 +0000)]
Fixed bug #1142: Non-indexed but listed attributes issue diagnostic.
We keep the existing behavior by default and continue to issue a
diagnostic. Typically there are many attributes given in .att-files which
are never used in a Zebra installation. If they issued 0 hits, then
most Zebra servers would basically lie about their capabilities. It
would also confuse a lot of users. (Dead programs tell no lies.)
But it is certainly useful to be able to say "allow unknown use
attribute" in controlled environments and in multi-server systems
(where attributes may not be indexed in all places). Zebra now allows
attribute 14 to control this. 14=0 makes Zebra work as usual (throw
a diagnostic). 14=1 makes Zebra produce 0 hits (for the leaf/APT).
Adam Dickmeiss [Thu, 24 May 2007 13:44:09 +0000 (13:44 +0000)]
Using acro. entities. Replaced some it's with its (where appropriate).
Adam Dickmeiss [Tue, 22 May 2007 11:12:53 +0000 (11:12 +0000)]
Use entity idcommon rather than common
Adam Dickmeiss [Mon, 21 May 2007 11:54:59 +0000 (11:54 +0000)]
Zebra returns character encoding as part of init response even if
client does not suggest one.
Adam Dickmeiss [Mon, 21 May 2007 11:53:49 +0000 (11:53 +0000)]
Simplify data1 character set conversion
Adam Dickmeiss [Mon, 21 May 2007 08:23:32 +0000 (08:23 +0000)]
More news
Adam Dickmeiss [Mon, 21 May 2007 07:15:06 +0000 (07:15 +0000)]
Fixed bug #1132: tstcharmap fails on flurry (amd64). Problem was missing
include of stdlib.h.
Adam Dickmeiss [Sat, 19 May 2007 19:44:14 +0000 (19:44 +0000)]
Fixed bug #1131: Missing value-of data in DOM filter. The problem was
that the internally generated MARCXML did not have the namespace
"http://www.loc.gov/MARC21/slim" declared.
Adam Dickmeiss [Wed, 16 May 2007 12:31:17 +0000 (12:31 +0000)]
Use YAZ_CHECK_LOG consistently.
Adam Dickmeiss [Wed, 16 May 2007 10:58:19 +0000 (10:58 +0000)]
Ignore tstres
Adam Dickmeiss [Wed, 16 May 2007 10:57:05 +0000 (10:57 +0000)]
Fixed bug #1049: zebra.cfg lines with leading space are ignored. Res
system now uses YAZ' tokenizer. Added test for this (tstres.c).
Adam Dickmeiss [Wed, 16 May 2007 10:50:03 +0000 (10:50 +0000)]
Bump to 2.0.15. Require YAZ 3.0.3
Adam Dickmeiss [Wed, 16 May 2007 08:57:26 +0000 (08:57 +0000)]
Indentation. Removed a few unused functions.
Adam Dickmeiss [Wed, 16 May 2007 08:46:19 +0000 (08:46 +0000)]
Fixed bug #1128: sortmax not honored.
Adam Dickmeiss [Mon, 14 May 2007 14:05:21 +0000 (14:05 +0000)]
Factor common work in term_100, ..term_105 into private function
add_non_space.
Adam Dickmeiss [Mon, 14 May 2007 13:21:32 +0000 (13:21 +0000)]
Added string relation tests.
Adam Dickmeiss [Mon, 14 May 2007 12:33:33 +0000 (12:33 +0000)]
Fixed bug #1121: Crash for some searches with customized string.chr.
This involves longish but trivial use of WRBUF instead of a static buffer -
the problem is the heavy use of equivalent directives in the customized
string.chr.
Adam Dickmeiss [Wed, 9 May 2007 07:42:46 +0000 (07:42 +0000)]
Fix dup YAZ_EXPORT
Adam Dickmeiss [Wed, 9 May 2007 07:36:06 +0000 (07:36 +0000)]
Towards 2.0.14.
Adam Dickmeiss [Wed, 9 May 2007 07:30:53 +0000 (07:30 +0000)]
Notes on version update.
Adam Dickmeiss [Wed, 9 May 2007 07:30:43 +0000 (07:30 +0000)]
m4 quoting
Adam Dickmeiss [Wed, 9 May 2007 07:07:18 +0000 (07:07 +0000)]
Fixed bug #1114: scan within set may use excessive CPU.
Adam Dickmeiss [Tue, 8 May 2007 14:55:30 +0000 (14:55 +0000)]
Use YAZ 3 lib,dll
Adam Dickmeiss [Tue, 8 May 2007 14:49:38 +0000 (14:49 +0000)]
Issue diagnostic for scan if 'set' in @attr 8=set does not exist.
Fixed memory leak in rpn_scan (unreleased WRBUF for each term).
Adam Dickmeiss [Tue, 8 May 2007 14:27:23 +0000 (14:27 +0000)]
Display match string if log level "extract" is used.
Adam Dickmeiss [Tue, 8 May 2007 12:50:03 +0000 (12:50 +0000)]
Use Odr_oid for OIDs. Require YAZ 3.0.2 or later.
Adam Dickmeiss [Thu, 3 May 2007 07:20:19 +0000 (07:20 +0000)]
Require YAZ 3
Adam Dickmeiss [Wed, 25 Apr 2007 09:38:21 +0000 (09:38 +0000)]
Allow safari filter to specify index type.
Adam Dickmeiss [Wed, 25 Apr 2007 08:22:01 +0000 (08:22 +0000)]
log optimized on level 'extract'; details in 'details'
Adam Dickmeiss [Wed, 25 Apr 2007 08:18:01 +0000 (08:18 +0000)]
Return proper EOF for safari filter
Adam Dickmeiss [Wed, 18 Apr 2007 11:37:39 +0000 (11:37 +0000)]
Don't use nmem_init, nmem_exit
Adam Dickmeiss [Tue, 17 Apr 2007 20:27:14 +0000 (20:27 +0000)]
Update for YAZ 3's libyaz_server.la
Adam Dickmeiss [Mon, 16 Apr 2007 21:54:37 +0000 (21:54 +0000)]
Another, and hopefully last, YAZ OID DB update
Adam Dickmeiss [Mon, 16 Apr 2007 08:44:31 +0000 (08:44 +0000)]
Update for YAZ 3's new OID system
Adam Dickmeiss [Sat, 7 Apr 2007 22:26:27 +0000 (22:26 +0000)]
Changed extract code so that it optimizes updates of records where content
is almost identical to previous version of record. This makes updating of
the internal explain database faster too. Also fixed memory leak that
occurred for each deleted record.
Adam Dickmeiss [Sat, 7 Apr 2007 22:24:12 +0000 (22:24 +0000)]
Fixed bad memory reference that could occur if empty key block was to
be sorted.
Adam Dickmeiss [Sat, 7 Apr 2007 22:18:46 +0000 (22:18 +0000)]
Remove leading blank line
Adam Dickmeiss [Tue, 3 Apr 2007 16:54:46 +0000 (16:54 +0000)]
Fixed bug #1017: assert failure in isamb for delete of records. Problem
was that root ptr of sort ISAMB was not properly flushed to disk when it
changed.
Adam Dickmeiss [Tue, 3 Apr 2007 15:26:14 +0000 (15:26 +0000)]
Make directory config if it is not there
Adam Dickmeiss [Mon, 2 Apr 2007 16:57:08 +0000 (16:57 +0000)]
Removed a few YLOG_LOG messages. This could be enough to fix bug #1012.
Adam Dickmeiss [Wed, 21 Mar 2007 19:37:15 +0000 (19:37 +0000)]
Update with latest changes.
Adam Dickmeiss [Wed, 21 Mar 2007 19:37:00 +0000 (19:37 +0000)]
Describe the @type action for DOM filter
Adam Dickmeiss [Wed, 21 Mar 2007 19:36:47 +0000 (19:36 +0000)]
Minor change in link to CQL material in YAZ
Adam Dickmeiss [Wed, 21 Mar 2007 13:47:12 +0000 (13:47 +0000)]
For RPN queries the index type (w,p,..) may be specified verbatim
as structure attribute with string value, e.g. @attr 4=w .
Adam Dickmeiss [Tue, 20 Mar 2007 22:42:19 +0000 (22:42 +0000)]
ChangeLog in dist
Adam Dickmeiss [Tue, 20 Mar 2007 22:07:35 +0000 (22:07 +0000)]
Use yaz_iconv flushing.
Adam Dickmeiss [Tue, 20 Mar 2007 22:07:21 +0000 (22:07 +0000)]
Remove debug msg
Adam Dickmeiss [Mon, 19 Mar 2007 21:57:25 +0000 (21:57 +0000)]
Use non-const char return value for strtok work
Adam Dickmeiss [Mon, 19 Mar 2007 21:50:39 +0000 (21:50 +0000)]
WRBUF updates.
Adam Dickmeiss [Wed, 14 Mar 2007 14:16:14 +0000 (14:16 +0000)]
Changed some types in mod_dom.c; mostly 'xmlChar *' to 'const char *'.
The use of const is more appropriate than non-const because these
string references point to xmlNode content - and we are not allowed
to change that. Added buffer-safe PI attribute reading for mod_dom.c by
implementing function attr_content_pi. Function index_value_of still has
potential buffer overflows. The record extraction system now has a new member,
action, which may be modified by a record filter to signal
delete/replace/insert. This is only honoured if update is used (in which
case the outer system has already said "we don't care whether it's insert
or replace" anyway). Added mod_dom test for the use of @type=delete.
Adam Dickmeiss [Wed, 14 Mar 2007 11:48:31 +0000 (11:48 +0000)]
Changed record update API. It is now handled by function
zebra_record_update, which does insert/replace/delete/update of
records. This function replaces zebra_record_{insert,delete} and
zebra_admin_exchange_record.
Adam Dickmeiss [Tue, 13 Mar 2007 13:46:11 +0000 (13:46 +0000)]
Fixed bug #944: Allow extraction of multiple records per ES update.
Based on patch from Hans-Werner Hilse.
Adam Dickmeiss [Thu, 8 Mar 2007 21:07:45 +0000 (21:07 +0000)]
Debian package 2.0.13-1
Marc Cromme [Thu, 8 Mar 2007 17:19:12 +0000 (17:19 +0000)]
changed <dom> and <input> parser such that the following conditions actually work:
1) no <input> element at all
2) empty <input> element
3) <input> element starting with an <xslt> instruction (that is, <xmlreader> and/or <marc> are not mandatory any more).
Needed to make new define DOM_INPUT_DOM besides DOM_INPUT_MARC and DOM_INPUT_XMLREADER
Still missing detection of <xmlreader> or <marc> after all <xslt> nodes.
And more important: when finding errors here, it's kind of lame just to emit a warning, one should stop processing!
Adam Dickmeiss [Thu, 8 Mar 2007 13:18:35 +0000 (13:18 +0000)]
For MARC indexing, skip until record separator is met.
Adam Dickmeiss [Thu, 8 Mar 2007 12:57:35 +0000 (12:57 +0000)]
Bump to 2.0.13
Marc Cromme [Thu, 8 Mar 2007 11:29:16 +0000 (11:29 +0000)]
corrected typo
Marc Cromme [Thu, 8 Mar 2007 11:24:50 +0000 (11:24 +0000)]
added example of MARCXML indexing with chopping of sort indexes according to 'ind2' field containing integer
Adam Dickmeiss [Wed, 7 Mar 2007 21:25:29 +0000 (21:25 +0000)]
Added mod_dom to win32 makefile
Adam Dickmeiss [Wed, 7 Mar 2007 21:14:15 +0000 (21:14 +0000)]
Towards 2.0.12
Adam Dickmeiss [Wed, 7 Mar 2007 21:08:36 +0000 (21:08 +0000)]
Fixed bug with indexing of attributes for rec.grs-class of filters. If
xpath was enabled xelm a/@b would be ignored.
Marc Cromme [Wed, 7 Mar 2007 14:18:35 +0000 (14:18 +0000)]
Always add the XML parsing flag XML_PARSE_NONET to any XML_PARSE_XINCLUDE to avoid Zebra being spoofed into fetching megabytes from an external xincluded URL. A pretty normal safety measure; we just forgot it before.
Marc Cromme [Wed, 7 Mar 2007 13:05:20 +0000 (13:05 +0000)]
removed documentation of non-working 'insert', 'update', 'delete' functionality in Alvis filter
removed 'update' instruction from example OAI indexing stylesheet
Adam Dickmeiss [Tue, 6 Mar 2007 12:40:18 +0000 (12:40 +0000)]
Fixed bug #931: elem 'zebra::index::field' hangs if 'storeKeys: 1' is not specified in zebra.cfg.
Adam Dickmeiss [Tue, 6 Mar 2007 12:21:04 +0000 (12:21 +0000)]
Fixed bug #943: Searches with localid always find a hit.
Adam Dickmeiss [Tue, 6 Mar 2007 12:09:44 +0000 (12:09 +0000)]
Avoid mixed stmt/var declare
Marc Cromme [Tue, 6 Mar 2007 09:24:34 +0000 (09:24 +0000)]
added missing extra dist target
Adam Dickmeiss [Tue, 6 Mar 2007 08:48:57 +0000 (08:48 +0000)]
Fixed bug #946: Coredump on MARC display.
Adam Dickmeiss [Tue, 6 Mar 2007 08:23:24 +0000 (08:23 +0000)]
Added missing xsl for dom1 test.
Marc Cromme [Mon, 5 Mar 2007 13:02:11 +0000 (13:02 +0000)]
added tests for bug #883 'Need an 'ignore' value for the z:type
attribute in the canonical indexing format'
resolved bug #883
tested as well on gutenberg collection
zebra-setup/gutenberg
case closed, see
http://bugzilla.indexdata.dk/show_bug.cgi?id=883
Adam Dickmeiss [Sat, 3 Mar 2007 21:39:10 +0000 (21:39 +0000)]
Fixes for perform_convert: use xmlParseMemory instead of xmlParseMemory
to avoid reading beyond end of buffer. Ensure conversions are stopped
if an XSLT conversion fails.
Marc Cromme [Thu, 1 Mar 2007 11:21:20 +0000 (11:21 +0000)]
removed section on special record retrieval features, which need a rewrite - only commented out.
added section on debugging of DOM filter configurations
added a bullet point on semantics of DOM filter explaining that records that do not yield record and index instructions are discarded, i.e. dropped on the floor. This meets Seb's wishes for the gutenberg collection
Marc Cromme [Thu, 1 Mar 2007 11:18:40 +0000 (11:18 +0000)]
removed quick start and examples, which are very GRS-1 centric.
These need re-writing in terms of the DOM filter
Adam Dickmeiss [Thu, 1 Mar 2007 10:35:46 +0000 (10:35 +0000)]
Allow record filters to return 'skip' this record (RECCTRL_EXTRACT_SKIP).
Make dom filter return 'skip' if no zebra 'record' node exists in
indexing document. Bug #883.
Adam Dickmeiss [Wed, 28 Feb 2007 18:43:06 +0000 (18:43 +0000)]
Fix handling of record retrieval in the case of open failure of external
record file (storedata:0).
Marc Cromme [Wed, 28 Feb 2007 16:46:19 +0000 (16:46 +0000)]
added nice debug output of all xmlreader and xslt XML stuff when running with
zebra/index/zebraidx -c zebra.cfg -s update water.rdf
Don't do this on huge data - the logs will be at least 4-6 times the size of the input data!
Marc Cromme [Wed, 28 Feb 2007 14:46:41 +0000 (14:46 +0000)]
closing bug #928 by dropping DOM document to xmlbuffer and re-reading into DOM each time an XSLT transform occurs. Yes, ugly, ugly, but no other possibility.
Added output of XML after each transformation on YLOG_DEBUG level, run indexer with '-v debug' to see all transformations
Marc Cromme [Wed, 28 Feb 2007 13:16:24 +0000 (13:16 +0000)]
removed general warning log of indexing process. this can be seen by running the indexer with '-v debug' anyhow.
Adam Dickmeiss [Mon, 26 Feb 2007 16:12:24 +0000 (16:12 +0000)]
Avoid sprintf with NULL %s value (Solaris dislikes it)
Adam Dickmeiss [Sat, 24 Feb 2007 17:05:40 +0000 (17:05 +0000)]
Fixed bug #929: Unfinished transaction in non-shadow does not get a
warn.
Adam Dickmeiss [Sat, 24 Feb 2007 16:47:16 +0000 (16:47 +0000)]
Deal with two common places for corrupt Explain database
Adam Dickmeiss [Sat, 24 Feb 2007 16:46:22 +0000 (16:46 +0000)]
Proper cleanup (isamb_close) for bad headers
Adam Dickmeiss [Fri, 23 Feb 2007 14:59:12 +0000 (14:59 +0000)]
Use xmlGetLineNo instead of xmlGetNodePath for errors/warnings