java.lang.Object org.marc4j.MarcPermissiveStreamReader
public class MarcPermissiveStreamReader
An iterator over a collection of MARC records in ISO 2709 format, designed to handle MARC records that have errors in their structure or their encoding. If the permissive flag is set in the call to the constructor, or if an ErrorHandler object is passed in as a parameter to the constructor, this reader will do its best to detect and recover from a number of structural or encoding errors that can occur in a MARC record. If this reader is not set to read permissively, it will operate almost identically to the MarcStreamReader class.
No attempt is made to validate the contents of the record at a semantic level. This reader does not know and does not care whether the record has a 245 field, or whether the 008 field is the right length. However, if the record claims to be UTF-8 or MARC8 encoded and you are seeing gibberish in the output, or if the reader throws an exception while trying to read a record, then this reader may be able to produce a usable record from the bad data you have.
The ability to translate the record to UTF-8 as it is being read is useful when the UTF-8 version of the record will be used directly by the program that is reading the MARC data, for instance if the MARC records are to be indexed into a Solr search engine. Previously the MARC record could only be translated to UTF-8 as it was being written out via a MarcStreamWriter or a MarcXmlWriter.
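To make "structural error" concrete, the following is a minimal sketch of one such check: comparing the record length declared in leader positions 0-4 of an ISO 2709 record with the position of the record terminator (0x1D) actually found in the data. The class and method names (RecordLengthCheck, declaredLength, actualLength, lengthMatches, sampleRecord) are hypothetical and are not part of marc4j; this is not the reader's actual recovery logic, only an illustration of the kind of inconsistency a permissive reader has to cope with.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative structural check on an ISO 2709 record; not marc4j code. */
class RecordLengthCheck {

    static final int LEADER_LENGTH = 24;          // an ISO 2709 leader is 24 bytes
    static final byte FIELD_TERMINATOR = 0x1E;
    static final byte RECORD_TERMINATOR = 0x1D;

    /** Parses leader positions 0-4; returns -1 if they are not five ASCII digits. */
    static int declaredLength(byte[] record) {
        if (record.length < LEADER_LENGTH) {
            return -1;
        }
        int length = 0;
        for (int i = 0; i < 5; i++) {
            int digit = record[i] - '0';
            if (digit < 0 || digit > 9) {
                return -1;
            }
            length = length * 10 + digit;
        }
        return length;
    }

    /** Returns the index just past the first record terminator, or -1 if none is found. */
    static int actualLength(byte[] record) {
        for (int i = 0; i < record.length; i++) {
            if (record[i] == RECORD_TERMINATOR) {
                return i + 1;
            }
        }
        return -1;
    }

    /** True when the declared length agrees with where the record really ends. */
    static boolean lengthMatches(byte[] record) {
        int declared = declaredLength(record);
        return declared != -1 && declared == actualLength(record);
    }

    /** Builds a tiny record: the given leader, an empty directory, and a record terminator. */
    static byte[] sampleRecord(String leader) {
        byte[] bytes = leader.getBytes(StandardCharsets.US_ASCII);
        byte[] record = Arrays.copyOf(bytes, bytes.length + 2);
        record[bytes.length] = FIELD_TERMINATOR;
        record[bytes.length + 1] = RECORD_TERMINATOR;
        return record;
    }

    public static void main(String[] args) {
        byte[] good = sampleRecord("00026nam a2200025 a 4500"); // declares 26 bytes
        byte[] bad = sampleRecord("00030nam a2200025 a 4500");  // declares 30 bytes
        System.out.println("good record consistent: " + lengthMatches(good));
        System.out.println("bad record consistent: " + lengthMatches(bad));
    }
}
```

A strict reader would typically fail on the second record, whereas a permissive one could fall back to the terminator position; how MarcPermissiveStreamReader actually recovers is not described here.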
Example usage:

    InputStream input = new FileInputStream("file.mrc");
    MarcReader reader = new MarcPermissiveStreamReader(input, true, true);
    while (reader.hasNext()) {
        Record record = reader.next();
        // Process record
    }

Check the org.marc4j.marc package for examples about the use of the Record object model.
See the file org.marc4j.samples.PermissiveReaderExample.java for an example of using the MarcPermissiveStreamReader in conjunction with the ErrorHandler class to report errors encountered while processing records.
When no encoding is given as a constructor argument, the parser tries to resolve the encoding by looking at the character coding scheme (leader position 9) in MARC21 records. For UNIMARC records this position is not defined. If the reader is operating in permissive mode and no encoding is given as a constructor argument, the reader looks at the leader and also at the data of the record to determine, to the best of its ability, which character encoding scheme was used to encode the data in a particular MARC record.
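The two-step idea described above (first the leader, then the data) can be sketched roughly as follows. In MARC21, leader position 9 is blank for MARC-8 and 'a' for UCS/Unicode. The class EncodingGuess, its method names, and the returned labels "UTF8", "MARC8" and "UNKNOWN" are assumptions made for this illustration; the data check shown here is a plain strict UTF-8 decode and should not be read as the heuristic marc4j actually applies.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Illustrative encoding guess for a MARC21 record; not marc4j code. */
class EncodingGuess {

    /** Reads the character coding scheme from leader position 9 (MARC21 only). */
    static String fromLeader(byte[] leader) {
        if (leader.length > 9 && leader[9] == 'a') {
            return "UTF8";   // 'a' = UCS/Unicode
        }
        return "MARC8";      // blank = MARC-8
    }

    /** Returns true if the bytes decode as UTF-8 without any malformed sequence. */
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /**
     * Combines both sources: trusts the leader unless it claims UTF-8
     * and the field data is not valid UTF-8 (assumed fallback: "UNKNOWN").
     */
    static String guess(byte[] leader, byte[] fieldData) {
        String declared = fromLeader(leader);
        if (declared.equals("UTF8") && !isValidUtf8(fieldData)) {
            return "UNKNOWN";
        }
        return declared;
    }

    public static void main(String[] args) {
        byte[] leader = "00026nam a2200025 a 4500".getBytes(StandardCharsets.US_ASCII);
        byte[] validData = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] brokenData = {'c', 'a', 'f', (byte) 0xE9}; // lone lead byte, not valid UTF-8
        System.out.println(guess(leader, validData));
        System.out.println(guess(leader, brokenData));
    }
}
```

Because leader position 9 is undefined for UNIMARC, a check like fromLeader only makes sense for MARC21 input; for other formats the encoding would have to be supplied explicitly.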
Constructor Summary | |
---|---|
MarcPermissiveStreamReader(InputStream input, boolean permissive, boolean convertToUTF8) | Constructs an instance with the specified input stream with possible additional functionality being enabled by setting permissive and/or convertToUTF8 to true. |
MarcPermissiveStreamReader(InputStream input, boolean permissive, boolean convertToUTF8, String defaultEncoding) | Constructs an instance with the specified input stream with possible additional functionality being enabled by setting permissive and/or convertToUTF8 to true. |
MarcPermissiveStreamReader(InputStream input, ErrorHandler errors, boolean convertToUTF8) | Constructs an instance with the specified input stream with possible additional functionality being enabled by passing in an ErrorHandler object and/or setting convertToUTF8 to true. |
MarcPermissiveStreamReader(InputStream input, ErrorHandler errors, boolean convertToUTF8, String defaultEncoding) | Constructs an instance with the specified input stream with possible additional functionality being enabled by passing in an ErrorHandler object and/or setting convertToUTF8 to true. |
Method Summary | |
---|---|
boolean | hasNext(): Returns true if the iteration has more records, false otherwise. |
Record | next(): Returns the next record in the iteration. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public MarcPermissiveStreamReader(InputStream input, boolean permissive, boolean convertToUTF8)
public MarcPermissiveStreamReader(InputStream input, ErrorHandler errors, boolean convertToUTF8)
public MarcPermissiveStreamReader(InputStream input, boolean permissive, boolean convertToUTF8, String defaultEncoding)
public MarcPermissiveStreamReader(InputStream input, ErrorHandler errors, boolean convertToUTF8, String defaultEncoding)
Method Detail |
---|
public boolean hasNext()
Specified by: hasNext in interface MarcReader
public Record next()
Specified by: next in interface MarcReader