Class AbstractParser
- java.lang.Object
  - com.logicaldoc.core.parser.AbstractParser
- All Implemented Interfaces: Parser
- Direct Known Subclasses: AbiWordParser, DummyParser, HTMLParser, KOfficeParser, OpenOfficeParser, PDFParser, PPTParser, PSParser, RTFParser, TXTParser, WordPerfectParser, XLSParser, XMLParser, ZABWParser, ZipParser
public abstract class AbstractParser extends Object implements Parser
Abstract implementation of a Parser.
- Since: 3.5
- Author: Marco Meschieri - LogicalDOC
Constructor Summary
Constructors
- AbstractParser()
Method Summary
Modifier and Type, Method, Description
- String parse(File file, String filename, String encoding, Locale locale, String tenant)
  Same as the overload that accepts an input stream; use this when you have a file rather than a stream.
- String parse(InputStream input, String filename, String encoding, Locale locale, String tenant)
  Extracts the text content of the given binary document.
Method Detail
parse
public String parse(File file, String filename, String encoding, Locale locale, String tenant)
Description copied from interface: Parser
Same as the overload that accepts an input stream; use this when you have a file rather than a stream.
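The relationship between the two overloads can be illustrated with a minimal sketch. This is not the LogicalDOC source: the classes `FileDelegatingSketch` and `PlainTextSketch` are hypothetical stand-ins, and the `throws IOException` clauses are an assumption made for brevity (the documented signatures declare no checked exceptions). It only shows one plausible way a file-based overload could open the file, delegate to the stream-based one, and close the stream afterwards.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical stand-in for a parser base class; not the real AbstractParser.
abstract class FileDelegatingSketch {

    // Stream-based overload, left to concrete subclasses.
    abstract String parse(InputStream input, String filename, String encoding,
                          Locale locale, String tenant) throws IOException;

    // File-based overload: opens the file and delegates to the stream overload.
    // try-with-resources closes the stream once extraction has finished.
    String parse(File file, String filename, String encoding,
                 Locale locale, String tenant) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
            return parse(in, filename, encoding, locale, tenant);
        }
    }
}

// Hypothetical plain-text subclass, used only to exercise the sketch.
class PlainTextSketch extends FileDelegatingSketch {
    @Override
    String parse(InputStream input, String filename, String encoding,
                 Locale locale, String tenant) throws IOException {
        // Fall back to UTF-8 when no encoding is supplied (an assumption).
        Charset cs = (encoding == null) ? StandardCharsets.UTF_8 : Charset.forName(encoding);
        return new String(input.readAllBytes(), cs);
    }
}
```

With this shape, a caller holding a `File` never has to manage the stream itself; a caller that already has a stream calls the other overload directly.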
-
parse
public String parse(InputStream input, String filename, String encoding, Locale locale, String tenant)
Description copied from interface: Parser
Extracts the text content of the given binary document. The content type and character encoding (if available and applicable) are given as arguments.
The implementation can choose either to read and parse the given document immediately or to return a reader that does it incrementally. The only constraint is that the implementation must close the given stream, at the latest, when the returned reader is closed. The caller, on the other hand, is responsible for closing the returned reader.
The implementation should only throw an exception on transient errors, i.e. when it can expect to be able to successfully extract the text content of the same binary at another time. An effort should be made to recover from syntax errors and other similar problems.
This method should be thread-safe, i.e. it may be invoked simultaneously by different threads to extract the text content of different documents. The returned reader, on the other hand, does not need to be thread-safe.
Parsing must complete within the number of seconds specified by the parser.timeout configuration property.
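A time limit of this kind can be sketched with the standard `java.util.concurrent` API. The helper below is hypothetical: `ParseTimeoutSketch` and `runWithTimeout` are names invented for illustration, the timeout is passed in directly rather than read from `parser.timeout`, and returning an empty string on timeout is an assumption, not documented behavior of this class.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; the actual parser.timeout handling may differ.
final class ParseTimeoutSketch {

    // Runs the extraction task on a worker thread and waits at most
    // timeoutSeconds for its result. On timeout the task is cancelled and
    // an empty string is returned (an assumption made for this sketch).
    static String runWithTimeout(Callable<String> task, long timeoutSeconds)
            throws ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> future = executor.submit(task);
        try {
            return future.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker thread
            return "";
        } finally {
            executor.shutdownNow(); // release the worker thread in every case
        }
    }
}
```

Because each call creates its own executor and shares no mutable state, the helper can be invoked from several threads at once, which fits the thread-safety requirement stated above. Cancellation relies on interruption, so a task that ignores interrupts would keep running in the background even after the caller has received the empty result.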