Package com.logicaldoc.core.parser
Machinery for parsing different file formats. Implementations of the Parser interface are designed to read the content of a specific file type. The parsers are used by the full-text engine to extract the contents for indexing your documents.
Since: 1.0
Interface Summary
Interface         Description
Parser            A Parser is capable of parsing content in order to extract the text within it.
Class Summary
Class             Description
AbiWordParser     Text extractor for AbiWord documents.
AbstractParser    Abstract implementation of a Parser.
DOCParser         Parses an MS Word (*.doc, *.dot) file to extract the text contained in the file.
DummyParser       Parser that doesn't parse anything.
HTMLParser        Text extractor for HyperText Markup Language (HTML).
HTMLSAXParser     Helper class for HTML parsing.
KOfficeParser     Text extractor for KOffice 1.6 documents.
OpenOfficeParser  Text extractor for OpenOffice/OpenDocument documents.
ParserFactory     A factory that returns a parser instance for the given file.
ParseTask         Collects all the information needed to execute the parsing of a file.
PDFParser         Text extractor for Portable Document Format (PDF).
PPTParser         Parser for Office 2003 presentations.
PSParser
RTFParser
TXTParser         Class for parsing text (*.txt) files.
XLSParser         Parser for Office 2003 worksheets.
XMLParser         Text extractor for XML documents.
ZABWParser        Text extractor for AbiWord compressed documents.
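
The relationship between a per-format parser, a do-nothing fallback, and a factory that picks a parser for a given file can be illustrated with a minimal, self-contained sketch. Every name below (ParserSketch, TextExtractor, PlainTextExtractor, NoOpExtractor, forFileName) is hypothetical and chosen for illustration only; none of it is the actual com.logicaldoc.core.parser API, whose method signatures are not shown on this page. Selecting by file extension is also an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: these names are NOT the real com.logicaldoc.core.parser API.
final class ParserSketch {

    /** Extracts plain text from the raw bytes of one file type. */
    interface TextExtractor {
        String extract(byte[] content);
    }

    /** Treats the bytes as UTF-8 text, similar in spirit to a *.txt parser. */
    static final class PlainTextExtractor implements TextExtractor {
        @Override
        public String extract(byte[] content) {
            return new String(content, StandardCharsets.UTF_8);
        }
    }

    /** Fallback that extracts nothing, used for unsupported file types. */
    static final class NoOpExtractor implements TextExtractor {
        @Override
        public String extract(byte[] content) {
            return "";
        }
    }

    // Registry mapping a lower-case file extension to its extractor (assumed lookup key).
    private static final Map<String, TextExtractor> BY_EXTENSION = new HashMap<>();
    private static final TextExtractor FALLBACK = new NoOpExtractor();

    static {
        BY_EXTENSION.put("txt", new PlainTextExtractor());
    }

    /** Returns the extractor registered for the file's extension, or the no-op fallback. */
    static TextExtractor forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, FALLBACK);
    }

    public static void main(String[] args) {
        byte[] bytes = "hello index".getBytes(StandardCharsets.UTF_8);
        // A known extension yields the extracted text for the indexer.
        System.out.println(forFileName("notes.TXT").extract(bytes));
        // An unknown extension falls back to the extractor that returns nothing.
        System.out.println(forFileName("archive.bin").extract(bytes).isEmpty());
    }
}
```

In this sketch the caller (for example, an indexing component) never needs to know which concrete extractor it received; returning a no-op fallback instead of null keeps the calling code free of special cases for unsupported formats.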