Package com.logicaldoc.core.parser
Machinery for parsing different file formats. Implementations of the Parser interface are designed to read the content of a specific file type. The parsers are used by the full-text engine to extract the contents for indexing your documents.
- Since: 1.0
Interface Summary
- Parser: A Parser is capable of parsing content in order to extract the text within it.
Class Summary
- AbiWordParser: Text extractor for AbiWord documents.
- AbstractParser: Abstract implementation of a Parser.
- DOCParser: Parses an MS Word (*.doc, *.dot) file to extract the text contained in the file.
- DummyParser: Parser that doesn't parse anything.
- EpubParser: A specialized parser to extract text from the .epub (e-book) format.
- HTMLParser: Text extractor for HyperText Markup Language (HTML).
- HTMLSAXParser: Helper class for HTML parsing.
- KOfficeParser: Text extractor for KOffice 1.6 documents.
- OpenOfficeParser: Text extractor for OpenOffice/OpenDocument documents.
- ParserFactory: This is a factory, returning a parser instance for the given file.
- ParseTask: Collects all the information needed to execute the parsing of a file.
- PDFParser: Text extractor for Portable Document Format (PDF).
- PPTParser: Parser for Office 2003 presentations.
- PSParser
- RTFParser
- TXTParser: Class for parsing text (*.txt) files.
- XLSParser: Parser for Office 2003 worksheets.
- XMLParser: Text extractor for XML documents.
- ZABWParser: Text extractor for AbiWord compressed documents.
- ZipParser: Class for parsing text (*.txt) files.
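
The summaries above describe one parser per file type plus a factory that returns a parser for a given file. The sketch below illustrates that general pattern only. Every name in it (TextExtractor, PlainTextExtractor, MarkupExtractor, NoOpExtractor, ExtractorFactory) is hypothetical and is not part of com.logicaldoc.core.parser; the real Parser and ParserFactory signatures are not shown on this page, so selecting by file extension is an assumption.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a parser contract; NOT the real Parser interface.
interface TextExtractor {
    String extract(String rawContent);
}

// Returns the content unchanged, as a plain-text parser might.
class PlainTextExtractor implements TextExtractor {
    public String extract(String rawContent) {
        return rawContent;
    }
}

// Crude illustration: replaces anything that looks like a tag with a space,
// then collapses runs of whitespace.
class MarkupExtractor implements TextExtractor {
    public String extract(String rawContent) {
        return rawContent.replaceAll("<[^>]*>", " ")
                         .replaceAll("\\s+", " ")
                         .trim();
    }
}

// Extracts nothing; used as the fallback for unknown file types.
class NoOpExtractor implements TextExtractor {
    public String extract(String rawContent) {
        return "";
    }
}

// Hypothetical factory: picks an extractor from the file extension (assumed rule).
class ExtractorFactory {
    private static final Map<String, TextExtractor> BY_EXTENSION = new HashMap<>();

    static {
        BY_EXTENSION.put("txt", new PlainTextExtractor());
        BY_EXTENSION.put("html", new MarkupExtractor());
        BY_EXTENSION.put("xml", new MarkupExtractor());
    }

    static TextExtractor forFile(String filename) {
        int dot = filename.lastIndexOf('.');
        String ext = dot < 0 ? "" : filename.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, new NoOpExtractor());
    }
}

class ExtractorDemo {
    public static void main(String[] args) {
        TextExtractor extractor = ExtractorFactory.forFile("page.html");
        System.out.println(extractor.extract("<p>Hi <b>there</b></p>"));
    }
}
```

Falling back to a no-op extractor instead of throwing keeps an indexing loop running when it meets an unsupported file type; whether the real factory behaves this way is not stated here.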