Package com.logicaldoc.core.parser
package com.logicaldoc.core.parser
Machinery for parsing different file formats. Implementations of the Parser are designed to read the content of a specific file type. The parsers are used by the full-text engine to extract the contents for indexing your documents and calculating the number of pages.
Since: 1.0
Class Description
- Text extractor for AbiWord documents.
- Abstract implementation of a Parser
- Parser that tries to convert the document into PDF and then tries to parse it
- Parses a MS Word (*.doc, *.dot) file to extract the text contained in the file.
- Parser that doesn't parse anything
- A specialized parser to extract text from .epub (e-books) format
- Text extractor for HyperText Markup Language (HTML).
- Text extractor for KOffice 1.6 documents.
- Text extractor for the Markdown language.
- Text extractor for OpenOffice/OpenDocument documents.
- Some parameters to parse documents
- A Parser is capable of parsing a content in order to extract the texts and other metadata within it.
- This is a factory, returning a parser instance for the given file.
- When an error happens during the parsing
- A parsing error due to timeout
- Text extractor for Portable Document Format (PDF).
- Parser for Office 2003 presentations
- Class for parsing rar files.
- A parser for the Rich Text Format
- Class for parsing 7z files.
- Class for parsing tar files.
- Class for parsing text (*.txt) files.
- Parser for Office 2003 worksheets
- Text extractor for XML documents.
- Text extractor for AbiWord compressed documents.
- Class for parsing zip files.
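
The descriptions above suggest a common pattern: one parser per file type, a factory that picks a parser for a given file, and a fallback parser that extracts nothing. The following is a minimal, self-contained sketch of that pattern only. It does not use the LogicalDOC API: `ParserSketch`, `TextExtractor`, `PLAIN_TEXT`, `NO_OP`, and `extractorFor` are hypothetical names invented for illustration, and selecting by file extension is an assumption about how such a factory might work.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Map;

public class ParserSketch {

    /** Hypothetical stand-in for a parser: turns a stream into plain text. */
    interface TextExtractor {
        String extract(InputStream input) throws IOException;
    }

    /** Reads the whole stream as UTF-8, roughly what a plain-text parser might do. */
    static final TextExtractor PLAIN_TEXT =
            input -> new String(input.readAllBytes(), StandardCharsets.UTF_8);

    /** Fallback for unrecognized types: extracts nothing. */
    static final TextExtractor NO_OP = input -> "";

    /** Assumed registry keyed by lower-case file extension. */
    private static final Map<String, TextExtractor> BY_EXTENSION =
            Map.of("txt", PLAIN_TEXT);

    /** Factory method: picks an extractor from the file name, else the no-op fallback. */
    static TextExtractor extractorFor(String filename) {
        int dot = filename.lastIndexOf('.');
        String ext = dot < 0 ? "" : filename.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, NO_OP);
    }

    public static void main(String[] args) throws IOException {
        byte[] content = "hello index".getBytes(StandardCharsets.UTF_8);
        // Known extension (case-insensitive): the text is extracted.
        System.out.println(
                extractorFor("notes.TXT").extract(new ByteArrayInputStream(content)));
        // Unknown extension: the fallback returns an empty string.
        System.out.println(
                extractorFor("archive.bin").extract(new ByteArrayInputStream(content)).isEmpty());
    }
}
```

Keeping the lookup in a single factory method means callers such as an indexer never need to know which concrete parser handles which format; supporting a new format in this sketch is one more map entry.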