Package com.logicaldoc.core.parser
Class HTMLParser
java.lang.Object
  com.logicaldoc.core.parser.AbstractParser
    com.logicaldoc.core.parser.HTMLParser

All Implemented Interfaces:
Parser
public class HTMLParser extends AbstractParser
Text extractor for HyperText Markup Language (HTML).
Since: 3.5
Author: Michael Scholz, Alessandro Gasparini - LogicalDOC

Constructor Summary
Constructor     Description
HTMLParser()

Method Summary
Modifier and Type, Method, Description:
void internalParse(InputStream input, String filename, String encoding, Locale locale, String tenant, Document document, String fileVersion, StringBuffer content)
String parse(File file, String filename, String encoding, Locale locale, String tenant, Document document, String fileVersion)
- Same as Parser.parse(InputStream, String, String, Locale, String, Document, String), but use this when you have a file rather than a stream.

Methods inherited from class com.logicaldoc.core.parser.AbstractParser
countPages, countPages, parse, parse, parse
Method Detail
parse
public String parse(File file, String filename, String encoding, Locale locale, String tenant, Document document, String fileVersion)
Description copied from interface: Parser
Same as Parser.parse(InputStream, String, String, Locale, String, Document, String), but use this when you have a file rather than a stream.
Specified by:
parse in interface Parser
Overrides:
parse in class AbstractParser
Parameters:
file - the file
filename - name of the file
encoding - character encoding
locale - the locale
tenant - name of the tenant
document - the document the file belongs to (optional)
fileVersion - the file version being processed (optional)
Returns:
the extracted text
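
The relationship between the file-based and stream-based parse variants can be illustrated with a minimal, self-contained sketch. It does not use or reproduce the LogicalDOC classes. OverloadSketch, its reduced two-argument signatures, the demo() helper, and the regex-based tag stripping are all assumptions made for illustration, and the actual HTMLParser may open the file and extract text quite differently.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical stand-in, NOT the LogicalDOC HTMLParser: it only mirrors the
// file-versus-stream overload pair, with a reduced parameter list.
class OverloadSketch {

    // Stream variant: decodes the bytes and strips tags with a crude regex.
    // Real HTML text extraction is more involved; this is a placeholder.
    static String parse(InputStream input, String encoding) {
        try {
            String html = new String(input.readAllBytes(), Charset.forName(encoding));
            return html.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // File variant: opens the file and delegates to the stream variant,
    // so both entry points yield the same text for the same content.
    static String parse(File file, String encoding) {
        try (InputStream in = Files.newInputStream(file.toPath())) {
            return parse(in, encoding);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a small HTML document to a temporary file and parses it.
    static String demo() {
        try {
            File tmp = File.createTempFile("sketch", ".html");
            try {
                String html = "<html><body><p>Hello <b>world</b></p></body></html>";
                Files.write(tmp.toPath(), html.getBytes(StandardCharsets.UTF_8));
                return parse(tmp, "UTF-8");
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "Hello world"
    }
}
```

With the real class, a caller holding a java.io.File would presumably pass it to parse together with the filename, encoding, locale, and tenant listed above, leaving the optional document and fileVersion arguments unset when they are not known.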
internalParse
public void internalParse(InputStream input, String filename, String encoding, Locale locale, String tenant, Document document, String fileVersion, StringBuffer content)