Class ParserFactory


  • public class ParserFactory
    extends Object
    A factory that returns a parser instance for a given file.
    Author:
    Michael Scholz
    • Constructor Detail

      • ParserFactory

        public ParserFactory()
    • Method Detail

      • init

        public static void init()
        Registers all parsers from extension points.
      • parse

        public static String parse(InputStream input,
                                   String filename,
                                   String encoding,
                                   Locale locale,
                                   long tenantId,
                                   Document document,
                                   String fileVersion)
                            throws ParseException
        Gets the proper parser and parses the given content.
        Parameters:
        input - the input contents as stream
        filename - name of the file
        encoding - encoding of the stream
        locale - the locale
        tenantId - identifier of the tenant
        document - the document the file belongs to (optional)
        fileVersion - the file version being processed (optional)
        Returns:
        the text extracted from the input
        Throws:
        ParseException - if an error occurs while parsing
      • getParser

        public static Parser getParser(String filename)
        Looks up the parser for the given file name.
        Parameters:
        filename - name of the file
        Returns:
        the right parser for the given file name
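
The page does not describe how the lookup works. The following is a minimal sketch of one common approach, assuming the parser is chosen by the file name's extension. The class LookupSketch, the parser names, and the fallback behaviour are all hypothetical and are not taken from ParserFactory.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for a filename-to-parser lookup; not the ParserFactory source. */
final class LookupSketch {
    // Maps a lower-case extension to the name of a made-up parser.
    private static final Map<String, String> PARSERS = new HashMap<>();
    static {
        PARSERS.put("pdf", "PdfParserSketch");
        PARSERS.put("odt", "OpenOfficeParserSketch");
    }

    /** Returns the text after the last dot, lower-cased, or "" if there is no dot. */
    public static String extensionOf(String filename) {
        int dot = filename.lastIndexOf('.');
        return dot < 0 ? "" : filename.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Returns the parser name for the file, or a fallback (an assumption) if none matches. */
    public static String parserFor(String filename) {
        return PARSERS.getOrDefault(extensionOf(filename), "FallbackParserSketch");
    }

    public static void main(String[] args) {
        System.out.println(parserFor("report.PDF")); // PdfParserSketch
        System.out.println(parserFor("notes"));      // FallbackParserSketch
    }
}
```

Lower-casing the extension makes the lookup case-insensitive. Whether the real method does this is not stated on this page.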
      • getExtensions

        public static Set<String> getExtensions()
      • setAliases

        public static void setAliases(String ext,
                                      String[] aliases)
        Adds new aliases for the specified extension.

        The aliases are saved in the property parser.alias.<ext>.
        Example: parser.alias.odt = test, acme
        In this case, the extensions 'test' and 'acme' are treated as 'odt'.

        Parameters:
        ext - must be one of the registered extensions
        aliases - array of extension aliases (e.g. test, acme)
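
The alias scheme described above can be sketched with a plain java.util.Properties object. This is only an illustration of the parser.alias.<ext> convention: the class AliasSketch and its resolve method are hypothetical, and the real method may store or read the property differently.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical helper illustrating the parser.alias.<ext> convention; not part of ParserFactory. */
final class AliasSketch {
    private static final String PREFIX = "parser.alias.";

    /** Stores the aliases as one comma-separated property, e.g. parser.alias.odt = test, acme */
    public static void setAliases(Properties config, String ext, String[] aliases) {
        config.setProperty(PREFIX + ext, String.join(", ", aliases));
    }

    /** Returns the extension an alias maps to, or the lower-cased input if it is not an alias. */
    public static String resolve(Properties config, String extension) {
        String wanted = extension.toLowerCase(Locale.ROOT);
        for (String key : config.stringPropertyNames()) {
            if (!key.startsWith(PREFIX)) {
                continue; // not an alias property
            }
            for (String alias : config.getProperty(key).split(",")) {
                if (alias.trim().equalsIgnoreCase(wanted)) {
                    return key.substring(PREFIX.length());
                }
            }
        }
        return wanted;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        setAliases(config, "odt", new String[] {"test", "acme"});
        System.out.println(config.getProperty("parser.alias.odt")); // test, acme
        System.out.println(resolve(config, "test"));                // odt
        System.out.println(resolve(config, "pdf"));                 // pdf
    }
}
```

With the aliases registered this way, a file with the extension 'test' or 'acme' would be handled like an 'odt' file.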