Class OCR

  • Direct Known Subclasses:
    Advanced, OCRWebService, PowerPDF, Tesseract

    public abstract class OCR
    extends Object
    This OCR engine is capable of accurately recognizing characters (letters and numbers).
    Author:
    Alessandro Gasparini
    • Constructor Detail

      • OCR

        public OCR()
    • Method Detail

      • loadParameters

        public void loadParameters()
      • getParameter

        public String getParameter(String name)
      • getParameterNames

        public List<String> getParameterNames()
      • isAvailable

        public boolean isAvailable()
      • extractPDFText

        public void extractPDFText(File pdffile,
                                   String lang,
                                   String tenant,
                                   StringBuilder buffer)
                            throws IOException
        Extracts the text from a PDF file.
        Parameters:
        pdffile - the PDF file to run OCR on
        lang - the language in which the document is written
        tenant - the name of the tenant
        buffer - the buffer that receives the extracted text
        Throws:
        IOException - if an OCR error occurs
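
        The calling convention described above (the caller supplies the buffer, the engine fills it, and failures surface as IOException) can be sketched as follows. This is a minimal illustration, not the library's own code: PdfTextExtractor is a hypothetical stand-in that mirrors only the extractPDFText signature, and the helper extractOrEmpty, the fake engine, and the sample values "en" and "default" are assumptions made for the example.

```java
import java.io.File;
import java.io.IOException;

public class OcrUsageSketch {

    // Hypothetical stand-in that mirrors only the documented extractPDFText
    // signature; it is NOT the real OCR class, which has further members.
    interface PdfTextExtractor {
        void extractPDFText(File pdffile, String lang, String tenant, StringBuilder buffer)
                throws IOException;
    }

    // Illustrative helper (an assumption, not part of the documented API):
    // runs the extraction and returns the collected text, or an empty
    // string if the engine reports an error through IOException.
    static String extractOrEmpty(PdfTextExtractor engine, File pdf, String lang, String tenant) {
        StringBuilder buffer = new StringBuilder(); // the caller owns the buffer
        try {
            engine.extractPDFText(pdf, lang, tenant, buffer);
        } catch (IOException e) {
            // The documented failure mode: treat it as "no text extracted".
            return "";
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Fake engine used only to make the sketch runnable without an OCR backend.
        PdfTextExtractor fake = (file, lang, tenant, buffer) ->
                buffer.append("text of ").append(file.getName());

        String text = extractOrEmpty(fake, new File("invoice.pdf"), "en", "default");
        System.out.println(text);
    }
}
```

        With a real engine, a concrete subclass such as one of those listed under Direct Known Subclasses would take the place of the fake; checking isAvailable() before calling may also be sensible, although this page does not state that it is required.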
      • getResolutionThreshold

        public int getResolutionThreshold(String tenant)
      • isWindows

        public static boolean isWindows()
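
        This page does not document how isWindows() decides its result. A common way to write such a check in Java is to inspect the standard os.name system property; the sketch below assumes that approach, and the class name OsCheckSketch and the one-argument overload are hypothetical and introduced only for illustration.

```java
import java.util.Locale;

public class OsCheckSketch {

    // Assumed approach, not the documented implementation: treat the platform
    // as Windows when the OS name starts with "windows". A prefix match is used
    // rather than a substring match on "win", which would also match "Darwin".
    static boolean isWindows(String osName) {
        return osName != null
                && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    // Convenience overload that reads the JVM's standard os.name property.
    static boolean isWindows() {
        return isWindows(System.getProperty("os.name"));
    }

    public static void main(String[] args) {
        System.out.println("Running on Windows: " + isWindows());
    }
}
```

        Taking the OS name as a parameter keeps the check testable on any platform, while the no-argument overload matches the shape of the static method listed above.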