Package com.logicaldoc.ocr
Class OCR
java.lang.Object
com.logicaldoc.ocr.OCR
- Direct Known Subclasses:
OCRWebService, PowerPDF, Tesseract
This OCR engine is capable of accurately recognizing characters (letters and numbers).
- Author:
- Alessandro Gasparini
-
Method Summary
Modifier and Type | Method | Description
void | extractPDFText(File pdffile, String lang, String tenant, StringBuilder buffer, OCRHistory transaction) | Extracts the text from PDF file
void | extractText(File imgfile, String lang, String tenant, StringBuilder sb, OCRHistory transaction) |
 | getParameter(String name) |
int | getResolutionThreshold(String tenant) |
boolean | isAvailable() |
static boolean | isWindows() |
void | loadParameters() |
-
Method Details
-
loadParameters
public void loadParameters()
-
getParameters
-
getParameter
-
getParameterNames
-
isAvailable
public boolean isAvailable()
-
extractPDFText
public void extractPDFText(File pdffile, String lang, String tenant, StringBuilder buffer, OCRHistory transaction) throws IOException
Extracts the text from a PDF file
- Parameters:
pdffile - the file to OCR
lang - the language in which the document is written
tenant - name of the tenant
buffer - the buffer to store the extracted text
transaction - information about the indexing transaction
- Throws:
IOException - In case of OCR error
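The call pattern documented above (the caller supplies the output buffer and handles IOException) could look roughly like the sketch below. To stay self-contained it does not use the real LogicalDOC classes: OCRHistory and PdfTextExtractor are hypothetical stand-ins that only mirror the documented signature, and the language code "en" and tenant name "default" are assumed values. Returning an empty string on failure is a choice made for this example, not documented behavior.

```java
import java.io.File;
import java.io.IOException;

public class ExtractPdfTextSketch {

    /** Hypothetical stand-in for com.logicaldoc.ocr.OCRHistory; not the real class. */
    static class OCRHistory {
    }

    /** Hypothetical stand-in that exposes only the documented extractPDFText signature. */
    interface PdfTextExtractor {
        void extractPDFText(File pdffile, String lang, String tenant, StringBuilder buffer,
                OCRHistory transaction) throws IOException;
    }

    /**
     * Runs the extraction into a fresh buffer and returns the collected text.
     * Returns an empty string if the engine reports an OCR error.
     */
    static String extractOrEmpty(PdfTextExtractor engine, File pdf, String lang, String tenant) {
        StringBuilder buffer = new StringBuilder();
        try {
            // The engine appends the recognized text to the buffer we pass in
            engine.extractPDFText(pdf, lang, tenant, buffer, new OCRHistory());
        } catch (IOException e) {
            // Documented as thrown in case of OCR error
            return "";
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Fake engine used only for demonstration; a real engine would be an OCR subclass
        PdfTextExtractor fake = (pdffile, lang, tenant, buffer, transaction) ->
                buffer.append("text of ").append(pdffile.getName());
        System.out.println(extractOrEmpty(fake, new File("invoice.pdf"), "en", "default"));
    }
}
```

Passing the StringBuilder in, rather than returning a String, lets the caller reuse one buffer across several files if that is convenient.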
-
extractText
public void extractText(File imgfile, String lang, String tenant, StringBuilder sb, OCRHistory transaction) throws IOException
- Throws:
IOException
-
getResolutionThreshold
-
isWindows
public static boolean isWindows()
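This page does not say how isWindows() reaches its answer. A common way to write such a check in Java is to inspect the standard os.name system property, as in the sketch below. This is an assumption for illustration, not LogicalDOC's implementation; the class OsCheckSketch and the helper isWindowsName are hypothetical names.

```java
import java.util.Locale;

public class OsCheckSketch {

    /** Returns true if the given os.name value looks like a Windows system. */
    static boolean isWindowsName(String osName) {
        // startsWith avoids false positives such as "darwin", which contains "win"
        return osName != null && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** Checks the operating system of the running JVM. */
    static boolean isWindows() {
        return isWindowsName(System.getProperty("os.name"));
    }

    public static void main(String[] args) {
        System.out.println("Running on Windows: " + isWindows());
    }
}
```

Separating the string test from the property lookup keeps the logic checkable on any platform.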
-