Class ZonalOCR

java.lang.Object
com.logicaldoc.zonalocr.ZonalOCR

@Component("zonalOCR") public class ZonalOCR extends Object
The Zonal OCR engine
Since:
8.4.2
Author:
Marco Meschieri - LogicalDOC
  • Constructor Summary

    Constructors
    Constructor
    Description
    ZonalOCR()
  • Method Summary

    Modifier and Type
    Method
    Description
    String
    extractZoneText(File scan, Zone zone, Locale locale, String tenant, boolean updateSample)
    Uses the OCR to extract the text from the given zone.
    void
    processDocument(long docId, com.logicaldoc.core.document.DocumentHistory transaction)
    Processes a document using a given OCR template. The first page is processed, and the extracted zones are used to fill the document's extended attributes.
    Map<String,Object>
    processFile(File scan, OCRTemplate template, com.logicaldoc.core.document.Document document, com.logicaldoc.core.document.DocumentHistory transaction)
    Processes a file using a given template.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • ZonalOCR

      public ZonalOCR()
  • Method Details

    • processDocument

      public void processDocument(long docId, com.logicaldoc.core.document.DocumentHistory transaction) throws com.logicaldoc.core.PersistenceException, IOException
      Processes a document using a given OCR template. The first page is processed, and the extracted zones are used to fill the document's extended attributes.
      Parameters:
      docId - the identifier of the document to process
      transaction - information about the operation
      Throws:
      com.logicaldoc.core.PersistenceException - Error in the persistence layer
      IOException - I/O error
    • processFile

      public Map<String,Object> processFile(File scan, OCRTemplate template, com.logicaldoc.core.document.Document document, com.logicaldoc.core.document.DocumentHistory transaction) throws com.logicaldoc.core.PersistenceException
      Processes a file using a given template
      Parameters:
      scan - the image file to process
      template - the OCR template that describes the zones
      document - the document being processed
      transaction - information about the operation
      Returns:
      a map from each zone name to its extracted value
      Throws:
      com.logicaldoc.core.PersistenceException - Error in the data layer
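      The map returned by processFile can be consumed like any other Map<String,Object>. The following is a minimal sketch, not LogicalDOC code: ZoneValues and describe are hypothetical names, and it assumes that a zone value may be null or of any type, since this page does not say which value types the map holds.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of LogicalDOC): renders a zone name -> zone value map.
class ZoneValues {

    // Builds one "name=value" entry per zone, in the map's iteration order.
    // Null values are rendered as "<empty>" (an assumption made for this sketch).
    static String describe(Map<String, Object> zones) {
        StringJoiner joiner = new StringJoiner("; ");
        for (Map.Entry<String, Object> entry : zones.entrySet()) {
            Object value = entry.getValue();
            joiner.add(entry.getKey() + "=" + (value == null ? "<empty>" : value.toString()));
        }
        return joiner.toString();
    }
}
```

      In real use the argument would presumably be the result of a call such as processFile(scan, template, document, transaction); the iteration order then depends on the map implementation the engine returns.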
    • extractZoneText

      public String extractZoneText(File scan, Zone zone, Locale locale, String tenant, boolean updateSample) throws IOException
      Uses the OCR to extract the text from the given zone
      Parameters:
      scan - the original scan file
      zone - the zone to extract
      locale - a locale to use as a hint for the OCR; if null, the zone's locale is used instead
      tenant - name of the current tenant
      updateSample - whether to update the zone's sample (the sample and sampleText attributes)
      Returns:
      The text extracted by the OCR
      Throws:
      IOException - an error occurred processing the image or executing the OCR
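      Because extractZoneText declares IOException, callers usually have to decide what a failed zone should yield. The sketch below shows one possible approach under stated assumptions: ZoneTextExtractor is a hypothetical stand-in that only mirrors the signature documented above (its type parameter Z stands for Zone), and SafeZoneReader is an illustrative helper, not part of LogicalDOC.

```java
import java.io.File;
import java.io.IOException;
import java.util.Locale;

// Hypothetical stand-in mirroring the documented extractZoneText signature.
// Z stands for the Zone type, which is not available in this sketch.
interface ZoneTextExtractor<Z> {
    String extractZoneText(File scan, Z zone, Locale locale, String tenant, boolean updateSample)
            throws IOException;
}

// Illustrative helper: reads one zone and substitutes a fallback when the OCR fails.
class SafeZoneReader {

    // Returns the trimmed OCR text, or the fallback if the OCR throws or yields no text.
    static <Z> String readOrDefault(ZoneTextExtractor<Z> ocr, File scan, Z zone,
                                    String tenant, String fallback) {
        try {
            // A null locale is passed so that, per the parameter description,
            // the zone's own locale is used; updateSample=false leaves the sample alone.
            String text = ocr.extractZoneText(scan, zone, null, tenant, false);
            return (text == null || text.trim().isEmpty()) ? fallback : text.trim();
        } catch (IOException e) {
            // Swallowing the error is a choice made for this sketch; real code may log or rethrow.
            return fallback;
        }
    }
}
```

      Given the signature documented above, a method reference to extractZoneText on a ZonalOCR instance should be usable as a ZoneTextExtractor<Zone>, although that has not been verified against the library.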