
Zonal OCR

Zonal OCR lets you define zones on a scan. Each zone is processed to extract a specific field, and the extracted value is stored as an extended attribute of the document.

Because each zone refers to a specific attribute of a template, you have to choose a document template first.

Once you have selected the document template, you can then select one of the available OCR templates or create a new one by clicking on New.

A single document template can have several OCR templates because the position of each zone depends heavily on the layout of the scan. For instance, if you want to process invoices, the attributes (number, date, etc.) are the same, but each supplier has its own form and the data is printed in different locations.
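The relationship described above can be pictured as a small data model. This is only an illustrative sketch: the class names, the field names, and the use of fractional coordinates are assumptions made for the example, not the product's internal schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Zone:
    attribute: str   # name of the document-template attribute this zone fills
    left: float      # rectangle on the sample scan, as 0..1 fractions (assumed)
    top: float
    width: float
    height: float


@dataclass
class OcrTemplate:
    name: str        # used as the unique identifier
    zones: List[Zone] = field(default_factory=list)


@dataclass
class DocumentTemplate:
    name: str
    attributes: List[str]
    ocr_templates: Dict[str, OcrTemplate] = field(default_factory=dict)

    def add_ocr_template(self, ocr: OcrTemplate) -> None:
        # Every zone must target an attribute defined by the document template.
        unknown = [z.attribute for z in ocr.zones if z.attribute not in self.attributes]
        if unknown:
            raise ValueError(f"unknown attributes: {unknown}")
        if ocr.name in self.ocr_templates:
            raise ValueError(f"duplicate OCR template name: {ocr.name}")
        self.ocr_templates[ocr.name] = ocr


# Same attributes, different zone positions for each supplier's form.
invoice = DocumentTemplate("Invoice", ["number", "date"])
invoice.add_ocr_template(OcrTemplate("Supplier A", [
    Zone("number", 0.70, 0.05, 0.25, 0.04),
    Zone("date", 0.70, 0.10, 0.25, 0.04),
]))
invoice.add_ocr_template(OcrTemplate("Supplier B", [
    Zone("number", 0.10, 0.20, 0.30, 0.05),
    Zone("date", 0.10, 0.26, 0.30, 0.05),
]))
```

Both OCR templates fill the same two attributes; only the rectangles differ, which is why one document template can own many OCR templates.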

Create a new OCR Template

Click on the New button. In the popup window, enter the name of the new OCR template (it will be used as its unique identifier) and upload a sample scan that will be used to define the zones visually (a .jpg or .pdf file is suggested).

The sample is now displayed in the visual editor.

Zonal OCR Visual Editor

Create a new Zone

On the sample scan you draw the zones from which the fields will be extracted. To create a zone, click on Add zone, then select the field that will be assigned to the new zone.

A semi-transparent yellow rectangle appears in the upper left corner of the scan. This rectangle represents the zone's boundaries; resize it and position it over the field you want to process.
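To illustrate why the rectangle's position matters, the sketch below selects the recognized words that fall inside a zone. It assumes the OCR output is available as words with pixel bounding boxes; the `Rect` type, the `text_in_zone` helper, and the sample data are hypothetical and are not how the product necessarily works internally (it may simply crop the image before recognition).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    width: int
    height: int

    def contains(self, x: float, y: float) -> bool:
        return (self.left <= x <= self.left + self.width
                and self.top <= y <= self.top + self.height)


def text_in_zone(words: List[Tuple[str, Rect]], zone: Rect) -> str:
    """Join the words whose box centre lies inside the zone.

    Words are ordered by top edge, then left edge, which is a simplistic
    approximation of reading order.
    """
    hits = []
    for text, box in words:
        cx = box.left + box.width / 2
        cy = box.top + box.height / 2
        if zone.contains(cx, cy):
            hits.append((box.top, box.left, text))
    return " ".join(text for _, _, text in sorted(hits))


# Hypothetical OCR output for the top of an invoice scan (pixel coordinates).
words = [
    ("Invoice", Rect(40, 30, 120, 24)),
    ("No.", Rect(620, 30, 40, 24)),
    ("2024-0017", Rect(670, 30, 130, 24)),
    ("Date", Rect(620, 70, 50, 24)),
]
number_zone = Rect(600, 20, 220, 40)
print(text_in_zone(words, number_zone))  # No. 2024-0017
```

A zone drawn slightly too tall would also capture "Date" from the line below, so it pays to keep the rectangle tight around the field.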

Zonal OCR Zone

Once you have positioned the zone correctly, double-click on it to open its details.

Zonal OCR Zone Details

Here, the available settings depend on the field's type. For a numeric field, you can specify a format and the symbols used as decimal and grouping separators.

For numbers and dates, the format can be composed of these characters:

Pattern characters for numbers
0    Digit
#    Digit, zero shows as absent
.    Decimal separator or monetary decimal separator
,    Grouping separator
-    Minus sign

Pattern characters for dates
y    Year
M    Month in year
d    Day in month
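The pattern characters resemble those used by Java's DecimalFormat and SimpleDateFormat. The sketch below is not the product's parser; it only illustrates, under that assumption, how the separators and a numeric date pattern turn the extracted text into a typed value. The function names are invented for the example, and only numeric year, month, and day patterns are handled.

```python
from datetime import date, datetime
from decimal import Decimal


def parse_number(sample: str, decimal_sep: str = ".", grouping_sep: str = ",") -> Decimal:
    """Convert OCR text such as '1.234,56' into a Decimal."""
    # Drop grouping separators first, then normalize the decimal separator.
    cleaned = sample.strip().replace(grouping_sep, "").replace(decimal_sep, ".")
    return Decimal(cleaned)


# Longest tokens first so that 'yyyy' is not read as two 'yy'.
_DATE_TOKENS = (("yyyy", "%Y"), ("yy", "%y"), ("MM", "%m"), ("M", "%m"),
                ("dd", "%d"), ("d", "%d"))


def to_strptime(pattern: str) -> str:
    """Translate a y/M/d pattern into a Python strptime format string."""
    out, i = [], 0
    while i < len(pattern):
        for token, directive in _DATE_TOKENS:
            if pattern.startswith(token, i):
                out.append(directive)
                i += len(token)
                break
        else:
            # Any other character is a literal; escape '%' for strptime.
            out.append("%%" if pattern[i] == "%" else pattern[i])
            i += 1
    return "".join(out)


def parse_date(sample: str, pattern: str) -> date:
    return datetime.strptime(sample.strip(), to_strptime(pattern)).date()


print(parse_number("1.234,56", decimal_sep=",", grouping_sep="."))  # 1234.56
print(parse_date("31/12/2024", "dd/MM/yyyy"))                       # 2024-12-31
```

Choosing the wrong separators silently changes the value: with the default settings, "1.234" is read as roughly one and a quarter rather than one thousand two hundred thirty-four.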
 

In any case, for each type of field you can define an automation script that will be executed when the zone is processed on a specific document.

Automation variables for Zonal OCR
document    Document    the document being processed
sample                  the text extracted from the zone
value                   the value object (String, Date, Decimal ...) converted from the sample
zone        Zone        the zone being processed; use zone.value if you want to change the value that will be saved in the document
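The kind of post-processing such a script might perform can be sketched as follows. This is a Python stand-in, not the product's scripting syntax: the `Document` and `Zone` classes, the `run_script` function, and the invoice-number format are all hypothetical, and only the four variable names come from the table above.

```python
import re
from dataclasses import dataclass
from typing import Any


@dataclass
class Document:          # hypothetical stand-in for the real document object
    file_name: str


@dataclass
class Zone:              # hypothetical stand-in for the real zone object
    name: str
    value: Any = None


def run_script(document: Document, sample: str, value: Any, zone: Zone) -> None:
    """Keep only the invoice number, dropping any label the OCR picked up."""
    # Assumed number format for this example: four digits, a dash, more digits.
    match = re.search(r"\d{4}-\d+", sample)
    # Assigning zone.value changes what will be saved in the document;
    # fall back to the converted value when the pattern is not found.
    zone.value = match.group(0) if match else value


zone = Zone("number", value="No. 2024-0017")
run_script(Document("invoice.pdf"), "No. 2024-0017", "No. 2024-0017", zone)
print(zone.value)  # 2024-0017
```

The point of the sketch is the data flow: the script reads `sample` or `value` and writes the cleaned result back through `zone.value`.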
 

Assign an OCR Template to the documents

To let the Zonal OCR process your documents, you have to assign them an OCR Template. You can do this in the OCR tab of the document's detail panel.

Once you have assigned the template, you can execute the Zonal OCR immediately by pressing the Process button; otherwise, the Zonal OCR Processor scheduled task will take care of it later.

Processing queue

This panel lists all the documents that have not yet been processed by the Zonal OCR. You can mark a document as unprocessable by right-clicking on the item and selecting the Mark as unprocessable option.

Zonal OCR Processing Queue