Line Items
The EzeScan LINE ITEMS module is an optional module that requires EzeScan PRO + KFI + DISCOVERY modules.
It's fully integrated within the EzeScan product
Uses Intelligent Grid Row searching to find information based on different grid column layouts
Popular for Supplier Invoices, i.e. extracting line item information such as Description, Code, Quantity, Price, etc.
Can be used on invoices with variable layout i.e. grid position moves and the number of grid columns and rows is not fixed
Line item row/column data can be output in a .CSV or .XML format
CSV data comprises one optional heading row, followed by multiple rows of line item data.
Data can also be output via EzeScan KFI output.
It enables the capture of line items in any document; for example:
Invoices
Index Sheets
Lab Test Results
This document is based on the capture of invoice data but may be applied to other types as indicated above.
Common terms used in this document
Term | Meaning |
Jobs | The EzeScan module used to scan or import documents into the EzeScan application. |
KFI | Key From Index. The KFI module is a mandatory requirement for line items. It is also popularly known as Profiling. |
Discovery | An EzeScan module. This is necessary for the capture of items within a document |
Grid | The lines which make up the segregated data (i.e. a table) |
Line Items | The rows within the grid |
Header | The top row of the grid. (i.e. the table heading) |
REGEX | Regular Expressions used in EzeScan to refine captured values or assist in capturing values |
Line Items in a "nutshell". What does it do?
The EzeScan Discovery module uses the EzeScan Advanced OCR module to provide EzeScan with the capability to use a single EzeScan Form Template to process documents that contain similar data but have different form layouts.
Discovery is ideally suited to processing Supplier Invoices where the invoices look different but they contain similar fields (e.g. Supplier Name, Invoice Number, Invoice Date and Invoice Amount). It can also be used to process forms where the form data is moving around because of page margin differences due to offset printing, scanner settings, photocopier type or settings, fax machine type or settings.
The Line Items capture process generally comes into play after the main parts of an invoice have been captured by Discovery. EzeScan will identify a grid of text (i.e. a table) which contains values and capture them.
The capture process can be a simple capture of the line items, with the data exported to a CSV or XML file for use by other systems, or a detailed capture where the data is checked for accuracy (totals, tax, etc). This document will cover both ends of the spectrum.
How does it work?
Discovery Module
Define the fields that you want to capture using the existing EzeScan KFI Admin form like in the example below. The fields with the red arrows will use Discovery.
DISCOVERY options are applied to the relevant fields. i.e. various criteria may be applied, such as
Search zone (which part of the page to search: full, top ½, bottom ½, etc.)
Pre-Processing qualifiers
Search Terms
Skip content
Pre-validation
Validate words by (e.g. currency, date, custom etc)
A comprehensive set of sample criteria is available in the appendices.
When the EzeScan KFI job is run, data entry fields configured to use DISCOVERY will attempt to automatically discover the target words that are located on the form.
Words that are successfully located are displayed highlighted with a blue border and the target word value is automatically placed into the data entry field.
If the target word cannot be found, the value can be manually found by zooming the image and keying the value into the data entry field.
Discovery is an optional EzeScan module. Whilst the KFI indexer may be set up without the Discovery module, the end result will not deliver the same efficiencies, as the operator will be required to manually enter each of the values.
Contact EzeScan Support for a sample Line Items and Discovery configuration to begin your own set-up.
Line Items
Another field is added to the KFI which is configured to capture the line items of the document.
This document is primarily designed to cover the "line items" components. Please refer to the EzeScan KFI User Guide for the other components.
To activate the Line items function for the field; select the Grid option on the Format tab. This will add the Grid Settings tab to allow the operator to configure the respective Line Items settings.
Important Note about Line Items
Line items functionality is only possible if the data being captured has a header; or a heading above the grid of text being captured. The easiest way to understand is to look at the two examples below. The one on the left has a header (heading) which provides a clear understanding of what is in each column; whereas the one on the right has no header and therefore there is no way of understanding what the columns are. If a supplier is sending invoices without headers then contact them to have their invoice templates changed to contain headers.
If a grid spans more than one page, each additional page must also have a header.
Are you licensed to run EzeScan Line Items?
First you'll need to check whether you are licensed to run the Line items option.
You will need the following EzeScan modules:
EzeScan PRO
EzeScan KFI module
EzeScan Discovery module (called OCR-Advanced)
Use the EzeScan Admin Licensing menu option to display the following form:
If the Licensing Options say either "EzeScan PRO All (Eval Only)" or contain the words "KFI" and "OCR(ADVANCED)" then you may run the KFI and Discovery (Line items) option. If your current production license is not licensed for "KFI" and "OCR(ADVANCED)" but you would like to evaluate the functionality please contact your reseller or send an email to sales@ezescan.com requesting a 30 day evaluation license with "KFI" and "OCR(ADVANCED)" enabled. The KFI Options must include LINEITEMS.
Setting up a simple line item job
This simple line items job is available from the EzeScan website. It may be used as an example to practice on. Contact EzeScan Support for details.
Job set-up
For detailed steps see Basic/simple Line Items job
Create an EzeScan job which will enable the scanning or importing of invoices.
It has been called Line Items Basic Demo for the purposes of this document.
There is nothing special about the job, just the standard set-up
Create a KFI of the same name
Create an Upload - only if the scanned/imported documents are being uploaded into an EDRMS or similar system.
This sample job does not have an upload configured and is therefore not documented.
KFI set-up
Please refer to the EzeScan KFI User Guide for details on setting up new KFIs.
The components in this section are provided as a guide only. The fields which are created will need to match your organisation's needs and may differ.
Create the required fields
Field | Field Name | Default Values | Mandatory | Capture Method |
1 | Tax Invoice | | No | Discovery |
2 | ABN | | No | Discovery |
3 | Supplier (Search or Lookup Using ABN or Order) | | Yes | ODBC from ABN field |
4 | Order No | | No | Discovery |
5 | Invoice Date | | No | Discovery |
6 | Invoice Number | | Yes | Discovery |
7 | Gross Total | | Yes | Discovery |
8 | Tax Total | | No | Discovery |
9 | Net Total | | No | Discovery |
10 | Line Items | | No | Format tab… Grid |
When setting up the output file (in the KFI Output tab) you may wish to remove the "Line Items" column from the output if you are creating a separate Grid output file.
Creating the Grid Line Items Component
Using the above example; Field # 10 - select the Grid option on the Format tab:
A new Grid Settings tab will appear - Click Grid Settings button to open the Grid Advanced Settings below. Settings shown here are the default settings.
Faster Setup Hints
The hints below explain how to quickly set up a grid field that can easily handle different types of input documents (invoices) from multiple suppliers.
Before you dive in and use the Input Documents tab to manually add each of your supplier input document formats (i.e. supplier invoices), note that there is a much easier way: use the production interface grid processing form. You need to enable this feature by ticking the ‘Create Input Document Templates’ option on the Processing tab.
Next, on the Output Data tab, add a new output called ‘Common Data Output’. In the fields list on the left-hand side, add the common fields that appear on most of the supplier line item grids that you want to be included in the output data (e.g. Quantity, Code, Description, Unit Cost, Total Amount).
Set the fields format types as:
Name | Format |
---|---|
Quantity | Decimal |
Code | Any |
Description | Any |
Unit Cost | Currency |
Total | Currency |
On the View tab select only these options below:
On the Grid Location tab choose these settings below:
If you perform steps 1-4 first before trying to add new input documents (i.e. supplier invoice grids) you will find it much easier to use the Create New Template option on the Grid Processing form, when a brand new supplier’s invoices need to be added to the system.
5. When defining a new input document on the Grid Processing form, make sure the orange grid and column lines shown around and on the grid area line up with the actual grid data rows and columns.
You should move the orange column and row lines to align as close as possible to the actual grid lines. Pick up a line and drag it.
If the orange grid lines are badly misaligned, first look at the name of the input document being used. If it's the wrong input document name, select the right one from the list of input documents, or start to create a new input document for this supplier.
7. To create a new input document choose (create new) and type in the supplier name (e.g. Runners R US Pty Ltd)
Then press the Clear button to remove the previous grid lines (the orange and blue lines disappear)
Then press the Define button
Then draw a green rectangle across the grid on the image. A new grid with orange and blue lines should appear on the image.
Take some time to align the orange and blue lines over the matching lines on the grid image. They don't have to perfectly align, but it looks better if you make an effort to visually align them close together.
Use the zoom in/out button to make the lines easier to see in the image viewer.
Make any corrections needed to the input data column headings (sometimes OCR makes small mistakes). Simply edit the header column names to match the column names written on the supplier invoice.
On the lower ‘Output To’ line, drag the output data fields left or right to place them under the corresponding input document field. For fields that exist in the input document but don’t have a corresponding field in the output document, choose (Ignore) or leave them unmapped.
Once you have chosen (create new), entered the new Input Document name, cleared any old grid, defined the new grid area, aligned the grid row and column lines, checked the spelling of the input data column headings, and mapped the output to input columns correctly, it's time to press the ‘Create’ button on the right-hand side of the form.
You just created the input document for the first supplier added to the system.
You should now be able to process invoices for that supplier with the grid being found and input column data being mapped across into the correct output column data.
After verifying the data press the ‘Submit’ button to accept the mapped grid data as being correct.
The sections below go into much greater detail about all of the individual settings available when using grid processing.
Grid Advanced Settings
The advanced settings provide the ability to create sets of templates which may be useful when processing invoices from the same supplier.
Setting up or changing the Grid Advanced Settings requires knowledge of the processes involved in creating or changing these settings. It is strongly advised that you seek assistance if you are unfamiliar with the Grid Advanced Settings. Please contact your EzeScan representative or EzeScan support for advice.
Output tab
Column Delimiter | The value used to separate the output column values of each row |
Row Delimiter | The value used to separate each row of output values |
Output Headers | Off by default |
Grid to Separate Output File | |
Filename | You must include the file type (extension); i.e. .CSV or .XML |
Separate Output Folder | Blank by default |
Rows On New Line | Off by default |
Remove Output Row Delimiters | Off by default |
Quote Output | Off by default |
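As a rough illustration of how the Output tab options relate to each other, the sketch below joins grid rows into a single delimited string. It is not EzeScan's implementation: the function name `format_grid`, its parameter names and the quoting rule (wrap each value in double quotes and double any embedded quote) are assumptions made for this example, and the sample rows are invented.

```python
def format_grid(rows, headers=None, column_delimiter=",", row_delimiter="\n",
                output_headers=False, quote_output=False):
    """Join grid rows into one delimited string (illustrative only)."""
    def cell(value):
        text = str(value)
        if quote_output:
            # Assumed quoting rule: double embedded quotes, then wrap in quotes.
            return '"' + text.replace('"', '""') + '"'
        return text

    lines = []
    if output_headers and headers:
        # Optional single heading row, written before the line item rows.
        lines.append(column_delimiter.join(cell(h) for h in headers))
    for row in rows:
        lines.append(column_delimiter.join(cell(v) for v in row))
    return row_delimiter.join(lines)


# Hypothetical sample data using the common output fields from this guide.
headers = ["Quantity", "Code", "Description", "Unit Cost", "Total"]
rows = [
    ["2", "RS-01", "Running shoes", "50.00", "100.00"],
    ["1", "SK-07", "Socks, 3 pack", "12.50", "12.50"],
]
print(format_grid(rows, headers, output_headers=True, quote_output=True))
```

Quoting matters here because the second description contains the column delimiter; without it a downstream CSV reader would see an extra column.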
View tab
Data Grid Font Scale (%) | Default is 133. Sets the scale of the text used for the grid results. |
Maximum Column Characters (approx) | Default is 0. Restricts data columns to a width that will display approximately this many characters. Set to 0 for no restriction on column width. |
Wrap Long Column Text | On by default. Wraps the text in columns when not all the text fits into the available space. |
Show Column Totals | On by default. Shows or hides a Total row in the Output grid. To display a calculated total of line amounts, set Validation Format Type = Currency on the field in the Output Data tab. |
Show Column Check Totals | On by default. Shows or hides a Compare to row in the Output grid. Shows blank unless one or more of the Output currency fields has a field placeholder (e.g. <<F6>>) defined in Displayed Checked Value. |
Show Difference Between Totals | On by default. Shows or hides a Difference row in the Output grid. Shows blank unless one or more of the Output currency fields has a field placeholder (e.g. <<F6>>) defined in Displayed Checked Value. Values are highlighted in orange if not 0.00. |
Show Output to Columns | On by default. Shows blank unless one of the Output currency fields has a field placeholder (e.g. <<F6>>) defined in Displayed Check Value. Off hides the Output To column display in the Output grid. Column order is set in the Output Data tab. |
Highlight High Confidence OCR | Off by default. High confidence OCR values are highlighted in green in the Input image display. Useful for highlighting found grid result characters when testing. |
Input Document templates can be configured to define how columns are processed for specific documents that match the template headers. Output Data templates define how data will be output. Input Documents can have their output set to use an Output Data template.
When configuring new templates, work through the tabs in the order shown.
Processing tab
The Processing tab is used to define general settings for how Grids will be processed, including general data formatting.
Processing tab Fields
Min Page Count | Set to 1 by default |
Max Page Count | Set to 0 (zero) by default |
Auto Processing | Off by default |
Perform Auto-Enhancement | On by default |
Header Keywords | When Perform Auto-Enhancement is selected the pre-defined |
Allow Ignoring Validation Errors | On by default |
Create Input Document Templates | Off by default |
Default Input Document Name | Allows a default name to be provided for new input documents during processing |
Default Output Map | Set to (as input) by default |
Default Filter Tag | Blank by default |
Template Filter | Set to (none) by default |
Highlight High Confidence OCR | Off by default |
Clean up Integer Columns | On by default |
Integer Clean up Regex | Uses regex Replace - With pairs; for example… |
Clean up Decimal Columns | On by default |
Decimal Clean up Regex | Uses regex Replace - With pairs; for example… |
Clean up Currency Columns | On by default |
Currency Clean up Regex | Uses regex Replace - With pairs; for example… |
Clean up Date Columns | On by default |
Date Clean up Regex | Uses regex Replace - With pairs. |
Grid Location tab
Used to define the base settings for how a grid is located; these apply to all grid matching.
Grid Location tab Fields
Grid Bottom Whitespace | Default setting is 2.2 |
Search Entire Page | On by default |
Auto Merge Orphan Rows | Off by default |
Output Data tab
The Output Data tab is enabled only if Create Input Document Templates is enabled in the Processing tab. It is used to define output templates that the Input Document templates map into. Generally, only one output template is used for all Input Documents but more can be defined. Each Input Document has an Output Document assigned to it. The default Output Template to be used is set in the Processing tab. Output templates can be used to set up the output fields required from Line Items as well as to validate data from the grid. They can be populated from grid columns found on the page and from alternate sources such as KFI field values, EzeScan lookups and searches, ODBC lookups and searches, and EDRMS plugins.
Output Data tab Fields - Top Section
Name | The name given to the Output Data map |
Description | Add a short description for the Output Data map. For example - Invoices |
Add Button | Will create a new Output Data map |
Delete Button | Will delete the selected Output Data map |
Previous Document button | Will navigate to the previous Output Data map |
Select a Document from the Pulldown List | Provides access to other Output Data maps (if available) |
Next Document button | Will navigate to the next Output Data map |
Output Data tab Fields - Columns Table Section
Name (heading) | |
Add a new Column | Will add a new column name to the table |
Remove a Column | Will remove a column name from the table |
Move a Column Up | Will promote a column name upward in the table |
Move a Column Down | Will demote a column name downward in the table |
Output Data tab Fields - Column Settings Section
Each column name created in the Columns table has its own set of configuration rules.
As a new column is added a blank settings screen will appear. This provides the ability to fine tune the data being exported; for example:
Removing certain text using regex values (e.g. removing the $ symbol from currency)
Defining the format of the data being exported (e.g. dates, currency - has inbuilt regexes)
Description | Each column can be provided with a description |
Read Only | Off by default |
Allow Mapping | On by default |
Alternate Source | Set to (none) by default |
(none)
Static Value The Static Value field appears and can be set using text and/or by using placeholders including calculations
Will allow a default or calculated value to be set for this column
Placeholder variables can be used
KFI fields should be referenced as their field number e.g. Field 1 is <<F1>>
Query Result A SQL query can be made against an ODBC source, the first result is returned as the field value.
Displays the ODBC settings and query options to return a single value from a query.
ODBC DSN - Click on the browse … button to select the appropriate DSN
ODBC User - Add a valid User ID for the selected DSN
ODBC Password - Add the User's password for the selected DSN
Use Lookups - If selected it will ignore the ODBC settings and allow EzeScan Lookups to perform this query
SQL Query - create and paste the SQL query to be used into this field
Will only function if a valid DSN or Lookup table have been created
Only the first result will be returned to the field
The query can contain placeholder values including grid output placeholders
Will require an ODBC DSN to be configured on the PC which is being used to process the job. Please refer to the EzeScan Pro User Guide in regards to setting up an ODBC DSN.
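The "only the first result is returned" behaviour of the Query Result option can be sketched as follows. This is a stand-in, not EzeScan code: Python's built-in `sqlite3` module replaces the ODBC DSN so the example runs anywhere, and the `suppliers` table, its columns and the ABN values are hypothetical.

```python
import sqlite3


def first_result(connection, sql, params=()):
    """Return the first column of the first row, or None when no row matches."""
    row = connection.execute(sql, params).fetchone()
    return row[0] if row else None


# In-memory database standing in for the ODBC data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE suppliers (abn TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO suppliers VALUES (?, ?)",
    [("11111111111", "Runners R US Pty Ltd"),
     ("22222222222", "Example Supplies")],
)

# The value a placeholder such as a KFI field would supply at run time.
abn_from_kfi = "11111111111"
query = "SELECT name FROM suppliers WHERE abn = ? ORDER BY name"
print(first_result(conn, query, (abn_from_kfi,)))
```

Because only one value is used, a query that can return several rows should be ordered or filtered so that the first row is the intended one.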
List lookup A SQL query can be used against an ODBC source, a selectable list of results is returned
Displays ODBC settings and query options to return a selectable list of values
The operator can only select an item from the list returned by the query, they cannot enter any other values
Provides the same configurable fields as the Query Result option except for this additional field:
Default Value - allows a default value to be used for this column, this can be a placeholder variable including those from grid output
Search Form A pop up search page can be used against an ODBC source to return a result
Displays the ODBC setting and query options to display a search form to the operator during processing.
The SQL Query must contain a table name that the search form will be set up for
Provides the same configurable fields as the Query Result option except for these additional fields:
SQL Query (Database Table) - The database table to be used for the search
Search Settings - Unset by default but allows the search to be configured per the KFI ODBC Search form (see KFI user guide)
Default Value - allows a default value to be used for this column
Search Alternate EDRMS
A pop up search page based on the selected EDRMS can be used to return a result
Requires the selected EDRMS EzeScan plugin
Provides the functionality to use a secondary data source for the cell value.
The Alternative EDRMS functionality is for "plugins" which have been developed to be used in the EzeScan plugin architecture.
Selecting this option will activate the following fields
Default Value - allows a default value to be used for this column
Alternate EDRMS - select the respective EDRMS to be used. The Alternate EDRMS plugin information will then display at the bottom of the Output Data configuration; see the specific guides for the EDRMS for more information.
Use Replace Regexes
Off by default
If selected it will allow the use of Regexes to clean the cell value
A valid regex must be added to the field below this one
Replace Regexes
Regexes to use when cleaning the cell value
Will only function if above field is selected
Click on the browse … button to launch the standard EzeScan Regex Window to allow the creation of a Regex to be used
You need to have an understanding of how to create and use Regexes to use this function.
Hide
Off by default
If selected it will hide this column during processing
Validation… Format Type
Validation format to use for this column. The selected Format Type will also determine the Clean-up Regex used on the column.
There is a built-in Format Regex for each Format Type; they are not editable but can be copied and applied to Custom, where they can then be edited
Select from…
Default is Any - no validation is performed.
Currency
Custom – allows entry of custom regex
Date
Decimal
Integer
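To illustrate what a Format Type does to a cell value, the sketch below tests a few strings against the Decimal and Integer patterns listed under Format Regex in this section. It assumes Python's `re` module treats these particular patterns the same way EzeScan's regex engine does; the helper name `is_valid` and the sample values are made up for the example.

```python
import re

# Decimal and Integer patterns as listed under Format Regex in this guide.
DECIMAL = re.compile(r"^[+-]?\d+(,\d\d\d)*(\.\d+)?$|^[+-]?\d+(\.\d\d\d)*(,\d+)?$")
INTEGER = re.compile(r"^-?\d+$")


def is_valid(value, pattern):
    """Return True when the whole cell value matches the format pattern."""
    return pattern.search(value) is not None


for sample in ["1,234.50", "1.234,50", "12a", "-7", "7.0"]:
    print(sample, "decimal:", is_valid(sample, DECIMAL),
          "integer:", is_valid(sample, INTEGER))
```

Note that the Decimal pattern accepts both comma and full stop as the decimal separator, so a misread thousands separator can still pass format validation; total validation is the stronger check.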
Validation… Format Regex
A Regex code will be added depending on the Format Type selected in the field above.
Each option has a predefined Regex which cannot be altered, except for the Custom option.
Currency
^\(?([$£] *)?\d+(,\d\d\d)*\.\d\d\)$|^\(?([$£] *)?\d+(\.\d\d\d)*,\d\d\)$|^[+-]?([$£] *)?\d+(,\d\d\d)*\.\d\d$|^[+-]?([$£] *)?\d+(\.\d\d\d)*,\d\d$
Custom
Develop your own custom Regex validation and paste it into this field
Date
^([12]\d|3[01]|0[1-9])(1[012]|0[1-9])(\d{4}|'?\d{2})$|^(1[012]|0[1-9])([12]\d|3[01]|0[1-9])(\d{4}|'?\d{2})$|^([12]\d|3[01]|0?[1-9]) ?(st|nd|rd|th)?( ?[-/\\\. ] ?)(1[012]|0?[1-9]|Jan[a-z]*|Feb[a-z]*|Mar[a-z]*|Apr[a-z]*|May|June?|July?|Aug[a-z]*|Sep[a-z]*|Oct[a-z]*|Nov[a-z]*|Dec[a-z]*)( ?[-/\\\., ] ?)(\d{4}|'?\d{2})$|^(1[012]|0?[1-9]|Jan[a-z]*|Feb[a-z]*|Mar[a-z]*|Apr[a-z]*|May|June?|July?|Aug[a-z]*|Sep[a-z]*|Oct[a-z]*|Nov[a-z]*|Dec[a-z]*)( ?[-/\\\., ] ?)([12]\d|3[01]|0?[1-9]),? ?(st|nd|rd|th)?( ?[-/\\\. ] ?)(\d{4}|'?\d{2})$
Decimal
^[+-]?\d+(,\d\d\d)*(\.\d+)?$|^[+-]?\d+(\.\d\d\d)*(,\d+)?$
Integer
^-?\d+$
Validation… Total Validation Type
Set to None by default
Will allow the selection of a validation type to be used when checking the Summed Total of this column
There are 3 different options which may be selected:
Static Value
Compare value to a static value set using text and/or by using placeholders including calculations
Lookup Query
Compare value to the result of a lookup
Database Query
Compare value to the result of a SQL query
These 3 different options are documented below
Validation… Total Validation Value
Static Value is selected
This option can be used to validate against a static or calculated value using placeholders.
The summed total of this column must match this value.
Total Validation Value - Click on the browse … button to launch the edit window:
Placeholders can be selected from the dropdown at the top right, these are from the Output Data map and the KFI
The calculation can be displayed in three formats
Simple Calculation
Uses full placeholder name and basic formatting
Example: <<(KFI) Net>>+<<(KFI) Tax>>
Regular option
Uses contained notation and can allow for additional formatting
Example: <<="(KFI) Net"+"(KFI) Tax">>
Raw option
Uses basic placeholder references and contained notation
Example: <<=F6+F5>>
The cell value must match this value
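The Static Value total check described above can be pictured as a small calculation: sum the column, evaluate the expression (here the equivalent of <<(KFI) Net>>+<<(KFI) Tax>>), and compare. The sketch below is an illustration only; the function name `check_total` and the sample amounts are assumptions, and `decimal.Decimal` is used so that currency values compare exactly.

```python
from decimal import Decimal


def check_total(line_amounts, net, tax):
    """Return (summed total, expected value, difference) for a currency column."""
    total = sum((Decimal(a) for a in line_amounts), Decimal("0"))
    expected = Decimal(net) + Decimal(tax)  # mirrors <<(KFI) Net>>+<<(KFI) Tax>>
    return total, expected, total - expected


# Hypothetical invoice: two tax-inclusive line amounts, plus KFI Net and Tax values.
total, expected, difference = check_total(["110.00", "55.00"], "150.00", "15.00")
print(total, expected, difference)
if difference != 0:
    print("Totals do not match; the Difference row would be highlighted")
```

A non-zero difference corresponds to the orange highlight described for Show Difference Between Totals on the View tab.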
Lookup Query is selected
The summed total of this column must match this value.
Total Validation Query - Click on the browse … button to launch the edit window:
Use one of these options to build the validation query:
Regular option
Raw option
The total value validation query must return one or more rows
Database Query is selected
This option can be used to validate the total against a value returned from a database query.
The summed total of this column must match this value.
Also displays the ODBC settings and query options lower down in the list.
ODBC DSN - Click the browse … button to select the appropriate DSN
ODBC User - Add a valid User ID for the selected DSN
ODBC Password - Add the User's password for the selected DSN
Will require an ODBC DSN to be configured on the PC which is being used to process the job. Please refer to the EzeScan Pro User Guide in regards to setting up an ODBC DSN.
Total Validation Query - Click the browse … button to launch the edit window below:
Use one of these options to build the validation query:
Regular option
Raw option
The total value validation query must return one or more rows
Input Documents tab
The Input Documents tab is used to set templates for grids. Templates can be mapped to Output Data maps, and can be configured with various formatting and validation options. They can be modified after initial creation to improve the grid matching and data capture. The settings are enabled if Create Input Document Templates is selected in the Processing tab.
Input Documents tab Fields - Top Section
Add Button | Will create a new Input Document |
Delete Button | Will delete the selected Input Document |
Previous Document button | Will navigate to the previous Input Document |
Select a Document from the Pulldown List | Provides access to other Input Document set-ups (if available) |
Next Document button | Will navigate to the next Input Document |
Name | The unique name given to the Input Document |
Remember Column Locations
On by default
This will remember the user-specified locations of the column lines
This is not recommended for documents with variable column sizes (i.e. untick the box)
Description
Add a short description for the Input Document
e.g. Supplier's name and invoice type
Filter Tag
This is only used if Template Filter by Tag has been selected in the Processing tab
The Input Template is only used for matching a grid if the Filter Tag matches the Default Filter Tag in the Processing tab or is blank.
The Default Filter Tag in the Processing tab can be a placeholder value so the Input Document templates used in matching can change for each image processed
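The Filter Tag rule above amounts to a simple selection: a template takes part in grid matching when its tag is blank or equals the Default Filter Tag. The sketch below shows that rule in isolation. The function name, the tag values and the exact (case-sensitive) comparison are assumptions for illustration, not documented EzeScan behaviour.

```python
def templates_for_matching(templates, default_filter_tag):
    """Return names of templates whose Filter Tag is blank or equals the default tag."""
    return [
        name
        for name, filter_tag in templates
        if filter_tag == "" or filter_tag == default_filter_tag
    ]


# (template name, Filter Tag) pairs; the tag values are hypothetical.
templates = [
    ("Runners R US Pty Ltd", "FOOTWEAR"),
    ("Generic Invoice", ""),
    ("Lab Results", "LAB"),
]
print(templates_for_matching(templates, "FOOTWEAR"))
```

When the Default Filter Tag is a placeholder, the second argument changes per image, so a different subset of templates is considered each time.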
Output Document
Defaulted to (as input)
Sets the Output Data map that the Input Document columns can be mapped to
Populated from the Default Output Map in the Processing tab
Generate button
Will generate an Output Document with matching columns
Will default to the same name which was added in the "Name" field
Grid Bottom Pattern
Uses a regex to match the row after the last required grid row
Example: An invoice grid has a standard text line at the bottom of each grid that starts with the text *** Please pay by and is always included in the grid match.
A regex can be used to match that line and end the grid there.
Grid Bottom Pattern regex: ^\*+ Please
This is a case sensitive field
Ignore Row Pattern
Requires a regex; if the row text matches it, the entire row is excluded from display and output
Example: Some large invoices have department headers in the form *** Department Name *** separating line item sections
Ignore Row Pattern regex:
^\*+ [A-Z]
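The combined effect of Grid Bottom Pattern and Ignore Row Pattern can be sketched as a row filter: stop at the first row matching the bottom pattern, and skip any row matching the ignore pattern. This is an illustration rather than EzeScan's algorithm. The function name and sample rows are hypothetical, the asterisks are escaped so they match literally, and the bottom pattern is checked first because a "*** Please pay by" line would otherwise also match the ignore pattern.

```python
import re


def filter_rows(row_texts, grid_bottom_pattern=None, ignore_row_pattern=None):
    """Keep rows up to the grid bottom, dropping rows that match the ignore pattern."""
    bottom = re.compile(grid_bottom_pattern) if grid_bottom_pattern else None
    ignore = re.compile(ignore_row_pattern) if ignore_row_pattern else None
    kept = []
    for text in row_texts:
        if bottom and bottom.search(text):
            break      # the grid ends before this row
        if ignore and ignore.search(text):
            continue   # e.g. a department header row
        kept.append(text)
    return kept


rows = [
    "*** Footwear ***",
    "2 RS-01 Running shoes 50.00 100.00",
    "*** Apparel ***",
    "1 SK-07 Socks 12.50 12.50",
    "*** Please pay by 30 June",
    "Bank details: see reverse",
]
for row in filter_rows(rows, r"^\*+ Please", r"^\*+ [A-Z]"):
    print(row)
```

Both patterns are case sensitive here, matching the note on the Grid Bottom Pattern field.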
Manual Entry Data
*Off by default*
This will allow data to be manually entered rather than populated from the grid.
The grid still has to be matched using the header row but the data will be empty, and the additional setting can specify the number of blank rows to allow rather than showing the number found in the image
If selected it will activate the column settings section
This can be used to specify that the data portion of a grid will be manually entered
Good for poor quality scans or handwritten data
Max Rows
Set to 0 (zero) by default
Is not available unless the "Manual Data Entry" option has been selected.
Leaving the field set to zero will disregard the total number of rows being processed
Remember Row Locations
*On by default*
Is not available unless the "Manual Data Entry" option has been selected.
Will remember the user-specified locations of the row lines
Leaving the field set to zero will disregard the total number of rows being processed
This is not recommended for documents with variable row sizes (i.e. untick the box)
Input Documents tab - Columns Table Section
Name (heading)
This is the value of the column header and is also used in determining a grid match.
It is important that this value is the expected column heading value in the image
If the initial creation of the grid had OCR errors, they should be corrected in the Input Document template
A new value will be added whenever the "Add New Column" button is selected.
Output Column (heading)
This is the mapping to the Output Data map column.
If the column is not required for output set it to (ignore).
The next unmapped Output Data map column name will be added whenever the "Add New Column" button is selected.
Add a new Column
Will add a new column name to the table
Create a name for the column which matches the output requirements
Remove a Column
Will remove a column name from the table
Move a Column Up
Will promote a column name upward in the table
Move a Column Down
Will demote a column name downward in the table
Input Documents tab Fields - Column Settings Section
Each column name created in the Columns table has its own set of configuration rules.
As a new column is added a blank settings screen will appear. This provides the ability to fine tune the data being exported; for example:
Removing certain text using regex values (e.g. removing the $ symbol from currency)
Defining the format of the data being exported (e.g. dates, currency - has inbuilt regexes)
Description
Each column can be provided with a description
Use Replace Regexes
*Off by default*
If selected it will allow the use of Regexes on the cell value
A valid regex must be added to the field below this one
Replace Regexes
Regexes to use when cleaning the cell value
Will only function if above field is selected
Click the browse … button to launch the standard EzeScan Regex Window to allow the creation of a Regex to be used
Example: Remove unwanted spaces and characters from a column
Regex:
"[ /]",""
You need to have an understanding of how to create and use Regexes to use this function.
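A Replace Regex is a "replace this pattern with that text" pair, and several pairs can be applied one after another. The sketch below applies the pair from the example above (the pattern `[ /]` replaced with an empty string) using Python's `re.sub`. The function name, the second pair and the sample values are assumptions for illustration; EzeScan's own regex window may behave differently in edge cases.

```python
import re


def apply_replacements(value, pairs):
    """Apply each (pattern, replacement) pair to the cell value in order."""
    for pattern, replacement in pairs:
        value = re.sub(pattern, replacement, value)
    return value


# Pair from this guide: remove spaces and forward slashes from a code column.
code_pairs = [(r"[ /]", "")]
print(apply_replacements("AB 12/34", code_pairs))

# Hypothetical clean-up for a currency column: drop the symbol and thousands commas.
currency_pairs = [(r"[$,]", "")]
print(apply_replacements("$1,234.50", currency_pairs))
```

Cleaning a value this way before validation lets an otherwise valid amount pass the Format Type check even when the scan picked up stray characters.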
Validation… Format Type
Validation format to use for this column.
This will also apply the Clean-up Regex from the Processing tab
Select from…
Default is Any
Currency
Custom – allows a regex to be entered
Date
Decimal
Integer
Validation… Format Regex
A Regex code will be added depending on the Format Type selected in the field above.
Each option has a predefined Regex which cannot be altered, except for the Custom option.
Currency
^\(?([$£] *)?\d+(,\d\d\d)*\.\d\d\)$|^\(?([$£] *)?\d+(\.\d\d\d)*,\d\d\)$|^[+-]?([$£] *)?\d+(,\d\d\d)*\.\d\d$|^[+-]?([$£] *)?\d+(\.\d\d\d)*,\d\d$
Custom
Develop your own custom regex validation and paste it into this field
Date
^([12]\d|3[01]|0[1-9])(1[012]|0[1-9])(\d{4}|'?\d{2})$|^(1[012]|0[1-9])([12]\d|3[01]|0[1-9])(\d{4}|'?\d{2})$|^([12]\d|3[01]|0?[1-9]) ?(st|nd|rd|th)?( ?[-/\\\. ] ?)(1[012]|0?[1-9]|Jan[a-z]*|Feb[a-z]*|Mar[a-z]*|Apr[a-z]*|May|June?|July?|Aug[a-z]*|Sep[a-z]*|Oct[a-z]*|Nov[a-z]*|Dec[a-z]*)( ?[-/\\\., ] ?)(\d{4}|'?\d{2})$|^(1[012]|0?[1-9]|Jan[a-z]*|Feb[a-z]*|Mar[a-z]*|Apr[a-z]*|May|June?|July?|Aug[a-z]*|Sep[a-z]*|Oct[a-z]*|Nov[a-z]*|Dec[a-z]*)( ?[-/\\\., ] ?)([12]\d|3[01]|0?[1-9]),? ?(st|nd|rd|th)?( ?[-/\\\. ] ?)(\d{4}|'?\d{2})$
Decimal
^[+-]?\d+(,\d\d\d)*(\.\d+)?$|^[+-]?\d+(\.\d\d\d)*(,\d+)?$
Integer
^-?\d+$
Validation…Value Validation Type
Set to None by default.
There is one option which may be selected:
Static Value
Allows the selection of a validation type to be used when checking the cell value.
Validation… Value Validation Value
Applies when Static Value is selected.
This option can be used to validate against a static or calculated value using placeholders.
The cell value must match this value.
Value Validation Value - Click the browse … button to launch the edit window:
Placeholders can be selected from the dropdown at the top right; these come from the Output Data map and the KFI.
The calculation can be displayed in three formats:
Simple Calculation
Uses full placeholder names and basic formatting
Example: <<Quantity>>*<<Unit Price>>
Regular option
Uses contained notation and can allow for additional formatting
Example: <<="Quantity"*"Unit Price">>
Raw option
Uses basic placeholder references and contained notation
Example: <<=GO4*GO5>>
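To illustrate how a format check and a calculated static value work together, the Python sketch below checks a line total against quantity multiplied by unit price. It is an illustration only, not EzeScan's implementation: the function name validate_line_total is hypothetical, the Decimal pattern is the one listed under Format Regex, and the sketch assumes values use "." as the decimal point with optional "," thousands separators.

```python
import re
from decimal import Decimal

# Decimal format pattern as listed under "Validation…Format Regex".
DECIMAL_FORMAT = re.compile(
    r"^[+-]?\d+(,\d\d\d)*(\.\d+)?$|^[+-]?\d+(\.\d\d\d)*(,\d+)?$"
)

# Hypothetical helper: True when the cell passes the Decimal format check
# and equals quantity * unit price (the calculated "static value").
def validate_line_total(cell_value, quantity, unit_price):
    if not DECIMAL_FORMAT.match(cell_value):
        return False
    expected = Decimal(quantity) * Decimal(unit_price)
    # Assumption: "," is only ever a thousands separator in the cell.
    return Decimal(cell_value.replace(",", "")) == expected

print(validate_line_total("13.50", "3", "4.50"))  # True
print(validate_line_total("13.00", "3", "4.50"))  # False
```

Using Decimal rather than float avoids rounding differences when comparing currency values.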
Validation… Grid Bottom Pattern
Uses a regex to match the value in the cell that follows the last required grid row.
Example: An invoice grid always has the word Total in the cell after the last required grid row, and that cell is always included in the grid match.
A regex can be used to match that cell and end the grid there.
Grid Bottom Pattern regex: Total
This field is case sensitive.
Validation… Ignore Row Pattern
Requires a Regex value which will exclude the entire row from being displayed or output if the cell text matches the value specified in the Regex
Example: An invoice has extra delivery information in the description column that always starts with Deliver to
Ignore Row Pattern regex: Deliver to
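The combined effect of the Grid Bottom Pattern and Ignore Row Pattern can be pictured with the Python sketch below. It is not EzeScan code: the function name filter_rows and the row layout are hypothetical, and it assumes both patterns are tested against the same column and match anywhere in the cell text.

```python
import re

GRID_BOTTOM = re.compile(r"Total")      # case sensitive, as in the guide
IGNORE_ROW = re.compile(r"Deliver to")

# Hypothetical helper: keeps rows until the bottom pattern is seen and
# drops any row whose cell matches the ignore pattern.
def filter_rows(rows, column):
    kept = []
    for row in rows:
        cell = row[column]
        if GRID_BOTTOM.search(cell):
            break          # the grid ends here
        if IGNORE_ROW.search(cell):
            continue       # the whole row is excluded
        kept.append(row)
    return kept

rows = [
    {"Description": "Widget"},
    {"Description": "Deliver to 1 Main St"},
    {"Description": "Gadget"},
    {"Description": "Total"},
    {"Description": "Thank you for your business"},
]
print([r["Description"] for r in filter_rows(rows, "Description")])
# ['Widget', 'Gadget']
```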
Appendices
Examples of Commonly identified issues
Output files
Why does my line items output CSV file have tildes (~) at the end of each line?
This is most likely caused by the following settings on the Grid settings tab:
If the "Rows on new line" box is ticked, a tilde is appended to the values.
Ticking the "Remove output row delimiters" box removes the tilde from the output file.
Examples of Placeholders used by Advanced Settings
Field Placeholders
Field placeholders should be referenced by their field number in the KFI
E.g. Field 1 is <<F1>>
Grid Placeholders
Placeholder | Details |
<<GIx>> | Grid Input fields are referenced by their column number in the Input Document template. |
<<GOx>> | Grid Output fields are referenced by their column number in the Output Document template. They are calculated on output. |
<<GL>> | Grid Line returns the current grid line number. |
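As a rough illustration of how these placeholders relate to the data, the Python sketch below substitutes <<Fx>>, <<GIx>>, <<GOx>> and <<GL>> into a text template. It is not how EzeScan resolves placeholders internally; the function name resolve and its arguments are hypothetical, and numbering is assumed to start at 1 (Field 1 is <<F1>>).

```python
import re

PLACEHOLDER = re.compile(r"<<(F|GI|GO)(\d+)>>|<<(GL)>>")

# Hypothetical helper: replaces each placeholder with the matching value.
def resolve(template, fields, grid_in, grid_out, line_number):
    def lookup(match):
        if match.group(3):                       # <<GL>>
            return str(line_number)
        kind, index = match.group(1), int(match.group(2))
        source = {"F": fields, "GI": grid_in, "GO": grid_out}[kind]
        return source[index - 1]                 # placeholders are 1-based
    return PLACEHOLDER.sub(lookup, template)

print(resolve("<<F1>>-<<GL>>: <<GO2>>",
              fields=["INV001"], grid_in=[], grid_out=["Widget", "3"],
              line_number=2))
# INV001-2: 3
```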
Examples of Regexes used in the Invoice "Discovery" process
The examples below are just that: examples. They should work in most cases but may require fine-tuning or complete reworking to suit your organisation's requirements. If you are experiencing issues with any of the regex examples, please contact support@ezescan.com.au for assistance.
Tax Invoice
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Top 1/3 of Page |
Pre-processing | None |
Search settings… Content Advanced… Use Find Regex | Tax(ation)?\s?(\n)?\s?Invoice|Invoice |
Skip Content | None |
Pre-validation… Use Output Replace Regex | Replace - Tax.*Invoice","Invoice","^$ With - Check that document is an Invoice |
Validate words by | Ignore |
ABN
Profile #1 - ABN Content Search Adv
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Whole Page |
Pre-processing | None |
Search settings → Content Advanced → Use Find Regex | (?<=^|\s|[.,:; ])\d[ \-\.]?\d[ \-\.]?\d[ \-\.]?\d[ \-\.]?\d[ \-\.]?\d[ \-\.]?\d[ \-\.]?\d[ \-\.]?\d[ \-\.]?\d[ \-\.]?\d(?=\s|\\n|$) |
Skip Content → Use Skip Regex | ((tel(ephone)?|fax|phone|facsimilie|office|direct)[.,;; ]*|\+) ?\d+ |
Pre-validation → Use Output Replace Regex | "-","","\s","","(\d{2})(\d{3})(\d{3})(\d{3})","$1 $2 $3 $4" |
Validate words by | Australian Business Number (ABN) or None if using dummy invoices |
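The Output Replace Regex pairs in this profile, and an ABN check, can be tried outside EzeScan with the Python sketch below. The function names are hypothetical. The checksum shown is the published ABN algorithm (subtract 1 from the first digit, apply the weights, and test divisibility by 89); whether EzeScan's "Australian Business Number (ABN)" validation works exactly this way is an assumption.

```python
import re

ABN_WEIGHTS = (10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)

# Mirrors the three replace pairs: strip hyphens, strip whitespace,
# then regroup the 11 digits as 2-3-3-3.
def format_abn(raw):
    value = re.sub(r"-", "", raw)
    value = re.sub(r"\s", "", value)
    return re.sub(r"(\d{2})(\d{3})(\d{3})(\d{3})", r"\1 \2 \3 \4", value)

# Published ABN checksum: weighted sum must be divisible by 89.
def is_valid_abn(abn):
    digits = [int(ch) for ch in abn if ch.isdigit()]
    if len(digits) != 11:
        return False
    digits[0] -= 1
    return sum(d * w for d, w in zip(digits, ABN_WEIGHTS)) % 89 == 0

print(format_abn("51-824-753-556"))    # 51 824 753 556
print(is_valid_abn("51 824 753 556"))  # True
```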
Order Number
Profile #1 - Content Search
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Whole Page |
Pre-processing | None |
Search settings… Content Advanced… Use Find Regex | (?<=^|\s)((your|cust(omer)?|purchase|job|order).? *)?(order|ref(erence)?|job|agreement|p[.]?(o|w[.]?(o|(0|o)/No|work ((ref(erence))?.? *(number|nbr|no|#))) ?((ref(erence))?.? *(num(ber)?|nbr|no|#))? ?[.,:; ]{0,5} *([a-z,0-9/-])?\d{3}([0-9/-])?(?=[.,:;/]|\s|\n|$) |
Skip Content | None |
Pre-validation… Use Output Replace Regex | "(?<=^|\s)((your|cust(omer)?|purchase|job|order).? *)?(order|ref(erence)?|job|agreement|p[.]?(o|w[.]?(o|(0|o)/No|work ((ref(erence))?.? *(number|nbr|no|#))) ?((ref(erence))?.? *(number|nbr|no|#))? ?[.,:; ]{1,5} *","" |
Validate words by | Custom… |
Profile #2 - Term Search Right
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Whole Page |
Pre-processing | None |
Search Settings → Search Terms → Right of search term | order number:,order number,order no.:,order no:,order no.,order no,order #:,order #,order reference:,order reference,order ref.:,order ref.,order ref:,order ref,your order number,your order number:,your order no.:,your order no.,your order no:,your order no,your order reference,your order reference:,your order ref.:,your order ref.,your order ref:,your order ref,customer order number:,customer order number,customer order no.:,customer order no.,customer order no,customer order no:,customer order reference:,customer order reference,customer order ref.:,customer order ref:,customer order ref.,customer order ref,agreement no.:,agreement no.,agreement no:,agreement no,purchase order:,purchase order,your ref.:,your ref.,your ref:,your ref,your o/no.:,your o/no:,your o/no.,your o/no,work order number,work order #,work order no.,work order no,work number,work no.,work no,work #,job number,job no.,job no,job#,your no.,your no,PO Number,PO no:,PO No:,PO No,PO #:,PO #,P.O. Number,P.O. No.:,P.O. No.,P.O. No,P.O. #:,P.O. #,PO Number:,P.O. Number:,WO Number,WO No,WO #:,WO #,WO No:,W.O. Number,W.O. #,WO Number:,W.O. No.:,W.O. No.,W.O. No,W.O. Number:,W.O. #:,Order No:,Order No,PO :,P.O |
Skip Content…Skip strings | po box number,po box no.,po box no,po box #,po box,description,item |
Pre-validation | None |
Validate words by | Ignore |
Profile #3 - Term Search Below
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Whole Page |
Pre-processing | None |
Search Settings…Search Terms…Below search term | order number:,order number,order no.:,order no:,order no.,order no,order #:,order #,order reference:,order reference,order ref.:,order ref.,order ref:,order ref,your order number,your order number:,your order no.:,your order no.,your order no:,your order no,your order reference,your order reference:,your order ref.:,your order ref.,your order ref:,your order ref,customer order number:,customer order number,customer order no.:,customer order no.,customer order no,customer order no:,customer order reference:,customer order reference,customer order ref.:,customer order ref:,customer order ref.,customer order ref,agreement no.:,agreement no.,agreement no:,agreement no,purchase order:,purchase order,your ref.:,your ref.,your ref:,your ref,your o/no.:,your o/no:,your o/no.,your o/no,work order number,work order #,work order no.,work order no,work number,work no.,work no,work #,job number,job no.,job no,job#,your no.,your no,PO Number,PO no:,PO No:,PO No,PO #:,PO #,P.O. Number,P.O. No.:,P.O. No.,P.O. No,P.O. #:,P.O. #,PO Number:,P.O. Number:,WO Number,WO No,WO #:,WO #,WO No:,W.O. Number,W.O. #,WO Number:,W.O. No.:,W.O. No.,W.O. No,W.O. Number:,W.O. #:,Order No:,Order No |
Skip Content…Skip strings | po box number,po box no.,po box no,po box #,po box,description,item |
Pre-validation | None |
Validate words by | Ignore |
Invoice Date
Profile #1 - Content Search - Date with Months Labelled Same Separators
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to top ½ of page |
Pre-processing | None |
Search settings… Content Advanced… Use Find Regex | (?<=^|\s|\n)(Invoice|Due|Received|Pay(ment)?) ?(Date)? ?(Due|Received)? ?(By)? ?( ?[;:.,\- ] ?)?(([12]\d|3[01]|0?[1-9]) ?(st|nd|rd|th)?( ?([\-/\\\. ,]) ?)(Jan[a-z]*|Fe(b|v)[a-z]*|Mar[a-z]*|A(p|v)r[a-z]*|Ma(i|y)|Ju(ne?|in?)|Jui?ly?|A(out?|ug)[a-z]*|Sep[a-z]*|Oct[a-z]*|Nov[a-z]*|Dec[a-z]*)(\11)(\d{4}|'?\d{2})|(Jan[a-z]*|Fe(b|v)[a-z]*|Mar[a-z]*|A(p|v)r[a-z]*|Ma(i|y)|Ju(ne?|in?)|Jui?ly?|A(out?|ug)[a-z]*|Sep[a-z]*|Oct[a-z]*|Nov[a-z]*|Dec[a-z]*)( ?([\-/\\\. ]),? ?)([12]\d|3[01]|0?[1-9]),? ?(st|nd|rd|th)?(\27)(\d{4}|'?\d{2}))(?=T|[,]|\s|\n|$) |
Skip Content… Use Skip Regex | (Due|Received|Pay)[.,;; ]* ? |
Pre-validation… Use Output Replace Regex | "(?<=^|\s|\n)(Invoice|Due|Received|Pay(ment)?)? ?(Date)? ?(Due|Received)? ?(By)? ?( ?[;:.,\- ] ?)?","","USDATE^(1[012]|0?[1-9])( ?([\-/\\\. ]),? ?)([12]\d|3[01]|0?[1-9])( ?([\-/\\\. ,]) ?)(\d{4}|'?\d{2})","$4-$1-$7","USDATE^(1[012]|0[1-9])([12]\d|3[01]|0[1-9])(\d{4}|'?\d{2})","$2-$1-$3","USDATE^((20\d{2})( ?([\-/\\\. ,]) ?)(\d{1,2})( ?([\-/\\\. ,]) ?)(\d{1,2}))","$8-$5-$2","(Jan|Mar|Sep|Oct|Nov|Dec)[a-z]*","$1","Fe(b|v)[a-z]*","Feb","A(out?|ug)[a-z]*","Aug","A(p|v)r[a-z]*","Apr","Ma(i|y)","May","Ju(ne?|in?)","Jun","Jui?ly?","Jul" |
Validate words by | Date |
Profile #2: Content Search - Date Numbers Only Labelled Same Separator
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to top ½ of page |
Pre-processing… Use Input Replace Regex | "111111+","" |
Search settings… Content Advanced… Use Find Regex | (?<=^|\s|\n)(Invoice|Due|Received|Pay(ment)?) ?(Date)? ?(Due|Received)? ?(By)? ?( ?[;:.,\- ] ?)?((1[012]|0[1-9])([12]\d|3[01]|0[1-9])(\d{4}|'?\d{2})|([12]\d|3[01]|0[1-9])(1[012]|0[1-9])(\d{4}|'?\d{2})|([12]\d|3[01]|0?[1-9]) ?(st|nd|rd|th)?( ?([\-/\\\. ,]) ?)(1[012]|0?[1-9])(\17)(\d{4}|'?\d{2})|(1[012]|0?[1-9])( ?([\-/\\\. ]),? ?)([12]\d|3[01]|0?[1-9]),? ?(st|nd|rd|th)?(\23)(\d{4}|'?\d{2})|(20\d{2}( ?([\-/\\\. ,]) ?)\d{1,2}(\30)\d{1,2}))(?=T|[,]|\s|\n|$) |
Skip Content… Use Skip Regex | (Due|Received|Pay)[.,;; ]* ? |
Pre-validation… Use Output Replace Regex | "(?<=^|\s|\n)(Invoice|Due|Received|Pay(ment)?)? ?(Date)? ?(Due|Received)? ?(By)? ?( ?[;:.,\- ] ?)?","","USDATE^(1[012]|0?[1-9])( ?([\-/\\\. ]),? ?)([12]\d|3[01]|0?[1-9])( ?([\-/\\\. ,]) ?)(\d{4}|'?\d{2})","$4-$1-$7","USDATE^(1[012]|0[1-9])([12]\d|3[01]|0[1-9])(\d{4}|'?\d{2})","$2-$1-$3","USDATE^((20\d{2})( ?([\-/\\\. ,]) ?)(\d{1,2})( ?([\-/\\\. ,]) ?)(\d{1,2}))","$8-$5-$2","(Jan|Mar|Sep|Oct|Nov|Dec)[a-z]*","$1","Fe(b|v)[a-z]*","Feb","A(out?|ug)[a-z]*","Aug","A(p|v)r[a-z]*","Apr","Ma(i|y)","May","Ju(ne?|in?)","Jun","Jui?ly?","Jul" |
Validate words by | Date |
Profile #3: Content Search - Date with Months Same Separator
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to top ½ of page |
Pre-processing | None |
Search settings… Content Advanced… Use Find Regex | (?<=^|\s|\n)(Invoice|Due|Received|Pay(ment)?)? ?(Date)? ?(Due|Received)? ?(By)? ?( ?[;:.,\- ] ?)?(([12]\d|3[01]|0?[1-9]) ?(st|nd|rd|th)?( ?([\-/\\\. ,]) ?)(Jan[a-z]*|Fe(b|v)[a-z]*|Mar[a-z]*|A(p|v)r[a-z]*|Ma(i|y)|Ju(ne?|in?)|Jui?ly?|A(out?|ug)[a-z]*|Sep[a-z]*|Oct[a-z]*|Nov[a-z]*|Dec[a-z]*)(\11)(\d{4}|'?\d{2})|(Jan[a-z]*|Fe(b|v)[a-z]*|Mar[a-z]*|A(p|v)r[a-z]*|Ma(i|y)|Ju(ne?|in?)|Jui?ly?|A(out?|ug)[a-z]*|Sep[a-z]*|Oct[a-z]*|Nov[a-z]*|Dec[a-z]*)( ?([\-/\\\. ]),? ?)([12]\d|3[01]|0?[1-9]),? ?(st|nd|rd|th)?(\27)(\d{4}|'?\d{2}))(?=T|[,]|\s|\n|$) |
Skip Content… Use Skip Regex | (Due|Received|Pay)[.,;; ]* ? |
Pre-validation… Use Output Replace Regex | "(?<=^|\s|\n)(Invoice|Due|Received|Pay(ment)?)? ?(Date)? ?(Due|Received)? ?(By)? ?( ?[;:.,\- ] ?)?","","USDATE^(1[012]|0?[1-9])( ?([\-/\\\. ]),? ?)([12]\d|3[01]|0?[1-9])( ?([\-/\\\. ,]) ?)(\d{4}|'?\d{2})","$4-$1-$7","USDATE^(1[012]|0[1-9])([12]\d|3[01]|0[1-9])(\d{4}|'?\d{2})","$2-$1-$3","USDATE^((20\d{2})( ?([\-/\\\. ,]) ?)(\d{1,2})( ?([\-/\\\. ,]) ?)(\d{1,2}))","$8-$5-$2","(Jan|Mar|Sep|Oct|Nov|Dec)[a-z]*","$1","Fe(b|v)[a-z]*","Feb","A(out?|ug)[a-z]*","Aug","A(p|v)r[a-z]*","Apr","Ma(i|y)","May","Ju(ne?|in?)","Jun","Jui?ly?","Jul" |
Validate words by | Date |
Profile #4: Content Search - Date Numbers Only Same Separator
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to top ½ of page |
Pre-processing… Use Input Replace Regex | "111111+","" |
Search settings… Content Advanced… Use Find Regex | (?<=^|\s|\n)(Invoice|Due|Received|Pay(ment)?)? ?(Date)? ?(Due|Received)? ?(By)? ?( ?[;:.,\- ] ?)?((1[012]|0[1-9])([12]\d|3[01]|0[1-9])(\d{4}|'?\d{2})|([12]\d|3[01]|0[1-9])(1[012]|0[1-9])(\d{4}|'?\d{2})|([12]\d|3[01]|0?[1-9]) ?(st|nd|rd|th)?( ?([\-/\\\. ,]) ?)(1[012]|0?[1-9])(\17)(\d{4}|'?\d{2})|(1[012]|0?[1-9])( ?([\-/\\\. ]),? ?)([12]\d|3[01]|0?[1-9]),? ?(st|nd|rd|th)?(\23)(\d{4}|'?\d{2})|(20\d{2}( ?([\-/\\\. ,]) ?)\d{1,2}(\30)\d{1,2}))(?=T|[,]|\s|\n|$) |
Skip Content… Use Skip Regex | (Due|Received|Pay)[.,;; ]* ? |
Pre-validation… Use Output Replace Regex | "(?<=^|\s|\n)(Invoice|Due|Received|Pay(ment)?)? ?(Date)? ?(Due|Received)? ?(By)? ?( ?[;:.,\- ] ?)?","","USDATE^(1[012]|0?[1-9])( ?([\-/\\\. ]),? ?)([12]\d|3[01]|0?[1-9])( ?([\-/\\\. ,]) ?)(\d{4}|'?\d{2})","$4-$1-$7","USDATE^(1[012]|0[1-9])([12]\d|3[01]|0[1-9])(\d{4}|'?\d{2})","$2-$1-$3","USDATE^((20\d{2})( ?([\-/\\\. ,]) ?)(\d{1,2})( ?([\-/\\\. ,]) ?)(\d{1,2}))","$8-$5-$2","(Jan|Mar|Sep|Oct|Nov|Dec)[a-z]*","$1","Fe(b|v)[a-z]*","Feb","A(out?|ug)[a-z]*","Aug","A(p|v)r[a-z]*","Apr","Ma(i|y)","May","Ju(ne?|in?)","Jun","Jui?ly?","Jul" |
Validate words by | Date |
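The month clean-up pairs that end the Output Replace Regex lists in the date profiles can be tested on their own. The Python sketch below is an illustration only, not EzeScan code: the function name normalise_month is hypothetical, the pairs are applied in listed order, and "$1" is written as \1 because that is Python's backreference syntax.

```python
import re

# Month clean-up pairs, in the order they appear in the profiles.
MONTH_PAIRS = [
    (r"(Jan|Mar|Sep|Oct|Nov|Dec)[a-z]*", r"\1"),
    (r"Fe(b|v)[a-z]*", "Feb"),
    (r"A(out?|ug)[a-z]*", "Aug"),
    (r"A(p|v)r[a-z]*", "Apr"),
    (r"Ma(i|y)", "May"),
    (r"Ju(ne?|in?)", "Jun"),
    (r"Jui?ly?", "Jul"),
]

# Hypothetical helper: reduces month names to three-letter English forms.
def normalise_month(text):
    for pattern, replacement in MONTH_PAIRS:
        text = re.sub(pattern, replacement, text)
    return text

print(normalise_month("15 September 2021"))  # 15 Sep 2021
print(normalise_month("1 Fevrier 2020"))     # 1 Feb 2020
print(normalise_month("3 Juin 2019"))        # 3 Jun 2019
```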
Invoice Number
Profile #1: Content Search
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to top ½ of page |
Pre-processing | None |
Search settings… Content Advanced… Use Find Regex | (?<=^|\s)((inv(oice)|doc(ument)?|tax|transaction)(\.)? ?(n(a|o|br|umber)|#)?|(tax )?invoice) ?[\W]{0,4} *[A-Z0-9/-\.]\d[A-Z0-9/-\.]+(?=\s|\n|$) |
Skip Content… Use Skip Regex | (cust(omer)?|order|contract|acc(ount)?|del(ivery)?|vat|gst|our|your).? *(reg(istration)?|ref(erence)?|note)? *(number|nbr|no|#) |
Pre-validation… Use Output Replace Regex | "(?<=^|\s)((inv(oice)?|doc(ument)?|tax).? *(number|nbr|no|#)|number|(tax )?invoice) ?(no)? ?((copy))? ?[:;\.-•, ]{0,4} *","","\.","" |
Validate words by | Custom | [0-9]{3,}[A-Z]?)$ |
Profile #1a: Content Search - Alt Validation
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to top ½ of page |
Pre-processing | None |
Search settings… Content Advanced… Use Find Regex | (?<=^|\s)((inv(oice)|doc(ument)?|tax|transaction)(\.)? ?(n(a|o|br|umber)|#)?|(tax )?invoice) ?[\W]{0,4} *[A-Z0-9/-\.]\d[A-Z0-9/-\.]+(?=\s|\n|$) |
Skip Content… Use Skip Regex | (cust(omer)?|order|contract|acc(ount)?|del(ivery)?|vat|gst|our|your).? *(reg(istration)?|ref(erence)?|note)? *(number|nbr|no|#) |
Pre-validation… Use Output Replace Regex | "(?<=^|\s)((inv(oice)?|doc(ument)?|tax).? *(number|nbr|no|#)|number|(tax )?invoice) ?(no)? ?((copy))? ?[:;\.-•, ]{0,4} *","","\.","" |
Validate words by | Custom… | [0-9]{3,}[A-Z]?)$ |
Profile #2: Term Search
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to top ½ of page |
Pre-processing | None |
Search Settings…Search Terms…Below search term | invoice number:,invoice number,invoice no:,invoice no.,invoice #:,invoice#:,invoice:,invoice #,Tax Invoice,tax invoice (copy),bill,number,invoice |
Skip Content…Skip strings | PO number,P.O. number,order number,abn number |
Pre-validation | None |
Validate words by | Custom… | [0-9]{3,}[A-Z]?)$ |
Gross Total
Profile #1: Content Adv. Search
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Bottom ½ of Page |
Pre-processing… Use Input Replace Regex | "(?<=\s|\n|\$|^)(\d{1,3})[ \.](\d{3})[ \.](\d{3})[ \.](\d{3})[\,\.](\d{2})(?=\n|\s|$)","$1$2$3$4.$5","(?<=\s|\n|\$|^)(\d{1,3})[ \.](\d{3})[\.](\d{3})[\,\.](\d{2})(?=\n|\s|$)","$1$2$3.$4","(?<=\s|\n|\$|^)(\d{1,3})[ \.](\d{3})[\,\.](\d{2})(?=\n|\s|$)","$1$2.$3" |
Search settings… Content Advanced… Use Find Regex | (?<=^|\s|/)(tax included in)? ?((gross|inv(oice)?|total|ttl) ?(()? ?(amount|inc.?((lud(ing|es))|lusive of)? ?([ghpq](\.)?s(\.)?t(\.)?|v(\.)?a(\.)?t(\.)?)[:\.\,; ]?|value|total|GST inc\.?(lu(sive|ded))?|total ?(this tax invoice)?|total invoice)? ?())?)( ?[^\w\s]{1,4})?( ?C ?(A ?(D|N))?)? ?(\n)?( ?\$)? ?([1-9]{1}\d{0,2}([ ,]?\d{3}){0,2}|\d{1})[\.,]\d{2}(?=\,|\s|\$|$) |
Skip Content…Use Skip Regex | (sub *(amount|total)|total *(g(\.)?s(\.)?t(\.)?|v(\.)?a(\.)?t(\.)?)|(g(\.)?s(\.)?t(\.)?|v(\.)?a(\.)?t(\.)?) ?\w* ? (amount|total))|amount due |
Pre-validation… Use Output Replace Regex | "\s","","\n","","[\d]+$","","[a-z\s\W]*","",",(\d{2})$",".$1",",","" |
Validate words by | Custom… | \d{0,2}(,\d{3}){1,3}) | \d)\.\d\d$ |
Profile #1a: Content Adv. Search - Without decimals
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Bottom ½ of Page |
Pre-processing… Use Input Replace Regex | "(?<=\s|\n|\$|^)(\d{1,3})[ \.](\d{3})[ \.](\d{3})[ \.](\d{3})[\,\.](\d{2})(?=\n|\s|$)","$1$2$3$4.$5","(?<=\s|\n|\$|^)(\d{1,3})[ \.](\d{3})[\.](\d{3})[\,\.](\d{2})(?=\n|\s|$)","$1$2$3.$4","(?<=\s|\n|\$|^)(\d{1,3})[ \.](\d{3})[\,\.](\d{2})(?=\n|\s|$)","$1$2.$3" |
Search settings… Content Advanced… Use Find Regex | (?<=^|\s|/)(tax included in)? ?((gross|inv(oice)?|total|ttl) ?(()? ?(amount|inc.?((lud(ing|es))|lusive of)? ?([ghpq](\.)?s(\.)?t(\.)?|v(\.)?a(\.)?t(\.)?)[:\.\,; ]?|value|total|GST inc\.?(lu(sive|ded))?|total ?(this tax invoice)?|total invoice)? ?())?)( ?[^\w\s]{1,4})?( ?C ?(A ?(D|N))?)? ?(\n)?( ?\$)? ?([1-9]{1}\d{0,2}([ ,]?\d{3}){0,2}|\d{1})[\.,]?\d{2}(?=\,|\s|\$|$) |
Skip Content…Use Skip Regex | (sub *(amount|total)|total *(g(\.)?s(\.)?t(\.)?|v(\.)?a(\.)?t(\.)?)|(g(\.)?s(\.)?t(\.)?|v(\.)?a(\.)?t(\.)?) ?\w* ? (amount|total))|amount due|tax included in |
Pre-validation… Use Output Replace Regex | "\s","","\n","","[\d]+$","","[a-z\s\W]*","",",(\d{2})$",".$1",",","" |
Validate words by | Custom… ^\$?([1-9](\d+|\d{0,2}(,\d{3}){1,3})|\d)(\.\d\d)?$ |
Profile #2: Term Search Below
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Whole Page |
Pre-processing… Use Input Replace Regex | "(?<=\s|\n|\$|^)(\d{1,3})[ \.](\d{3})[ \.](\d{3})[ \.](\d{3})[\,\.](\d{2})(?=\n|\s|$)","$1$2$3$4.$5","(?<=\s|\n|\$|^)(\d{1,3})[ \.](\d{3})[\.](\d{3})[\,\.](\d{2})(?=\n|\s|$)","$1$2$3.$4","(?<=\s|\n|\$|^)(\d{1,3})[ \.](\d{3})[\,\.](\d{2})(?=\n|\s|$)","$1$2.$3","Total Mount","Total Amount" |
Search Settings…Search Terms…Below search term | total new charges due,invoice total,gross total,total amount,total,amount |
Skip Content…Skip strings | sub total,gst total,tax total,gst,ex gst,UST |
Pre-validation… Use Output Replace Regex | "\s","","\n","","[\d]+$","","[a-z\s\W]*","",",(\d{2})$",".$1",",","" |
Validate words by | Currency |
Profile #2a: Term Search Right
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Whole Page |
Pre-processing… Use Input Replace Regex | "(?<=\s|\n|\$|^)(\d{1,3})[ \.](\d{3})[ \.](\d{3})[ \.](\d{3})[\,\.](\d{2})(?=\n|\s|$)","$1$2$3$4.$5","(?<=\s|\n|\$|^)(\d{1,3})[ \.](\d{3})[\.](\d{3})[\,\.](\d{2})(?=\n|\s|$)","$1$2$3.$4","(?<=\s|\n|\$|^)(\d{1,3})[ \.](\d{3})[\,\.](\d{2})(?=\n|\s|$)","$1$2.$3","Total Mount","Total Amount" |
Search settings…Search Terms…Right of search term | total new charges due,invoice total,gross total,total amount,total,amount |
Skip Content…Skip strings | sub total,gst total,tax total,gst,ex gst,UST |
Pre-validation… Use Output Replace Regex | "\s","","\n","","[\d]+$","","[a-z\s\W]*","",",(\d{2})$",".$1",",","" |
Validate words by | Currency |
Invoice Items - Gross
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Bottom ½ of Page |
Pre-processing | None |
Search settings…Invoice Items | Gross Total |
Skip Content | None |
Pre-validation | None |
Validate words by | Ignore |
Profile #3: Word Position - Last Currency
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Whole Page |
Pre-processing… Use Input Replace Regex | "(?<=\s|\$|^)(\d{1,3})[ \.](\d{3})[ \.](\d{3})[ \.](\d{3})[\,\.](\d{2})(?=\s|$)","$1$2$3$4.$5","(?<=\s|\$|^)(\d{1,3})[ \.](\d{3})[\.](\d{3})[\,\.](\d{2})(?=\s|$)","$1$2$3.$4","(?<=\s|\$|^)(\d{1,3})[ \.](\d{3})[\,\.](\d{2})(?=\s|$)","$1$2.$3" |
Search settings…Word Position | Word position Y= From bottom |
Skip Content…Use Skip Regex | (sub *(amount|total)|total *(g(\.)?s(\.)?t(\.)?|v(\.)?a(\.)?t(\.)?)|(g(\.)?s(\.)?t(\.)?|v(\.)?a(\.)?t(\.)?) ?\w* ? (amount|total))|amount due|tax included in|gst included in new charges |
Pre-validation… Use Output Replace Regex | "\s","","\n","","[\d]+$","","[a-z\s\W]*","",",(\d{2})$",".$1",",","" |
Validate words by | Ignore |
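The Input Replace Regex pairs used by the Gross Total profiles convert amounts written with space or dot thousands separators (for example 1.234,56) into a plain decimal form. The Python sketch below illustrates the idea and is not EzeScan code. Two adaptations are assumptions made for Python: the lookbehind is rewritten as (?:(?<=[\s$])|^) because Python's re module requires fixed-width lookbehinds, and "$1" style references become \1. The function name normalise_amount is hypothetical.

```python
import re

PREFIX = r"(?:(?<=[\s$])|^)"   # start of text, or after whitespace or "$"
SUFFIX = r"(?=\s|$)"           # followed by whitespace or end of text

# Longest pattern first, mirroring the order of the pairs in the profile.
AMOUNT_PAIRS = [
    (PREFIX + r"(\d{1,3})[ .](\d{3})[ .](\d{3})[ .](\d{3})[,.](\d{2})" + SUFFIX,
     r"\1\2\3\4.\5"),
    (PREFIX + r"(\d{1,3})[ .](\d{3})[.](\d{3})[,.](\d{2})" + SUFFIX,
     r"\1\2\3.\4"),
    (PREFIX + r"(\d{1,3})[ .](\d{3})[,.](\d{2})" + SUFFIX,
     r"\1\2.\3"),
]

# Hypothetical helper: applies each pair in turn to the OCR text.
def normalise_amount(text):
    for pattern, replacement in AMOUNT_PAIRS:
        text = re.sub(pattern, replacement, text)
    return text

print(normalise_amount("Total 1.234,56"))   # Total 1234.56
print(normalise_amount("$1.234.567,89"))    # $1234567.89
print(normalise_amount("1,234.56"))         # 1,234.56 (left unchanged)
```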
Tax Total
Profile #1: Content Adv. Search GST
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Whole Page |
Pre-processing… Use Input Replace Regex | "(3ST","GST","G(\.| )?S(\.| )?T(\.)?","GST","goods and services tax","GST","SALES TAX","GST","Tax","GST","GST\s","GST " |
Search settings… Content Advanced… Use Find Regex | (?<=^|\s|/)((total)? ?(GST|goods (&|and) services tax) ?(included|amount|total)? ?(included)? ?(of|amount)?)( ?[-\.\,\:\;\s]{1,4})?( ?A(UD)?)? ?(\n)?( ?\$)? ?([1-9]{1}\d{0,2}(,?\d{3}){0,2}|\d{1})[\.,]\d{2}(?=\s|\n|$) |
Skip Content…Use Skip Regex | total (in|ex)c(lu(sive|ding))? (of)? ?([GHPQ](\.)?S(\.)?T(\.)?|tax)|(in|ex)c(lu(sive|ding))? (of)? ?([GHPQ](\.)?S(\.)?T(\.)?|tax)|EX Tax Total |
Pre-validation… Use Output Replace Regex | "\s","","\n","","[\d]+$","","[a-z\s\W]*","",",(\d{2})$",".$1",",","" |
Validate words by | Custom… | \d{0,2}(,\d{3}){1,3}) | \d)\.\d\d$ |
Profile #2: Invoice Item
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Whole Page |
Pre-processing | None |
Search settings… Invoice Items | Tax Total |
Skip Content | None |
Pre-validation | None |
Validate words by | Currency |
Profile #3: Term Search
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Whole Page |
Pre-processing | None |
Search Settings…Search Terms…Below, Left & Right of search term | gst included in new charges,gst total,gst value,TOTAL GST PAYABLE,TOTAL GST 10%,total gst,total tax,tax total,gst,g.s.t.,SALES TAX |
Skip Content…Skip strings | including gst,inc gst,excluding gst,ex gst,freight |
Pre-validation… Use Output Replace Regex | "\s","","\n","","[\d]+$","","[a-z\s\W]*","",",(\d{2})$",".$1",",","" |
Validate words by | Custom… | \d{0,2}(,\d{3}){1,3}) | \d)\.\d\d$ |
Net Total
Profile #1: Content Adv. Search Net
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Whole Page |
Pre-processing | None |
Search settings… Content Advanced… Use Find Regex | (?<=^|\s|[|(|/)((((((S(ub)?) ?-? ?)?Total|Price|Amount) ?(()? ?(((not incl?(u(ding|sive))? ?(of)? ?(tax|g\.?s\.?t\.?)))|(ex(cl?(usive|uding)?[\.-]?)? ?(of)? ?(tax|g\.?s\.?t\.?))) ?())?)|((Nett?|S(ub)?|Sales?|(I)?Ex Tax|(tax ex(cl?(usive)?[\.|amount|price))|Nett?) ?(of)? ?[ :;,\.-]{0,4} ?(\n)? ?((? ?C[\. ]?(A[\. ]?D\.?)? ?)?)?( ?\$|s)? ?([1-9]{1}\d{0,2}([\s,]?\d{3}){0,2}|\d{1}) ?[,\.] ?\d{2})(?=1||\s|\n|\$|$) |
Skip Content… Use Skip Regex | total (in|ex)c(lu(sive|ding))? (of)? ?(G(\.)?S(\.)?T(\.)?|tax)|(in|ex)c(lu(sive|ding))? (of)? ?(G(\.)?S(\.)?T(\.)?|tax) |
Pre-validation… Use Output Replace Regex | "\s","","\n","","[\d]+$","","[a-z\s\W]*","",",(\d{2})$",".$1",",","" |
Validate words by | Custom… | \d{0,2}(,\d{3}){1,3}) | \d)\.\d\d$ |
Profile #2: Invoice Items Net
Select Tab | Use this… |
---|---|
Search Zone Size | Expand to Whole Page |
Pre-processing | None |
Search settings… Invoice Items | Net Total |
Skip Content | None |
Pre-validation | None |
Validate words by | Ignore |