
Configuring AI-Powered Document Classification in WebApps

1. Overview

The Document Classification Assistant (DCA) classifies documents according to each customer's requirements. It extends document management in your existing web applications by sorting and categorizing documents automatically, using models trained on the customer's own document types.

2. Integration and Setup

2.1 Installation

The DCA is installed as part of the Web Apps Install package but is licensed separately from RIA. Until it is configured and trained on the customer’s specific document types, it runs in a minimal, untrained state ("brain dead").

2.2 Configuration

  1. Go to the App Pages tab, select App Page from the green drop-down button, then select DCAPage - Document Classification Assistant.

  2. Give your DCA page a Name and Description.

  3. To add a classification type, go to the Categories tab and select + Add.

  4. Name the classification type (e.g. fallen tree).

  5. Under Queue Selection, select the Page you would like to send classified documents to. You have three options:

| Choice | Explanation |
|---|---|
| Simple | Allows you to select a pre-configured RIA Page from the drop-down menu |
| Manual | Allows you to manually enter either a page name or ID |
| Ignore | Allows you to not select a page |

  6. You must then select a Queue Type using one of the following options:

| Choice | Explanation |
|---|---|
| Simple | Allows you to select a queue type from the drop-down menu |
| Manual | Allows you to manually enter either a queue type name or ID |
| Ignore | Allows you to not select a queue type |

  7. You must then select a Queue using one of the following options:

| Choice | Explanation |
|---|---|
| Simple | Allows you to select a queue from the drop-down menu |
| Manual | Allows you to manually enter either a queue name or ID |
| Ignore | Allows you to not select a queue |

  8. To add another classification type, select + Add. It is recommended that you start training with at least three different classification types.

  9. On the Schedule tab, choose whether to train automatically on a schedule. When this is enabled, the engine trains on processed documents each night at midnight.

  10. Save your engine.

 

Tip: To choose which days the engine trains, go to the Scheduling tab on the Admin Toolbar. Turn the toggle on to modify the scheduled training, select your training days and training time, and then click Save.
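To make the scheduling behaviour concrete, the following is a minimal sketch of how the next training run could be worked out from the selected training days and training time. It is an illustration only, not the product's scheduler: the function name `next_training_run` and the weekday numbering (Monday = 0) are assumptions made for this example, and the default of midnight on every day mirrors the nightly schedule described above.

```python
from datetime import datetime, time, timedelta

ALL_DAYS = frozenset(range(7))  # Monday=0 .. Sunday=6 (assumed numbering)


def next_training_run(now, training_days=ALL_DAYS, training_time=time(0, 0)):
    """Return the first scheduled training time strictly after `now`.

    Hypothetical helper: it only illustrates the "training days plus
    training time" idea and does not call any product API.
    """
    if not training_days:
        raise ValueError("select at least one training day")
    # Looking 0..7 days ahead always reaches the next selected weekday,
    # including the case where today's slot has already passed.
    for offset in range(8):
        day = (now + timedelta(days=offset)).date()
        candidate = datetime.combine(day, training_time)
        if day.weekday() in training_days and candidate > now:
            return candidate
    raise AssertionError("unreachable: a selected weekday occurs within 7 days")


# Usage: nightly schedule, asked on a Monday afternoon.
print(next_training_run(datetime(2024, 10, 21, 15, 0)))
```

With only one training day selected, the same call returns that weekday's next occurrence, which may be up to a week away.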


 

2.3 Training Your Engine

  1. To open your DCA engine, go to your home page and click the Classifier tile.

  2. This takes you to a screen where you can train and test the engine on any classification categories you have created. The numbers on the left of the table show how many documents have been trained for each classification category (currently 0).

  3. To train the engine, select the Classify button.

  4. Select a category from the drop-down menu, then select Upload to upload a collection of sample documents that you want the engine to learn to identify as that category (in this example, 20 different complaint letters about fallen trees were uploaded). Select Classify.

  5. The table will now show the number of documents classified, but you will need to select “Rebuild” to actually train the engine on these documents.

Note: You will not be able to train or “Rebuild” the engine until you have classified documents into at least two different classification categories.
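The rebuild precondition in the note above can be expressed as a small check. This is a sketch under stated assumptions: `can_rebuild` is a hypothetical helper, the per-category counts stand in for the numbers shown in the training table, and the category name "noise complaint" is invented for the example.

```python
def can_rebuild(classified_counts):
    """Return True when at least two categories have classified documents.

    `classified_counts` maps a category name to the number of documents
    classified into it (hypothetical data shape, not a product API).
    """
    categories_with_documents = [
        name for name, count in classified_counts.items() if count > 0
    ]
    return len(categories_with_documents) >= 2


# Only one category has samples, so a rebuild would not be available yet.
print(can_rebuild({"fallen tree": 20, "noise complaint": 0}))
# Two categories have samples, so the engine could be rebuilt.
print(can_rebuild({"fallen tree": 20, "noise complaint": 15}))
```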

 

  6. Once you have rebuilt the engine, you can test it on a document by selecting “Predict” and uploading a document.

  7. The engine returns a confidence score for each classification category, showing how likely it considers the uploaded document to belong to that category. Click the “x” to return to the training screen.

     

2.4 Linking Your DCA Page to a RIA Sorting Page

  1. Now that you have created your DCA engine, create a new RIA Page to receive the documents for classification. It is recommended that the name of this RIA Page includes the word “Sorter”.

  2. Add a Queue Type. Click Add Queue Type, select Shared or Personal, give your queue a name and select + Add Queue.

  3. Go to the Pre-Processing tab and, from the + Create New drop-down menu, select Classify Document.

  4. Give the stage a Name and Description.

  5. Scroll down to the Stage Settings section and, from the Action drop-down menu, select Predict.

  6. Select your DCA engine from the drop-down menu.

  7. Enter a Minimum Confidence Level and a Confidence Difference Level.

| Field | Description | Example |
|---|---|---|
| Minimum Confidence Level | The minimum confidence the most confident classification must reach, regardless of the percentages for the other categories. | 80 |
| Confidence Difference Level | The minimum gap required between the most confident classification and the other classifications for the most confident one to be considered valid. | 60 |

  8. If you would like documents that meet the above confidence levels to be automatically routed to the correct RIA queue, turn on Route successful prediction. Documents that don't meet these requirements will remain in the sorter for manual sorting.

  9. Click Apply.

  10. Go to the Fields tab and add a field called Classification Info.

Note: When you created the pre-processor earlier, its Target Metadata ID field was set to classification by default. Because of this, the Classification Info field shows the confidence levels for the classification categories when a document is stuck in the sorter.
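As an illustration of how the Minimum Confidence Level and Confidence Difference Level could work together, here is a minimal sketch. It rests on assumptions rather than on the product's actual implementation: scores are treated as percentages, the difference is measured against the second most confident category, both thresholds are inclusive, and the function name `route_decision` and the sample category names are hypothetical.

```python
def route_decision(scores, min_confidence=80.0, confidence_difference=60.0):
    """Return the category to route to, or None to leave it in the sorter.

    `scores` maps category name -> confidence percentage (assumed shape).
    The defaults match the example values in the table above.
    """
    if not scores:
        return None
    # Rank categories from most to least confident.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    top_category, top_score = ranked[0]
    runner_up_score = ranked[1][1] if len(ranked) > 1 else 0.0
    meets_minimum = top_score >= min_confidence
    meets_difference = (top_score - runner_up_score) >= confidence_difference
    return top_category if meets_minimum and meets_difference else None


# Clear winner: 92 >= 80 and 92 - 5 = 87 >= 60, so it would be routed.
print(route_decision({"fallen tree": 92, "noise complaint": 5, "parking": 3}))
# High score but the gap is only 45, so it would stay for manual sorting.
print(route_decision({"fallen tree": 85, "noise complaint": 40}))
```

With the example values of 80 and 60, a document is only routed automatically when one category clearly dominates; lowering the difference level makes routing more permissive.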


 

  11. Before using the sorter, add a Route button so that documents that haven't reached the confidence level can be routed manually. Go to the Actions tab, open the + Create New drop-down and select From Template.

  12. Select Route from the list of actions.

  13. Go to the Frontend tab and select an option from the Page Restrictions drop-down.

| Option | Description |
|---|---|
| This Page | Users will only be able to route documents to the current page |
| Select Pages | This is set to All Pages by default. To select custom pages, click the All Pages toggle off and select pages individually from the drop-down menu. Selected pages will appear as tags below the field. To remove one, click the x. |
| Pages in current app group | Users can only route documents to pages in the app group that the sorter is part of. |
| Pages the user can access | Users will only be able to route documents to Queues they can access |
| Pages in a specific App Page Group | Users can route documents to pages in a specific app group |

2.5 Using the Sorter

 

  1. Any documents that arrive in the council sorter are sent to the DCA for classification. If a document meets the configured confidence level requirements, it is automatically routed to the correct workflow. If it does not meet the requirements, it remains in the queue for human intervention.

  2. When you view the document, you can see what the engine has predicted the file to be. To send the document to the correct workflow, click Route.

  3. Select the correct page and queue, then select Route. The document will then appear in that queue for indexing.

Is it possible to set up the software to train on new documents as they come in?

Yes, with some modifications to the setup:

  • Configure the page so that submitting a document in a queue adds it to the training data.

  • Add a stage to the RIA Page that runs when documents are submitted and classifies them, and select the classification page and category.

  • When a document is submitted, it is sent to the engine as an example of what that type of document looks like.

  • In the admin screen, a submitted document is not processed by the engine until the training runs at night, but you can go in and “rebuild the model” yourself. Processed documents show a green tick. You can reclassify or delete incorrect documents.
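The feedback loop in the list above can be pictured with a small in-memory model. This is a sketch only: the class and method names (`TrainingSample`, `TrainingSet`, `submit`, `reclassify`, `delete`, `rebuild`) are hypothetical, the `processed` flag stands in for the green tick, and the choice to mark a reclassified document as unprocessed again is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class TrainingSample:
    document_id: str
    category: str
    processed: bool = False  # stands in for the green tick in the admin screen


@dataclass
class TrainingSet:
    samples: list = field(default_factory=list)

    def submit(self, document_id, category):
        """Add a submitted document as a pending example of `category`."""
        self.samples.append(TrainingSample(document_id, category))

    def reclassify(self, document_id, category):
        """Correct a sample's category; assume it must be trained again."""
        for sample in self.samples:
            if sample.document_id == document_id:
                sample.category = category
                sample.processed = False

    def delete(self, document_id):
        """Remove an incorrect sample from the training data."""
        self.samples = [s for s in self.samples if s.document_id != document_id]

    def rebuild(self):
        """Process all pending samples, as a nightly or manual rebuild would."""
        pending = [s for s in self.samples if not s.processed]
        for sample in pending:
            sample.processed = True
        return len(pending)


training = TrainingSet()
training.submit("doc-1", "fallen tree")   # pending until the next rebuild
print(training.rebuild())                 # number of samples just processed
```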

 

 


3. Email Imports

In version 3.11 and higher, you can connect a mailbox to import documents into a RIA page.

3.1 Setting Up an Email Mailbox

  1. Go to the Connections Management tab on the Admin Menu, open the External Connection tab and select + Add.

  2. Select Microsoft Graph from the drop-down.

  3. Give the mailbox connection a name and description, select Inbuilt as the Authentication Type and select the Authorise button.

  4. If the authorization is successful, you will get a confirmation message.

  5. If you get a ‘Pending Authorization’ tag, you will need to log a support ticket to have the authorization issue resolved.

3.2 Adding Email Import to your Document Sorter

  1. To add emails to your document sorter, go to your document sorter page > Imports > Create New.

  2. Select Microsoft Email Import from the drop-down menu.

  3. You can set a frequency for the import to run. This is set to 1 by default.

  4. On the Import tab, select the External Connection you created from the drop-down menu.

  5. Enter the email address of the mailbox or shared mailbox you wish to download items from.

  6. Use the look-up to select the email source folder to import from.

  7. Select a clean-up method:

| Clean-up method | Explanation |
|---|---|
| Delete File | The file will be deleted from the inbox once it has been imported. |
| Move Email | The email will be moved to the destination once it has been imported. Select the email folder to move the processed emails to. |

  8. On the Output tab, select the queue type and queue you want to output the emails to. Select Save.

| Choice | Explanation |
|---|---|
| Simple | Allows you to select a queue / queue type from the drop-down menu |
| Manual | Allows you to manually enter either a queue / queue type name or ID |
| Ignore | Allows you to not select a queue / queue type |

  9. For the emails to be classified, they need to be converted from .eml files to PDFs by a pre-processor. Go to the Pre-Processing tab, select + Create New and select File Converter.

  10. In the Stage Settings, ensure Convert Emails is toggled on. Click Apply.

  11. Check that the file conversion pre-processor is first in the list so that it runs before the classification.
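The ordering requirement can be sketched as a simple check over the list of pre-processing stages. This is only an illustration: `converter_runs_before_classifier` is a hypothetical helper, and it assumes the stages are identified by the stage types named in this guide (File Converter and Classify Document) and listed in the order they run.

```python
def converter_runs_before_classifier(stages):
    """Check that File Converter comes before Classify Document.

    `stages` is a list of stage type names in execution order
    (assumed representation; no product API is involved).
    """
    if "Classify Document" not in stages:
        return True  # nothing to classify, so ordering does not matter
    if "File Converter" not in stages:
        return False  # emails would reach the classifier unconverted
    return stages.index("File Converter") < stages.index("Classify Document")


# Correct order: emails are converted to PDF before classification.
print(converter_runs_before_classifier(["File Converter", "Classify Document"]))
# Wrong order: classification would run on the unconverted email.
print(converter_runs_before_classifier(["Classify Document", "File Converter"]))
```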
