Creating a Pattern

Before You Start

The Scanned Document Processor service must be running for scanned documents to be processed. See Service Settings for information about how to start this service.

The Transform Client Plug-In must be installed to view the scanned document and edit a pattern.

To submit a sample scanned document for use in the following examples, a branch file developer must create and run a branch that uses the OCR Store Scanned Document object to send a sample file to the OCR engine for scanning. The following examples use the sample invoice file, <root>\DOL.fsr\DOL\help\Sample_Invoice.pdf.

The branch can be configured in any of the following ways:

Starting the Pattern Editing Application

The route to the pattern editing application depends on the history of the document to be used as a template:

Creating a Pattern

Note

A document can match more than one pattern or index search. Because the OCR Pack does not check patterns and index searches in the same order each time it processes a document, similar documents can be matched to different patterns or index searches. Be careful to create patterns or index searches that discriminate correctly between different types of document.
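The overlap problem described in this note can be illustrated with a minimal sketch. It is not part of the OCR Pack and uses no product API: the pattern names, the sample text, and the ambiguous_matches function are all hypothetical, and a pattern is modeled simply as a test on the page text. The sketch reports every sample that more than one pattern would accept, which are the cases where the result could depend on checking order.

```python
from typing import Callable, Dict, List

# Hypothetical model: a pattern is a name plus a predicate over the page text.
Matcher = Callable[[str], bool]


def ambiguous_matches(
    samples: Dict[str, str], patterns: Dict[str, Matcher]
) -> Dict[str, List[str]]:
    """Return each sample accepted by more than one pattern, with the pattern names."""
    result: Dict[str, List[str]] = {}
    for sample_name, text in samples.items():
        hits = sorted(name for name, matches in patterns.items() if matches(text))
        if len(hits) > 1:
            result[sample_name] = hits
    return result


# Illustrative patterns: the first is too general and overlaps the second.
patterns: Dict[str, Matcher] = {
    "Any_Invoice": lambda text: "Invoice" in text,
    "Acme_Invoice": lambda text: "Acme Ltd" in text and "Invoice" in text,
}

# Illustrative page text for two made-up sample documents.
samples = {
    "acme.pdf": "Acme Ltd Invoice 1001",
    "other.pdf": "Widget Co Invoice 2002",
}

print(ambiguous_matches(samples, patterns))
# {'acme.pdf': ['Acme_Invoice', 'Any_Invoice']}
```

In this sketch, narrowing Any_Invoice so that it no longer accepts the Acme document would remove the ambiguity.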

When you click the Edit Pattern button to open the pattern editing application, the page displays the controls for creating or editing a pattern, and the left pane displays the following buttons:

By default, the controls for Index Data are displayed first.

For each normal index associated with the category, a red box is displayed as an overlay on the scanned image, and labeled with the name of the index.

In the example shown, boxes are displayed for the Invoice Date and the Invoice Number indexes created as described in Creating a Scanning Category.

To create a pattern:

  1. Drag and resize the red overlay boxes so that they enclose the area containing the relevant data.

    This is an example of static text on a page. Every time this pattern is matched, it extracts the data at these two locations for the category's two index fields.

  2. Click Identify Pattern.

    Identify Pattern provides a single box named Anchor. Drag and resize this box to surround text that appears in the same position on this type of document but not on any other documents.

    In this example, the pattern is to identify invoices from a specific supplier. Position the box around the name of the supplier.

    Whenever this text appears in this position on a page, the pattern matches.

  3. Enter a Pattern Name.

  4. Click Finish to save the pattern.

  5. A dialog appears, asking whether you want to check the pattern against existing samples. Click No.
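
The idea behind the steps above (an Anchor box that identifies the document type, plus one box per index that marks where to read the data) can be sketched as follows. This is a conceptual illustration only, not the OCR Pack's implementation or API. The Box, Word, and Pattern classes, the coordinates, and the sample values are all assumptions made for the example; it also assumes the OCR step yields words with bounding boxes and that a match requires the anchor text to be exactly equal.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Box:
    """A rectangle on the page (hypothetical units, origin at top left)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, other: "Box") -> bool:
        return (self.left <= other.left and self.top <= other.top
                and self.right >= other.right and self.bottom >= other.bottom)


@dataclass(frozen=True)
class Word:
    """One recognized word and where it sits on the page."""
    text: str
    box: Box


def text_in(region: Box, words: List[Word]) -> str:
    """Join the words that lie entirely inside the region, in input order."""
    return " ".join(w.text for w in words if region.contains(w.box))


@dataclass(frozen=True)
class Pattern:
    name: str
    anchor: Box                  # where the identifying text must appear
    anchor_text: str             # the identifying text itself
    index_boxes: Dict[str, Box]  # one box per index field

    def apply(self, words: List[Word]) -> Optional[Dict[str, str]]:
        """Return the index values if the anchor matches, otherwise None."""
        if text_in(self.anchor, words) != self.anchor_text:
            return None
        return {name: text_in(box, words) for name, box in self.index_boxes.items()}


# Made-up OCR output for an invoice page.
words = [
    Word("Acme", Box(10, 10, 40, 20)),
    Word("Ltd", Box(45, 10, 60, 20)),
    Word("2024-01-31", Box(150, 40, 200, 50)),
    Word("INV-1001", Box(150, 60, 200, 70)),
]

pattern = Pattern(
    name="Acme_Invoice",
    anchor=Box(5, 5, 70, 25),
    anchor_text="Acme Ltd",
    index_boxes={
        "Invoice Date": Box(140, 35, 210, 55),
        "Invoice Number": Box(140, 55, 210, 75),
    },
)

print(pattern.apply(words))
# {'Invoice Date': '2024-01-31', 'Invoice Number': 'INV-1001'}
```

A page on which the anchor region holds different text returns None in this sketch, which corresponds to the pattern not matching.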

Testing the Pattern

Scanned documents of the same format as the template document should match the pattern and be assigned to the associated category.

To test the pattern, resubmit the sample document for processing. If the pattern is configured correctly, the document appears in the OCR_Demo category.

When you open the document, blue overlay boxes show the data elements that have been extracted as the index values for the document. Options are available for changing the index values and for editing the pattern.

Tidying Up

After a pattern has been created, the document remains in the Pattern_Requests category. We recommend that when you have finished creating a pattern, you delete the document from Pattern_Requests.

Note

For information about the keyboard actions that are available when you view a document that uses the Transform Client Plug-In feature, see OCR Keyboard Shortcuts.