Checking Duplicates

You can check duplicates against a project or an external table uploaded from TIBCO Patterns.

Note: For the enterprise edition, you must set up a connection to the TIBCO Patterns server before using the Dedup function. See Configuring Patterns Server Settings.

Procedure

  1. On the project page, click Dedup.
  2. Optional: To check the duplicates against an external table uploaded from TIBCO Patterns:
    1. Select the Validate against external tables check box.
    2. From the list, select a table. If no table is available, click Manage table list in the list to create one.
      For more information, see Managing External Tables.
    3. Map the external table columns to the project columns automatically or manually. Click Auto map for automatic mapping or drag columns for manual mapping.
    4. Move the Matches requested slider to specify the number of duplicates to be returned.
  3. Optional: To search data in a TIBCO Patterns table according to a keyword:
    1. Click Keyword search.
    2. From the Tables list, select a table.
    3. Enter a keyword in the search box, move the Score threshold slider to set the matching accuracy, and then click Search.
      The search results are displayed.

    4. Click Close to exit.
  4. Move the Score threshold slider to set the accuracy of the query.
  5. Optional: To group several data columns for duplicate detection, create switchable groups in the Column configuration area:
    1. From the menu next to Column name, click Create a switchable group.
    2. Select the check boxes before the column names to be grouped, and click elsewhere to exit your selection.
      A switchable group is added. To create more switchable groups, repeat these steps.
    You can also remove the switchable groups:
    • To remove a switchable group, click the icon before the switchable group.
    • To remove all switchable groups, click Remove all switchable groups from the menu next to Column name.
  6. Optional: In the Column configuration area, select the check boxes next to the columns and switchable groups (if any) to check for duplicates.
  7. Optional: Configure dedup factors for the selected columns and switchable groups (if any).
  8. Click Run to start checking the duplicates.
    When duplicate checking is complete, you are directed to the project data page. Dedup results are displayed in the newly added dedup columns, as described in Dedup Results.
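
Example

The following sketch illustrates, in general terms, how a score threshold and a matches-requested limit can interact when checking for duplicates. It is not the TIBCO Patterns matching algorithm: the similarity measure (Python's difflib ratio), the 0 to 1 score scale, and the names find_duplicates, score_threshold, and matches_requested are assumptions made only for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] using difflib (an assumed stand-in
    for the product's scoring, which is not documented here)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_duplicates(records, reference, columns,
                    score_threshold=0.8, matches_requested=3):
    """For each record, return up to `matches_requested` reference rows whose
    average similarity over `columns` is at least `score_threshold`."""
    results = []
    for rec in records:
        scored = []
        for ref in reference:
            if ref is rec:
                continue  # do not report a row as a duplicate of itself
            score = sum(similarity(rec[c], ref[c]) for c in columns) / len(columns)
            if score >= score_threshold:
                scored.append((score, ref))
        # Best matches first; sort on the score only, never on the row itself.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        results.append(scored[:matches_requested])
    return results


# Hypothetical sample data: the table is checked against itself.
rows = [
    {"name": "John Smith", "city": "Boston"},
    {"name": "Jon Smith", "city": "Boston"},
    {"name": "Alice Wong", "city": "Denver"},
]
matches = find_duplicates(rows, rows, ["name", "city"],
                          score_threshold=0.8, matches_requested=2)
```

In this sketch, raising score_threshold returns fewer, closer matches, and matches_requested caps how many candidates are kept for each row. To check against an external table instead, pass that table's rows as reference after mapping its column names to the project columns.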