Types of Training Suggestions

Training suggestions are calculated by analyzing the performance of the currently trained model, along with the currently defined features and record pairs. Several types of suggestions are provided.

Suggestion Group

Types

Description

Suggestions to Add Pairs to Specific Subsets

Adding Pairs to Subsets that Have Validation Pairs but No Training Pairs

Used for untrained subsets. A subset is untrained when its record pairs are present only in the validation dataset, so the model has not been trained on any examples for it.

Adding Pairs to Underrepresented Subsets

Used for underrepresented subsets, that is, subsets that have relatively few record pairs in the training dataset.

Adding Pairs to Subsets that Have Too Few True/False Labels

Used for subsets where the percentage of the less frequent label is below a certain threshold. It is important to maintain a balance between pairs labeled "True" and pairs labeled "False".

Adding Pairs to Subsets that are Found in Data File but Have No Pairs

Used for subsets that have no pairs in either the training or the validation dataset. These subsets are found by analyzing all records in the data table.
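The four subset-level checks above can be sketched in Python. This is only an illustration under stated assumptions, not the product's actual logic: the function name `suggest_subsets`, the representation of a record pair as a `(subset, label)` tuple, and both thresholds (`min_pairs`, `min_minority_share`) are hypothetical. In particular, "relatively small" is approximated here by a fixed minimum count.

```python
from collections import Counter

def suggest_subsets(training, validation, all_subsets,
                    min_pairs=10, min_minority_share=0.2):
    """Map each subset to a suggestion category; subsets needing nothing are omitted.

    training / validation: lists of (subset, label) tuples, where label is a bool.
    all_subsets: every subset found in the data records.
    Both thresholds are hypothetical values chosen for illustration.
    """
    train_counts = Counter(subset for subset, _ in training)
    true_counts = Counter(subset for subset, label in training if label)
    validation_subsets = {subset for subset, _ in validation}

    suggestions = {}
    for subset in sorted(all_subsets):
        n = train_counts[subset]
        if n == 0 and subset in validation_subsets:
            suggestions[subset] = "validation pairs but no training pairs"
        elif n == 0:
            suggestions[subset] = "found in data but has no pairs"
        elif n < min_pairs:
            suggestions[subset] = "underrepresented"
        else:
            # Share of the less frequent label among this subset's training pairs.
            minority = min(true_counts[subset], n - true_counts[subset])
            if minority / n < min_minority_share:
                suggestions[subset] = "too few True/False labels"
    return suggestions

# Made-up example data: subsets are keyed by a country code.
training = ([("DE", True)] * 9 + [("DE", False)] * 6
            + [("FR", True)] * 3
            + [("US", True)] * 19 + [("US", False)])
validation = [("IT", True)]
all_subsets = {"DE", "FR", "US", "IT", "ES"}

suggestions = suggest_subsets(training, validation, all_subsets)
for subset, reason in suggestions.items():
    print(subset, "->", reason)
```

In this made-up data, "DE" has enough pairs and a balanced label mix, so it receives no suggestion, while each of the other four subsets falls into one of the four categories.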

Suggestions to Review Existing Record Pairs

Reviewing Possibly Mislabeled Pairs

Used when record pairs are possibly mislabeled, that is, when their labels differ from the labels of most similar pairs in the same subset. These pairs are presented for user review.

Reviewing Contradictory Pairs

Used to review record pairs that are contradicted by other pairs. Avoid contradictory pairs to achieve optimal training results.
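The two review checks can be illustrated with a small Python sketch. Everything in it is an assumption made for the example rather than the product's method: pairs are represented as `(subset, features, label)` tuples, "similar" is taken to mean the `k` nearest pairs by Euclidean distance between feature vectors, and two pairs are treated as contradictory when they share a subset and identical feature values but carry opposite labels. The function names are hypothetical.

```python
import math
from collections import defaultdict

def possibly_mislabeled(pairs, k=3):
    """Return indices of pairs whose label disagrees with most of their
    k nearest neighbours in the same subset (assumed notion of "similar")."""
    by_subset = defaultdict(list)
    for i, (subset, _, _) in enumerate(pairs):
        by_subset[subset].append(i)

    flagged = []
    for i, (subset, features, label) in enumerate(pairs):
        others = [j for j in by_subset[subset] if j != i]
        if len(others) < k:
            continue  # too few similar pairs to judge
        nearest = sorted(others,
                         key=lambda j: math.dist(features, pairs[j][1]))[:k]
        agreeing = sum(pairs[j][2] == label for j in nearest)
        if agreeing < k / 2:
            flagged.append(i)
    return flagged

def contradictory(pairs):
    """Return (i, j) index tuples for pairs with the same subset and identical
    features but opposite labels (assumed definition of a contradiction)."""
    groups = defaultdict(list)
    for i, (subset, features, _) in enumerate(pairs):
        groups[(subset, features)].append(i)

    result = []
    for indices in groups.values():
        for a in range(len(indices)):
            for b in range(a + 1, len(indices)):
                i, j = indices[a], indices[b]
                if pairs[i][2] != pairs[j][2]:
                    result.append((i, j))
    return result

# Made-up example data: features are two similarity scores per record pair.
pairs = [
    ("DE", (0.90, 0.80), True),
    ("DE", (0.95, 0.85), True),
    ("DE", (0.92, 0.90), True),
    ("DE", (0.93, 0.82), False),  # sits among "True" pairs
    ("DE", (0.10, 0.20), False),
    ("DE", (0.15, 0.10), False),
    ("DE", (0.12, 0.18), False),
    ("FR", (0.50, 0.50), True),
    ("FR", (0.50, 0.50), False),  # same features as the pair above
]

print(possibly_mislabeled(pairs))
print(contradictory(pairs))
```

Both functions only return candidates. As in the descriptions above, the decision to relabel or remove a pair is left to the user who reviews it.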