Determining if a Model is Well Trained

When the Learn UI can no longer generate more specific suggestions, the Suggestions section reports that the existing pairs provide sufficient coverage. In addition, check the training result reported in the Note section on the Training tab. The desired training result is "Best iteration was found", which means that the iteration with the lowest validation error rate was found.
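
To illustrate what "the iteration with the lowest validation error rate" means, the following is a minimal Python sketch. It is not the product's implementation. The function name best_iteration and the sample error rates are hypothetical, and it assumes that one validation error rate is recorded per training iteration.

```python
def best_iteration(validation_errors):
    """Return (index, error) for the iteration with the lowest validation error.

    validation_errors is assumed to hold one error rate per training
    iteration, in order. Ties are resolved in favor of the earliest iteration.
    """
    if not validation_errors:
        raise ValueError("at least one iteration is required")
    # min() returns the first index that attains the minimum error.
    index = min(range(len(validation_errors)), key=validation_errors.__getitem__)
    return index, validation_errors[index]


# Hypothetical error rates for five training iterations.
errors = [0.21, 0.12, 0.08, 0.09, 0.11]
index, error = best_iteration(errors)
print(f"Best iteration: {index} (validation error rate {error})")
```

In this sketch, training beyond the best iteration lowers the training error but not the validation error, which is why the iteration with the lowest validation error is the one to keep.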

One of the best ways to ensure that the model is well trained is to run the Low Confidence Pair Finder on the Pair Selection tab against a table of sufficient size, and to repeat this until it can no longer find any new low confidence pairs to label. This can address untrained and undertrained situations, even if such situations are not currently represented in the Training and Validation datasets. For more information, see the section Finding Useful Pairs Automatically.
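
The idea behind searching for low confidence pairs can be sketched in Python as follows. This section does not describe the criterion that the Low Confidence Pair Finder actually uses, so the sketch rests on assumptions: each candidate pair has a match probability between 0 and 1, and a pair counts as low confidence when that probability lies within a chosen margin of 0.5. The function name, the margin parameter, and the sample data are hypothetical.

```python
def find_low_confidence_pairs(scored_pairs, labeled_pairs, margin=0.2):
    """Return unlabeled pairs whose match probability is close to 0.5.

    scored_pairs maps a pair (a tuple of two record IDs) to an assumed
    match probability. labeled_pairs is the set of pairs already labeled.
    """
    return [
        pair
        for pair, probability in scored_pairs.items()
        if pair not in labeled_pairs and abs(probability - 0.5) <= margin
    ]


# Hypothetical scores for four candidate pairs.
scored = {
    ("a", "b"): 0.55,  # uncertain, not yet labeled
    ("c", "d"): 0.95,  # confident match
    ("e", "f"): 0.40,  # uncertain, but already labeled
    ("g", "h"): 0.05,  # confident non-match
}
already_labeled = {("e", "f")}
print(find_low_confidence_pairs(scored, already_labeled))
```

When a search of this kind returns no pairs for a sufficiently large table, the model is making confident decisions on the data it has seen, which corresponds to the stopping condition described above.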

For the model to be well trained, the Training and Validation datasets should not contain any pairs that contradict one another. See the Reviewing Contradictory Pairs section for how to resolve such contradictions.
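
As an illustration of what a contradiction between pairs can look like, the following Python sketch flags labeled pairs that share identical comparison features but carry different labels. This definition of a contradiction is an assumption made for the example, and the product may detect contradictions differently. The function name, the feature values, and the label strings are hypothetical.

```python
from collections import defaultdict


def find_contradictions(labeled_pairs):
    """Return feature tuples that appear with more than one distinct label.

    labeled_pairs is an iterable of (features, label) entries, where
    features is a sequence of comparison values for one record pair.
    """
    labels_by_features = defaultdict(set)
    for features, label in labeled_pairs:
        labels_by_features[tuple(features)].add(label)
    # A feature tuple with two or more labels is treated as contradictory.
    return sorted(
        features
        for features, labels in labels_by_features.items()
        if len(labels) > 1
    )


# Hypothetical labeled pairs: the first two entries contradict each other.
pairs = [
    ((1.0, 0.8), "match"),
    ((1.0, 0.8), "non-match"),
    ((0.2, 0.1), "non-match"),
]
print(find_contradictions(pairs))
```

Contradictions of this kind give the model conflicting targets for the same input, so neither label can be learned reliably until one of the pairs is relabeled or removed.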

When the model is well trained, you can export the current trained model and apply it to real-world record matching problems. See the Exporting a Model section for details. You can also export the model when a training result other than "Best iteration was found" is reported, provided that you are satisfied with the model performance statistics.