Scoring Modes

A match score measures textual similarity and always lies between 0.0 and 1.0, where 1.0 is the best possible score. If the query (or its querylets) and the set of fields in the record are completely identical, the result is a perfect match.

TIBCO Patterns offers a selection of scoring modes to handle scenarios such as the following:

The query perfectly matches only a part or “phrase” of the field or record.
The record or field perfectly matches only a part or “phrase” of the query.

The available scoring modes are:

The normal scoring mode (default) measures the degree to which the content of the query is found in the record. A perfect score means the query is perfectly contained in the record; the presence of extra unmatched information in the record does not lower the score. Use normal scoring for substring or keyword searches, or interactive queries where you type as little text as possible in order to locate the desired records.
The reverse scoring mode measures the degree to which the content of the record is found in the query. A perfect score means the record is perfectly contained in the query; the presence of extra unmatched information in the query does not lower the score. (The reverse scoring mode, in other words, “reverses” the sense of the normal score.) Use reverse scoring for retrieving lists of standard keywords, locations or names that are embedded within a body of text used as the query.
The symmetric scoring mode measures the degree to which the query and the record are identical (or “contained in each other”). The score is lowered if either the query contains information not present in the record, or the record contains information not present in the query. Use symmetric scoring for record-to-record or field-to-field comparisons, when the query represents the entirety of the text expected to be found in the record, and vice versa.
The minimum scoring mode is the minimum of the normal and reverse scores. In some situations of record-to-record comparison, this might be a better indicator of match quality than the symmetric score. It puts a higher penalty on extra information in either the record or the query than symmetric scoring would.
The maximum scoring mode is the maximum of the normal and reverse scores. Use this scoring mode in the relatively rare situation where either the query or the record might contain “extraneous” information, and you do not want the score to be penalized in either case.
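
The relationship between the five modes can be sketched with a toy token-overlap scorer. This is only an illustration, not the TIBCO Patterns algorithm or API: the names tokens and toy_scores are hypothetical, matching is exact on whitespace-separated words, and the Dice coefficient is an assumed stand-in for the symmetric score.

```python
def tokens(text: str) -> set[str]:
    """Lowercase the text and split it on whitespace into a set of words."""
    return set(text.lower().split())


def toy_scores(query: str, record: str) -> dict[str, float]:
    """Token-overlap stand-ins for the five scoring modes (illustration only)."""
    q, r = tokens(query), tokens(record)
    shared = len(q & r)
    # Normal: how much of the query is found in the record.
    normal = shared / len(q) if q else 0.0
    # Reverse: how much of the record is found in the query.
    reverse = shared / len(r) if r else 0.0
    # Symmetric: penalizes unmatched tokens on either side
    # (Dice coefficient, an assumption made for this sketch).
    symmetric = 2 * shared / (len(q) + len(r)) if (q or r) else 0.0
    return {
        "normal": normal,
        "reverse": reverse,
        "symmetric": symmetric,
        "minimum": min(normal, reverse),
        "maximum": max(normal, reverse),
    }


scores = toy_scores("john smith", "john smith plumbing services")
for mode, value in scores.items():
    print(f"{mode:>9}: {value:.2f}")
```

In this toy example the short query is fully contained in the longer record, so the normal score is 1.00 and the reverse score is 0.50. The symmetric stand-in lands between them at 0.67, which mirrors the statement that the minimum mode penalizes extra information more heavily than symmetric scoring.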

Scoring modes fall into two types, depending on whether the generated score stays the same when the query value and the record value are swapped:

Symmetrical type: the score for the swapped values is identical. These are the symmetric, minimum, and maximum scoring modes. Only scoring modes of the symmetrical type are used in the Machine Learning Platform to train the Learn models. These modes are also used in most cases in deduplication applications that use the Deduplication Framework.
Asymmetrical type: the score for the swapped values can differ. These are the normal and reverse scoring modes. They must not be used with Learn models, and should not be used in deduplication applications.
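
The swap test that separates the two types can be checked mechanically. The sketch below uses simplified token-overlap scorers as assumed stand-ins for the real modes; all function names are hypothetical and none of this is the TIBCO Patterns API.

```python
import math


def _tok(text):
    """Lowercase and split on whitespace into a set of words."""
    return set(text.lower().split())


def normal(query, record):
    """Fraction of query tokens that also appear in the record."""
    q = _tok(query)
    return len(q & _tok(record)) / len(q) if q else 0.0


def reverse(query, record):
    """Fraction of record tokens that also appear in the query."""
    return normal(record, query)


def symmetric(query, record):
    """Dice coefficient over tokens (assumed stand-in for symmetric scoring)."""
    q, r = _tok(query), _tok(record)
    return 2 * len(q & r) / (len(q) + len(r)) if (q or r) else 0.0


def minimum(query, record):
    return min(normal(query, record), reverse(query, record))


def maximum(query, record):
    return max(normal(query, record), reverse(query, record))


def is_swap_invariant(score, a, b):
    """True if swapping query and record leaves the score unchanged."""
    return math.isclose(score(a, b), score(b, a))


a, b = "acme corp", "acme corp international ltd"
for fn in (normal, reverse, symmetric, minimum, maximum):
    print(fn.__name__, is_swap_invariant(fn, a, b))
```

For this pair of values, the toy normal and reverse scorers change when the arguments are swapped, while the symmetric, minimum, and maximum scorers do not.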

Note that normal and reverse scoring are not simple “contained in” comparisons. The score also considers factors such as tokenization (splitting text into words) and relative position. For example, a query for “cat” against a record value of “category” does not yield a perfect score even though “cat” is entirely contained in the record value, because “cat” is a separate token in the query but not in the record. On the other hand, a record value of “dog and cat fight” would get a perfect score.
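
The difference between substring containment and token-level matching can be seen in a small sketch. The function toy_normal_score is hypothetical and deliberately crude: it matches whole whitespace-separated tokens exactly and ignores relative position, so it gives “cat” against “category” a score of zero, whereas the text above only states that the real score is less than perfect.

```python
def toy_normal_score(query, record):
    """Fraction of query tokens found as whole tokens in the record."""
    query_tokens = query.lower().split()
    record_tokens = set(record.lower().split())
    if not query_tokens:
        return 0.0
    found = sum(1 for token in query_tokens if token in record_tokens)
    return found / len(query_tokens)


# A plain substring test treats "cat" as contained in "category".
print("cat" in "category")

# Token-level matching does not: "category" is a single, different token.
print(toy_normal_score("cat", "category"))

# Here "cat" appears as its own token, so the query is fully matched.
print(toy_normal_score("cat", "dog and cat fight"))
```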

The scoring mode is selectable independently for each component of a complex query. In general, symmetric scoring is appropriate for cognate queries, whereas any of the scoring modes might be appropriate for simple queries, depending on the scenario.

For other details concerning the interpretation of match scores, such as score thresholds, score cutoffs, and tie breaking, see Interpreting and Handling Patterns Output.